A data architect at a major e-commerce company is designing an ingestion and storage solution for a new analytics platform. The platform will process high-velocity user clickstream data, which arrives as semi-structured JSON objects. The primary requirements are to support fast, complex analytical queries on specific columns while minimizing storage costs and providing data that is refreshed every few minutes. Which of the following approaches best meets all of these requirements?
A. Ingest the data in micro-batches, converting the nested JSON into a flattened, columnar Parquet format for storage.
B. Stream the incoming JSON data directly into a structured, relational database, normalizing the data into multiple tables.
C. Set up a daily batch process to collect all clickstream events, flatten them, and store them as compressed CSV files.
D. Implement a real-time streaming pipeline that writes the raw, nested JSON data directly to object storage as individual files.
The correct approach is option A: ingest the data in micro-batches and store it as flattened, columnar Parquet files. Parquet is a columnar storage format, which is highly efficient for analytical queries that read only a subset of columns, as is common in data science workloads. Its superior compression also helps minimize storage costs compared to row-oriented formats like JSON or CSV.
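As an illustration, here is a minimal sketch of the flatten-and-convert step using pandas; the file name and the assumption of newline-delimited JSON input are hypothetical:

```python
# Minimal sketch: flatten nested clickstream JSON and write columnar Parquet.
# "events.json" (newline-delimited JSON, one event per line) is a hypothetical input.
import json
import pandas as pd  # Parquet output requires the pyarrow package

with open("events.json") as f:
    records = [json.loads(line) for line in f]

# json_normalize flattens nested objects into dotted column names,
# e.g. {"user": {"id": 1}} becomes a "user.id" column.
flat = pd.json_normalize(records)

# Snappy-compressed Parquet: the columnar layout lets query engines read
# only the columns a query touches, and it compresses far better than JSON.
flat.to_parquet("events.parquet", compression="snappy")
```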
Clickstream data is high-velocity, and writing each event as a separate file creates a 'small file problem' in data lakes, which severely degrades query performance due to metadata overhead. Micro-batching, where data is collected for a short interval (e.g., a few minutes) before being written as a larger file, effectively solves this issue while still providing near-real-time data availability.
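A sketch of such a micro-batch pipeline using Spark Structured Streaming is below; the Kafka broker, topic name, event schema, and S3 paths are all assumptions for illustration, not a prescribed implementation:

```python
# Hypothetical micro-batch pipeline: consume clickstream events from Kafka
# and write them as Parquet every few minutes instead of per event.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("clickstream-ingest").getOrCreate()

# Assumed event schema for illustration.
schema = StructType([
    StructField("user_id", StringType()),
    StructField("page", StringType()),
    StructField("event_time", TimestampType()),
])

# Assumed Kafka source: broker address and topic are placeholders.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "clickstream")
       .load())

# Parse the JSON payload into typed columns.
events = (raw.select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Each 5-minute trigger collects a micro-batch and writes a small number of
# larger Parquet files, avoiding one tiny file per event.
query = (events.writeStream
         .format("parquet")
         .option("path", "s3://analytics/clickstream/")
         .option("checkpointLocation", "s3://analytics/_checkpoints/clickstream/")
         .trigger(processingTime="5 minutes")
         .start())

query.awaitTermination()
```

The trigger interval is the knob that trades freshness against file size: shorter intervals mean fresher data but smaller files, so a few minutes is a common compromise for near-real-time analytics.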
Writing the raw, nested JSON directly to object storage as individual files (option D) fails on two counts: row-oriented JSON forces query engines to parse and scan entire documents even when a query needs only a few fields, and one file per event recreates the small file problem described above.
A daily batch process using compressed CSV files (option C) does not meet the requirement for data refreshed every few minutes, and CSV's row-based layout cannot match Parquet's columnar scan performance or compression for analytical queries.
A relational database (option B) is a poor fit for the high velocity and semi-structured nature of clickstream data: normalizing every JSON event into multiple tables makes ingestion a write bottleneck, and a rigid relational schema is difficult to evolve as event payloads change.