AWS Certified Data Engineer Associate DEA-C01 Practice Question
A company stores click-stream events in an Amazon Kinesis Data Stream and transactional data in an Amazon RDS for MySQL database. A data engineer must join the two datasets every 5 minutes, apply a currency conversion, and write partitioned Parquet files to Amazon S3 for subsequent Amazon Redshift COPY operations. Which approach provides the required integration with the least operational overhead?
Create an AWS Glue streaming ETL job that reads from the Kinesis Data Stream and the MySQL table through a JDBC connection, performs the join and conversion, and writes partitioned Parquet files to S3.
Configure AWS Database Migration Service to continuously replicate data from both sources into Amazon Redshift with transformation rules, and unload the merged table to S3.
Use Amazon Redshift external tables over the Kinesis stream and the MySQL database, then schedule a Redshift query to join and unload the data to S3.
Schedule an Amazon Athena federated query that reads the MySQL table and click-stream data staged in S3, performs the join, and writes the results back to S3.
An AWS Glue streaming ETL job can read from multiple sources in the same Spark application. It consumes events from the Kinesis data stream, reads the RDS for MySQL table through a JDBC connection, performs the join and currency conversion in Spark, and writes partitioned Parquet files directly to S3. Glue is serverless, so there is no cluster to provision or patch, and the job's micro-batch window can be set to match the 5-minute interval.
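The per-batch work described above (join, currency conversion, partitioning) can be illustrated with a minimal plain-Python sketch. It is not Glue or Spark code. The column names (`order_id`, `page`, `event_time`, `amount`, `currency`), the rate table, and the `year=/month=/day=/hour=` partition layout are all assumptions made for illustration; the question does not specify them.

```python
from collections import defaultdict
from datetime import datetime, timezone
from decimal import Decimal

# Hypothetical rates to USD; a real job would load current rates from a reference source.
RATES_TO_USD = {"USD": Decimal("1.00"), "EUR": Decimal("1.10")}


def process_micro_batch(click_events, transactions):
    """Join one window of click events to transactions, convert amounts to USD,
    and group the joined rows by Hive-style partition prefix."""
    # Index transactions by the join key (order_id is an assumed column name).
    tx_by_order = {tx["order_id"]: tx for tx in transactions}

    rows_by_partition = defaultdict(list)
    for event in click_events:
        tx = tx_by_order.get(event["order_id"])
        if tx is None:
            continue  # inner join: skip events without a matching transaction

        rate = RATES_TO_USD[tx["currency"]]
        amount_usd = (tx["amount"] * rate).quantize(Decimal("0.01"))

        # Derive partition columns from the event time (epoch seconds, UTC).
        ts = datetime.fromtimestamp(event["event_time"], tz=timezone.utc)
        prefix = f"year={ts:%Y}/month={ts:%m}/day={ts:%d}/hour={ts:%H}"

        rows_by_partition[prefix].append(
            {"order_id": event["order_id"], "page": event["page"], "amount_usd": amount_usd}
        )
    return dict(rows_by_partition)
```

In an actual Glue streaming job, the same join and conversion would be expressed as Spark DataFrame operations inside the micro-batch handler, and each partition prefix would receive Parquet files rather than in-memory rows.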
Amazon Redshift can read from a Kinesis data stream (through streaming ingestion into a materialized view) and from an RDS for MySQL database (through federated queries), but it is primarily an analytical data warehouse, not an ETL service. Running a scheduled join and UNLOAD every 5 minutes would be more complex to operate and less efficient than a dedicated Glue job. AWS Database Migration Service supports ongoing replication from RDS, but it cannot use a Kinesis data stream as a source (Kinesis is supported only as a target) and it offers only limited transformation capabilities. An Amazon Athena federated query would require the click-stream data to be staged in S3 first and would need separate orchestration to run every 5 minutes, which creates a higher operational burden than a single Glue job.