AWS Certified Data Engineer Associate DEA-C01 Practice Question
A company needs to move about 100 GB of new sales records from its Amazon Redshift cluster to an Amazon S3 data lake every night at 01:00. The files must be stored as partitioned Parquet, and previously exported rows must not be included in subsequent runs. The solution should be fully managed, low-cost, and require as little custom code as possible. Which approach meets these requirements?
A. Configure an Amazon EventBridge rule that triggers an AWS Lambda function at 01:00; the function uses the Redshift Data API to run SELECT queries and streams the results to Amazon S3.
B. Use Redshift Spectrum to create an external table that points to an S3 location, then run a nightly CTAS command to export the sales data into Parquet files in that location.
C. Set up an AWS DMS task with Amazon Redshift as the source and Amazon S3 as the target, enable change data capture, and run the task on a nightly schedule.
D. Create an AWS Glue ETL job that reads the Redshift table through a JDBC connection, enable job bookmarks on the sales_date column, write the output as partitioned Parquet to Amazon S3, and schedule the job with an AWS Glue workflow. (Correct)
The AWS Glue ETL job (option D) is correct. A Glue Spark job can connect to Amazon Redshift through a Glue connection, read the table into a DynamicFrame, and write it to Amazon S3 in Apache Parquet format with date-based partitions. Enabling job bookmarks makes Glue persist a high-water mark on the bookmark key (for example, sales_date or last_updated), so each subsequent run fetches only rows added since the previous run. Glue triggers or workflows can schedule the job at 01:00 without external orchestration or servers to maintain, satisfying the fully managed, low-cost, minimal-custom-code requirements.
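A minimal sketch of such a Glue job script in PySpark is shown below. The catalog database sales_db, table public_sales_records, bucket my-data-lake, and the transformation context names are hypothetical placeholders; the sketch assumes the catalog table is backed by a Glue connection to the Redshift cluster and that job bookmarks are enabled on the job.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glueContext = GlueContext(SparkContext())
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Read the Redshift-backed catalog table. The transformation_ctx is what
# job bookmarks use to track this read; jobBookmarkKeys tells Glue which
# column to use as the high-water mark for incremental runs.
sales = glueContext.create_dynamic_frame.from_catalog(
    database="sales_db",                      # hypothetical catalog database
    table_name="public_sales_records",        # hypothetical catalog table
    additional_options={
        "jobBookmarkKeys": ["sales_date"],
        "jobBookmarkKeysSortOrder": "asc",
    },
    transformation_ctx="read_sales",
)

# Write only the newly read rows to S3 as Parquet, partitioned by date.
glueContext.write_dynamic_frame.from_options(
    frame=sales,
    connection_type="s3",
    connection_options={
        "path": "s3://my-data-lake/sales/",   # hypothetical data lake path
        "partitionKeys": ["sales_date"],
    },
    format="parquet",
    transformation_ctx="write_sales",
)

# Committing the job persists the bookmark state, so the next run starts
# after the highest sales_date exported by this run.
job.commit()
```

A scheduled Glue trigger with the expression cron(0 1 * * ? *) would start this job at 01:00 UTC each night, keeping the whole pipeline inside Glue with no external scheduler.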
The other options fall short. An AWS DMS task (option C) cannot use Amazon Redshift as a source endpoint, so it cannot perform full or incremental loads from the cluster. A Lambda function calling the Redshift Data API (option A) would need custom code for pagination, Parquet conversion, and multipart uploads, and Lambda's 15-minute timeout and memory limits make a 100 GB nightly export impractical. Although CREATE EXTERNAL TABLE AS SELECT (CTAS) with Redshift Spectrum (option B) can export Redshift data to S3 as Parquet, the command requires the target S3 path to be empty, so each nightly run would need additional logic to create a new path or drop and recreate the external table, making it more operationally complex than the Glue solution.