AWS Certified Data Engineer Associate DEA-C01 Practice Question
An e-commerce company runs a nightly AWS Glue Spark ETL job that copies the previous day's CSV files from the s3://staging/orders prefix into partitioned Parquet files in s3://curated/orders. If the job is restarted or rerun manually, it must not ingest objects that were already processed, so that duplicate records are never written. Which Glue job configuration will meet this requirement without adding any custom filtering logic?
Configure the job to write to an interim S3 location and then move the data to the curated bucket after completion.
Increase the spark.executor.instances property to create additional executors for the job.
Turn on speculative execution for the job's Spark tasks.
Enable job bookmarks on the Glue job by setting the job parameter --job-bookmark-option to job-bookmark-enable.
AWS Glue job bookmarks persist state about which source data has already been processed. With bookmarks enabled, a rerun of the same job skips data handled by previous successful runs, so a restarted or manually rerun job does not ingest the same objects twice. For the state to be tracked and saved, the job script must pass a transformation_ctx to its source and call job.commit() at the end of the run. Increasing the number of Spark executors or turning on speculative execution affects only performance, and writing to an interim location before moving the data affects only how output is published; none of these options tracks which source objects have already been read.
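As a rough illustration of the correct answer, the sketch below builds the job argument that turns bookmarks on. The helper name `bookmark_arguments` and the job name `orders-nightly-etl` are hypothetical and do not come from the question; the three option values are the ones Glue accepts for `--job-bookmark-option`. The boto3 call appears only as a comment and is not executed.

```python
# Values accepted by the --job-bookmark-option job parameter.
VALID_BOOKMARK_OPTIONS = {
    "job-bookmark-enable",   # track processed data and skip it on reruns
    "job-bookmark-disable",  # always process the full data set
    "job-bookmark-pause",    # process incrementally without advancing the saved state
}


def bookmark_arguments(option: str = "job-bookmark-enable") -> dict:
    """Return the Glue job arguments that set the bookmark behaviour.

    Hypothetical helper for illustration; it only builds a dictionary.
    """
    if option not in VALID_BOOKMARK_OPTIONS:
        raise ValueError(f"unsupported bookmark option: {option!r}")
    return {"--job-bookmark-option": option}


if __name__ == "__main__":
    args = bookmark_arguments()
    print(args)
    # Assumed usage with boto3 (not run here; the job name is made up):
    #   import boto3
    #   boto3.client("glue").start_job_run(
    #       JobName="orders-nightly-etl", Arguments=args
    #   )
```

The same key and value can instead be set once in the job definition's default arguments, so that every scheduled or manual run picks up the bookmark setting without passing it explicitly.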