AWS Certified Data Engineer Associate DEA-C01 Practice Question
An analytics team runs ad-hoc queries from an Amazon Redshift cluster against 15 TB of application logs in Amazon S3 by using Redshift Spectrum. The logs are gzip-compressed CSV files stored under a single prefix. Queries are slow and incur high data-scanned charges. The team cannot load the data into Redshift but can transform the S3 data once. Which change will most effectively improve performance and reduce Spectrum cost?
Enable Amazon Redshift Concurrency Scaling and increase the number of query slots in the WLM configuration.
Re-compress the CSV files with bzip2 to achieve a higher compression ratio before running Spectrum queries.
Create an Amazon Redshift materialized view that references the existing CSV files through a Spectrum manifest file.
Convert the log files to Parquet and partition the dataset (for example by date), then recreate the external table to reference the new location.
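The correct choice is to convert the logs to Parquet and partition the dataset. Redshift Spectrum charges are based on the number of bytes scanned in Amazon S3. Gzip-compressed CSV is row-oriented and not splittable, so every query must read each file in full, whatever columns or dates it asks for. Converting the dataset to a columnar, splittable format such as Parquet and partitioning it (for example, by date) lets Spectrum read only the columns and partition folders that each query needs, which greatly reduces the data scanned and improves query speed.
The other options leave the bytes scanned essentially unchanged. Concurrency Scaling and WLM changes affect concurrency and queueing on the cluster, not how much data Spectrum reads. A materialized view over the same CSV files must still scan the full dataset whenever it is built or refreshed. Re-compressing with bzip2 may shrink the files somewhat, but the data remains row-oriented CSV, so it gains none of the column pruning or partition elimination that Parquet and partitioning provide.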
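As a rough illustration of the reasoning above, the following Python sketch builds the kind of DDL that would point a Spectrum external table at a date-partitioned Parquet location, and estimates how partition elimination and column pruning shrink the bytes scanned. Everything named here is hypothetical and not taken from the question: the schema `spectrum`, the table `app_logs`, the bucket `example-bucket`, the partition column `dt`, the column list, and the figure of 40 total columns. The price of 5 USD per TB scanned is an assumed list price that should be checked against current AWS pricing, and the estimate assumes evenly sized partitions and equally wide columns, and ignores the fact that the Parquet copy will not be the same size as the gzip CSV.

```python
from datetime import date, timedelta

# Assumption: illustrative price in USD per TB scanned; verify against current AWS pricing.
ASSUMED_PRICE_PER_TB_USD = 5.0


def external_table_ddl(schema, table, columns, partition_col, location):
    """Build CREATE EXTERNAL TABLE DDL for a date-partitioned Parquet dataset."""
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({partition_col} DATE)\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{location}';"
    )


def add_partition_ddl(schema, table, partition_col, day, location):
    """Build the statement that registers one Hive-style date partition folder."""
    prefix = f"{location.rstrip('/')}/{partition_col}={day.isoformat()}/"
    return (
        f"ALTER TABLE {schema}.{table} ADD IF NOT EXISTS "
        f"PARTITION ({partition_col}='{day.isoformat()}') "
        f"LOCATION '{prefix}';"
    )


def estimate_scanned_tb(total_tb, total_partitions, partitions_read,
                        total_columns, columns_read):
    """Simplified model: scanned data shrinks with both kinds of pruning.

    Assumes uniform partition sizes and uniform column widths.
    """
    partition_fraction = partitions_read / total_partitions
    column_fraction = columns_read / total_columns
    return total_tb * partition_fraction * column_fraction


def scan_cost_usd(scanned_tb, price_per_tb=ASSUMED_PRICE_PER_TB_USD):
    """Cost of one query under the assumed per-TB price."""
    return scanned_tb * price_per_tb


if __name__ == "__main__":
    location = "s3://example-bucket/logs_parquet/"  # hypothetical bucket and prefix
    columns = [("request_id", "VARCHAR(64)"), ("status", "INT")]  # hypothetical columns
    print(external_table_ddl("spectrum", "app_logs", columns, "dt", location))

    # Register three consecutive daily partitions.
    first_day = date(2024, 1, 1)
    for offset in range(3):
        day = first_day + timedelta(days=offset)
        print(add_partition_ddl("spectrum", "app_logs", "dt", day, location))

    # Unpartitioned gzip CSV: every query reads the whole 15 TB dataset.
    print("full scan cost (USD):", scan_cost_usd(15.0))

    # Partitioned Parquet: a query over 7 of 365 days reading 4 of 40 columns.
    pruned_tb = estimate_scanned_tb(15.0, 365, 7, 40, 4)
    print("pruned scan (TB):", pruned_tb)
    print("pruned scan cost (USD):", scan_cost_usd(pruned_tb))
```

The two pruning fractions multiply, which is why combining a columnar format with partitioning helps far more than either a compression change or a cluster-side setting: neither of those alters the fraction of the dataset that a query has to read.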
Data Store Management