AWS Certified Data Engineer Associate DEA-C01 Practice Question
A data engineering team loads a daily fact file into an internal Amazon Redshift table. The upstream system will start delivering Parquet files in Amazon S3 and might add extra columns at any time. The team must make the new data queryable in Redshift without manual schema updates or table recreation, while keeping performance high and storage costs low. Which solution meets these requirements?
Run an ALTER TABLE statement to add every new column after it appears, then reload the data into a rebuilt internal table created with UNLOAD and COPY.
Transform each Parquet file back to CSV with an AWS Glue job, then load the result into the internal table with COPY.
Continue using the COPY command into the existing internal table and specify IGNOREHEADER to skip any new columns that are added.
Create a Redshift Spectrum external table that points to the Parquet files in S3 and schedule an AWS Glue crawler to update the Data Catalog table when new columns appear.
A Redshift Spectrum external table lets Redshift query the Parquet files directly in Amazon S3 without copying the data into cluster storage, which keeps storage costs low while retaining the scan efficiency of a columnar format. An AWS Glue crawler that runs after each delivery and is configured to update the existing table definition adds new Parquet columns to the Data Catalog automatically. Because Redshift Spectrum reads its table metadata from the Data Catalog, the new columns become queryable as soon as the crawler has updated the table, with no manual ALTER TABLE statements. The other options fall short: running ALTER TABLE and rebuilding the table requires ongoing manual DDL; converting Parquet to CSV gives up columnar performance and adds processing cost; and IGNOREHEADER skips header rows in text files, not columns, so COPY cannot use it to ignore newly added columns.
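The setup described in the explanation can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the schema name, catalog database, bucket path, role ARNs, crawler name, and schedule are all hypothetical placeholders. The script only builds the Redshift DDL string and the crawler settings; it makes no AWS calls. The dictionary keys mirror the parameters of the AWS Glue CreateCrawler API, so the dictionary could be passed to a boto3 Glue client's `create_crawler` call, assuming boto3 and suitable credentials are available.

```python
import json


def external_schema_sql(schema: str, catalog_db: str, iam_role_arn: str) -> str:
    """Build DDL that exposes a Glue Data Catalog database to Redshift
    as an external (Spectrum) schema. All names are caller-supplied."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )


def crawler_params(name: str, role_arn: str, catalog_db: str,
                   s3_path: str, schedule: str) -> dict:
    """Build keyword arguments shaped like the Glue CreateCrawler request.

    UPDATE_IN_DATABASE tells the crawler to update the existing catalog
    table when it detects schema changes such as new Parquet columns.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": catalog_db,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        "Schedule": schedule,
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(external_schema_sql(
        "spectrum_facts",
        "daily_facts_catalog",
        "arn:aws:iam::123456789012:role/ExampleSpectrumRole",
    ))
    params = crawler_params(
        "daily-fact-crawler",
        "arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
        "daily_facts_catalog",
        "s3://example-bucket/daily-fact/",
        "cron(0 6 * * ? *)",  # assumed schedule: once a day at 06:00 UTC
    )
    print(json.dumps(params, indent=2))
    # With boto3 installed and configured, the dictionary could be used as:
    #   boto3.client("glue").create_crawler(**params)
```

Once the crawler has registered the table in the catalog database, Redshift can query it through the external schema (for example `SELECT ... FROM spectrum_facts.<table_name>`), and columns the crawler adds later appear without any DDL on the Redshift side.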
Exam domain: Data Store Management