AWS Certified Data Engineer Associate DEA-C01 Practice Question
Your analytics team queries click-stream events that are written as Parquet files to an Amazon S3 data lake. An AWS Glue Data Catalog table and an Amazon Redshift Spectrum external table reference the dataset, which is partitioned by year, month, and day. A new business requirement adds the string column user_country to every new event record. Historical Parquet files will not be backfilled. You must expose the new column to analysts without interrupting existing workloads, and older partitions should continue to return NULL for the column. Which action will meet these requirements with the least disruption?
Run ALTER TABLE <external_table> REPLACE COLUMNS (...) specifying the full updated column list so that the table definition is replaced.
Create a new Glue table and Spectrum external table that include the user_country column, and instruct analysts to switch their queries to the new tables.
Issue ALTER TABLE <external_table> ADD COLUMN user_country varchar(2); from Amazon Redshift, allowing Redshift Spectrum to update the Glue table and return NULL for the column in older partitions.
Drop and recreate the Glue and Redshift Spectrum tables with the new schema after all historical Parquet files are backfilled.
The correct action is to issue ALTER TABLE <external_table> ADD COLUMN user_country varchar(2); from Amazon Redshift. Redshift Spectrum supports adding columns to external tables, and when the external schema is backed by the AWS Glue Data Catalog, the statement updates the catalog table definition in place. Because Redshift Spectrum resolves Parquet columns by name, partitions whose files lack the new column return NULL for it, so historical data continues to query successfully and existing queries are unaffected. The other options are more disruptive: creating a second pair of tables forces analysts to change their queries, and dropping and recreating the tables after a backfill causes downtime and contradicts the requirement that historical files are not backfilled. ALTER TABLE … REPLACE COLUMNS is riskier than ADD COLUMN because it restates the entire column list, so an omitted or reordered column can break dependent objects.
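The name-based resolution described in the explanation can be illustrated with a small toy model. This is only a sketch of the idea, not Redshift Spectrum's implementation: the column names `event_id` and `page`, the sample rows, and the helper `project_by_name` are all hypothetical and chosen for illustration.

```python
from typing import Any, Dict, List, Optional

# Hypothetical table schema after the new column is added.
# event_id and page are illustrative; only user_country comes from the question.
TABLE_COLUMNS: List[str] = ["event_id", "page", "user_country"]


def project_by_name(row: Dict[str, Any], columns: List[str]) -> Dict[str, Optional[Any]]:
    """Look up each table column by name in a file row.

    A column that the file does not contain resolves to None,
    which stands in for SQL NULL in this toy model.
    """
    return {name: row.get(name) for name in columns}


# Row from a historical file written before user_country existed.
old_row = {"event_id": 1, "page": "/home"}
# Row from a file written after the change.
new_row = {"event_id": 2, "page": "/cart", "user_country": "DE"}

old_result = project_by_name(old_row, TABLE_COLUMNS)
new_result = project_by_name(new_row, TABLE_COLUMNS)

print(old_result)
print(new_result)
```

The point of the sketch is that the lookup key is the column name, so a file that predates the schema change needs no rewrite: the missing column simply has no value to return.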
Exam domain: Data Store Management