AWS Certified Data Engineer Associate DEA-C01 Practice Question
A company keeps 5 years of point-of-sale data as partitioned Parquet files in an S3 data lake. Business analysts want to run SQL queries from Amazon Redshift and expect to add or remove columns in the source dataset several times a month. The data engineering team must avoid rewriting historical files or re-loading large Redshift tables each time the schema changes while still benefiting from columnar storage and compression. Which approach meets these requirements with the least operational effort?
Create an external schema in Amazon Redshift Spectrum that points to the Parquet files. Use an AWS Glue crawler to update the external table when columns change.
Convert the Parquet data to CSV and ingest it into Amazon Aurora PostgreSQL; apply ALTER TABLE statements when columns are added or removed.
COPY the Parquet data into a Redshift managed table with ENCODE AUTO and run ALTER TABLE ADD or DROP COLUMN whenever the schema evolves.
Load each day's Parquet files into a DynamoDB table that stores every record as a JSON document so new attributes can be added on demand.
Amazon Redshift Spectrum lets Redshift query Parquet data that remains in Amazon S3. External table definitions are stored in the AWS Glue Data Catalog, so a crawler or an ALTER TABLE statement can add or drop columns without touching the existing Parquet files. Because the files are already columnar and compressed, no additional optimization work is required. Converting the data to CSV for Aurora PostgreSQL gives up columnar storage and compression, and loading it into a Redshift managed table would require altering the table and reloading data whenever columns change. DynamoDB is not suited to large analytical scans. Therefore, creating a Spectrum external schema that references the Parquet location and keeping the schema in Glue provides the simplest, most scalable solution.
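As a rough illustration of the statements involved, the Python sketch below only builds the Redshift SQL strings for registering an external schema backed by the Glue Data Catalog and for adding or dropping a column on an external table. Every name in it (pos_lake, pos_catalog_db, sales, loyalty_tier, and the IAM role ARN) is a hypothetical placeholder, not something taken from the question, and the sketch does not connect to AWS or run anything.

```python
"""Sketch: build Redshift SQL for a Spectrum external schema and a column change.

All identifiers and the IAM role ARN are hypothetical placeholders.
Nothing here connects to AWS; the functions only return SQL text.
"""


def create_external_schema_sql(schema: str, glue_database: str, iam_role_arn: str) -> str:
    # Registers a Redshift external schema backed by a Glue Data Catalog database.
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{glue_database}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )


def add_column_sql(schema: str, table: str, column: str, data_type: str) -> str:
    # Changes only the table definition in the catalog; Parquet files are untouched.
    return f"ALTER TABLE {schema}.{table} ADD COLUMN {column} {data_type};"


def drop_column_sql(schema: str, table: str, column: str) -> str:
    # Removes the column from the table definition; the data stays in S3.
    return f"ALTER TABLE {schema}.{table} DROP COLUMN {column};"


if __name__ == "__main__":
    print(create_external_schema_sql(
        "pos_lake",                                         # hypothetical schema name
        "pos_catalog_db",                                   # hypothetical Glue database
        "arn:aws:iam::123456789012:role/ExampleSpectrum",   # placeholder role ARN
    ))
    print(add_column_sql("pos_lake", "sales", "loyalty_tier", "varchar(16)"))
    print(drop_column_sql("pos_lake", "sales", "legacy_promo_code"))
```

In practice a Glue crawler run can take the place of the hand-written ALTER TABLE statements, since both update the same table definition in the Data Catalog.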
Data Store Management