AWS Certified Data Engineer Associate DEA-C01 Practice Question
An Amazon Athena query joins a large, 1 TB Parquet table named sales with a small, 50 MB CSV table named regions. The sales table is partitioned by order_date. The query must return last year's sales for the 'APAC' region but currently scans the entire sales table. Which SQL rewrite will most effectively reduce the amount of data scanned while returning the same result?
WITH filtered_sales AS ( SELECT * FROM sales WHERE order_date BETWEEN DATE '2023-01-01' AND DATE '2023-12-31' ) SELECT s.*, r.region_name FROM filtered_sales s JOIN regions r ON s.region_id = r.id WHERE r.region_name = 'APAC';
SELECT s.*, r.region_name FROM sales s JOIN regions r ON s.region_id = r.id WHERE r.region_name = 'APAC' AND s.order_date BETWEEN DATE '2023-01-01' AND DATE '2023-12-31';
SELECT r.region_name, s.* FROM regions r JOIN sales s ON s.region_id = r.id WHERE r.region_name = 'APAC' AND s.order_date BETWEEN DATE '2023-01-01' AND DATE '2023-12-31';
SELECT * FROM sales s WHERE order_date BETWEEN DATE '2023-01-01' AND DATE '2023-12-31' AND EXISTS (SELECT 1 FROM regions r WHERE r.id = s.region_id AND r.region_name = 'APAC');
Athena charges by the amount of data scanned, so the most effective way to reduce cost and improve performance is to filter the large, partitioned table before joining it. Filtering sales on its partition key, order_date, in a Common Table Expression (CTE) or subquery lets the query use partition pruning: Athena reads only the 2023 partitions from Amazon S3 and skips the rest. The small regions table is then joined to this much smaller, pre-filtered dataset. Applying the filters after the join, or filtering only the small table, does not guarantee that partition pruning on the large table occurs before the join, which can lead to larger scans and higher costs.
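To make the effect of partition pruning concrete, here is a minimal Python sketch that models a table as a set of daily order_date partitions and compares the bytes read with and without a filter on the partition key. The layout is an assumption made purely for illustration (four years of equally sized daily partitions totalling roughly 1 TB), and the helper names build_partitions and bytes_scanned are hypothetical. It does not call Athena or reflect how its planner is implemented.

```python
from datetime import date, timedelta


def build_partitions(start, end, bytes_per_partition):
    """Return {order_date: size_in_bytes}, one partition per day (assumed layout)."""
    days = (end - start).days + 1
    return {start + timedelta(days=i): bytes_per_partition for i in range(days)}


def bytes_scanned(partitions, lo=None, hi=None):
    """Sum the sizes of partitions whose key lies in [lo, hi].

    With no bounds, every partition is read, which models a full table scan.
    """
    return sum(
        size
        for day, size in partitions.items()
        if (lo is None or day >= lo) and (hi is None or day <= hi)
    )


# Assumption: four years of daily partitions, 700 MB each (about 1 TB in total).
partitions = build_partitions(date(2020, 1, 1), date(2023, 12, 31), 700_000_000)

full_scan = bytes_scanned(partitions)
pruned_scan = bytes_scanned(partitions, date(2023, 1, 1), date(2023, 12, 31))

print(f"partitions in table: {len(partitions)}")
print(f"full scan:   {full_scan / 1e12:.2f} TB")
print(f"pruned scan: {pruned_scan / 1e12:.2f} TB")
print(f"fraction read after pruning: {pruned_scan / full_scan:.1%}")
```

Under these assumed sizes, the date filter limits the read to the 365 partitions for 2023 out of 1,461, so roughly a quarter of the table is scanned. The real saving depends on how the data is actually distributed across partitions.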
Exam domain: Data Operations and Support