Big Data and Analytics Services Flashcards
AWS Certified Data Engineer Associate DEA-C01 Flashcards

| Front | Back |
| --- | --- |
| Athena | Amazon Athena is a serverless query service that analyzes data directly in Amazon S3 using SQL. |
| Athena Configuration | Requires defining a database and table over data stored in S3, plus an S3 location for query results. |
| Athena Pricing | Priced per query by the amount of data scanned, so scanning fewer bytes lowers the cost. |
| Athena Use Case | Used for ad hoc analysis of data stored in S3 without first running it through ETL processes. |
| EMR | Amazon EMR is used for big data processing and analysis using Apache Spark, Hadoop, and other frameworks. |
| EMR Configuration | Requires cluster setup with instances, instance groups, applications, and job flow steps. |
| EMR Pricing | Priced by the number and type of EC2 instances in the cluster: an EMR charge is added on top of the underlying EC2 instance cost. |
| EMR Use Case | Ideal for running distributed frameworks like Spark and Hadoop for processing large datasets. |
| Kinesis | Amazon Kinesis processes real-time data streams for analytics and applications. |
| Kinesis Configuration | Requires setting up streams, shards, producers, and consumers. |
| Kinesis Pricing | Charged based on the number of shards and the volume of data ingested and processed. |
| Kinesis Use Case | Best for real-time video analytics, log processing, IoT data streams, and metric monitoring. |
| QuickSight | Amazon QuickSight is a business intelligence service to visualize data and create dashboards. |
| QuickSight Configuration | Requires data source connections, importing datasets, and defining visualizations. |
| QuickSight Pricing | Charged per user, with the rate depending on the edition: Standard or Enterprise. |
| QuickSight Use Case | Perfect for creating interactive visualizations and sharing business reports. |
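
The Athena Configuration card can be made concrete with a short Python sketch. It only builds the keyword arguments that boto3's Athena `start_query_execution` call accepts (`QueryString`, `QueryExecutionContext`, `ResultConfiguration`); the helper name `build_athena_query_request` and the database, table, and bucket names are hypothetical placeholders, and the actual AWS call is left commented out because it needs credentials.

```python
def build_athena_query_request(sql, database, output_location):
    """Return keyword arguments for boto3's Athena start_query_execution call."""
    # Athena writes query results to S3, so the output location must be an S3 URI.
    if not output_location.startswith("s3://"):
        raise ValueError("output_location must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Hypothetical names: the database, table, and bucket are placeholders.
request = build_athena_query_request(
    sql="SELECT status, COUNT(*) AS hits FROM web_logs GROUP BY status",
    database="analytics_db",
    output_location="s3://example-athena-results/queries/",
)

# With AWS credentials configured, the request could be submitted like this:
# import boto3
# athena = boto3.client("athena")
# query_id = athena.start_query_execution(**request)["QueryExecutionId"]
```

Keeping the request construction separate from the network call makes the three required pieces (SQL, database, result location) easy to check without touching AWS.
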
About the Flashcards
Flashcards for the AWS Certified Data Engineer Associate exam help you review the main AWS analytics services (EMR, Athena, Kinesis, and QuickSight) by focusing on terminology, core concepts, and exam-style use cases. The cards explain what each service does (big data processing, serverless SQL querying on S3, real-time stream processing, and BI visualizations) so you can recall each service's purpose quickly.
The deck also covers practical configuration details and cost models: cluster and instance setup for EMR, databases and table connections for Athena, streams and shards for Kinesis, and data sources and visualizations for QuickSight. Use these flashcards to reinforce key ideas, common exam scenarios, and pricing distinctions.
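To see how two of these cost models differ, here is a rough Python sketch. Every rate in it is an illustrative placeholder rather than a current AWS price, the function names are hypothetical, a terabyte is assumed to be 1024^4 bytes, and any per-query minimums or rounding are ignored.

```python
TB = 1024 ** 4  # assumption: one terabyte counted as 1024^4 bytes


def athena_query_cost(bytes_scanned, price_per_tb):
    """Athena-style model: cost scales with the data a query scans."""
    return bytes_scanned / TB * price_per_tb


def emr_cluster_cost(instance_count, hours, ec2_rate_per_hour, emr_rate_per_hour):
    """EMR-style model: cost scales with instances and how long they run."""
    return instance_count * hours * (ec2_rate_per_hour + emr_rate_per_hour)


# Placeholder rates, chosen only to make the arithmetic easy to follow.
query_cost = athena_query_cost(bytes_scanned=TB // 2, price_per_tb=5.0)
cluster_cost = emr_cluster_cost(
    instance_count=3, hours=2, ec2_rate_per_hour=0.10, emr_rate_per_hour=0.025
)
print(f"query: {query_cost:.2f}, cluster: {cluster_cost:.2f}")
```

The contrast to remember for the exam: Athena cost is driven by how much data is scanned, while EMR cost is driven by cluster size and runtime.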
Topics covered in this flashcard deck:
- EMR cluster configuration
- Athena SQL on S3
- Kinesis streams and shards
- QuickSight dashboards and visuals
- Service pricing models
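
For the Kinesis streams and shards topic, a minimal sizing sketch in Python may help. It assumes the commonly documented per-shard limits for provisioned Kinesis Data Streams (about 1 MB/s or 1,000 records/s for writes and 2 MB/s for reads); these appear as default parameters so they can be adjusted, and the function name `estimate_shards` is hypothetical.

```python
import math


def estimate_shards(write_mb_per_sec, records_per_sec, read_mb_per_sec,
                    shard_write_mb=1.0, shard_write_records=1000,
                    shard_read_mb=2.0):
    """Estimate the shard count needed to satisfy the tightest limit."""
    # Per-shard limits are assumptions passed in as defaults, not values
    # taken from the flashcards themselves.
    needed = max(
        math.ceil(write_mb_per_sec / shard_write_mb),
        math.ceil(records_per_sec / shard_write_records),
        math.ceil(read_mb_per_sec / shard_read_mb),
    )
    return max(needed, 1)  # a stream always has at least one shard


# Example workload: 4.5 MB/s in, 3,000 records/s, 9 MB/s read by consumers.
shards = estimate_shards(write_mb_per_sec=4.5, records_per_sec=3000,
                         read_mb_per_sec=9.0)
print(shards)
```

Because the shard count drives both throughput and the shard-based part of the bill, the same estimate feeds into the Kinesis pricing card.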