AWS Certified Data Engineer Associate DEA-C01 Practice Question

An analytics team stores 2 TB of GZIP-compressed JSON clickstream data in Amazon S3. The data engineer needs a web-based notebook to run PySpark interactively, inspect data samples, and write partitioned Parquet back to S3, without managing clusters and while paying only when code executes. Which AWS approach satisfies these goals?

  • Launch an Amazon EMR cluster with Apache Zeppelin notebooks and configure the cluster to auto-terminate when idle.

  • Create an Amazon Athena for Apache Spark notebook and use the interactive session to transform and write the data.

  • Use AWS Glue DataBrew to profile the JSON files and export the results as Parquet.

  • Build an on-demand AWS Glue PySpark job that is started by AWS Step Functions when transformation is required.
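Whichever service runs it, the transformation described in the question is the same PySpark workload: read the GZIP-compressed JSON (Spark decompresses GZIP transparently), inspect a sample, and write partitioned Parquet back to S3. The sketch below illustrates that workload; the bucket names, the event_time field, and the event_date partition column are hypothetical placeholders, not values given in the question.

```python
# Minimal PySpark sketch of the clickstream transformation (assumed paths/columns).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clickstream-json-to-parquet").getOrCreate()

# Spark handles the GZIP codec automatically when reading JSON from S3.
clicks = spark.read.json("s3://example-bucket/raw/clickstream/*.json.gz")

# Inspect the schema and a small sample interactively, as the notebook workflow requires.
clicks.printSchema()
clicks.show(10, truncate=False)

# Derive a partition column (assuming an event_time string/timestamp field exists)
# and write partitioned Parquet back to S3.
(clicks
    .withColumn("event_date", F.to_date("event_time"))
    .write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/clickstream_parquet/"))
```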

Domain: Data Operations and Support