GCP Professional Data Engineer Practice Question

Your analytics team processes 3 PB of application logs that arrive via Pub/Sub, and a Dataflow pipeline writes the raw events as Avro files. Compliance rules mandate that each file be retained unaltered for seven years, yet engineers access the logs only a few times after the first week. Analysts must be able to run occasional ad-hoc SQL queries in BigQuery without first loading the data into native tables. Which Google Cloud storage approach will minimize total cost while satisfying the durability, retention, and query requirements?

  • Load the Avro files into partitioned BigQuery tables and rely on BigQuery's long-term storage pricing to reduce costs.

  • Persist the logs in a regional Cloud Bigtable cluster and export snapshots to Cloud Storage once per quarter.

  • Write the logs to an AlloyDB for PostgreSQL instance using compressed columnar storage and grant BigQuery federated access.

  • Keep the Avro files in a Cloud Storage bucket configured with the Archive storage class via lifecycle rules, and define BigQuery external tables to query them in place.
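
As a point of reference for the configuration described in the final option, the sketch below shows one way to wire up a lifecycle transition to the Archive storage class and a federated Avro table using the Google Cloud Python clients. All names here (the example-log-archive bucket, the logs dataset, the project ID, and the source URIs) are hypothetical placeholders, not part of the question.

```python
from google.cloud import bigquery, storage

# 1. Lifecycle rule: move objects to the Archive storage class once they
#    are older than 7 days, after the initial access window has passed.
#    (Bucket and paths are hypothetical examples.)
storage_client = storage.Client()
bucket = storage_client.get_bucket("example-log-archive")
bucket.add_lifecycle_set_storage_class_rule("ARCHIVE", age=7)
bucket.patch()

# 2. External (federated) table: lets BigQuery run SQL over the Avro
#    files in place, without loading them into native tables.
bq_client = bigquery.Client()
external_config = bigquery.ExternalConfig("AVRO")
external_config.source_uris = ["gs://example-log-archive/raw/*.avro"]

table = bigquery.Table("example-project.logs.raw_events_ext")
table.external_data_configuration = external_config
bq_client.create_table(table)
```

Note that the Archive storage class carries a 365-day minimum storage duration and per-byte retrieval fees, so it is priced for data that must remain durably stored for years but is read only rarely.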
