GCP Professional Data Engineer Practice Question

Your analytics platform stores several years of click-stream data in Parquet files on Cloud Storage. Data scientists query the most recent partitions interactively through BigQuery, but new compliance rules require user-specific row-level filters to be enforced on both the historical data in Cloud Storage and the fact tables already ingested into BigQuery. Engineering stipulates that you must:

  • Avoid copying or re-loading the Parquet data into new BigQuery tables.
  • Maintain a single, consistent security policy that governs data in both the warehouse and the lake.
  • Preserve the ability for future Spark jobs on Dataproc to read the same Parquet files directly.

Which approach best meets these requirements while keeping operational overhead low?

  • Provide signed URLs protected by VPC Service Controls and instruct analysts to query the Parquet files with federated queries.

  • Create BigLake tables on the Parquet files and attach BigQuery row-level access policies so the same security model applies to both BigLake and existing native tables.

  • Load the Parquet partitions into new BigQuery managed tables and apply dataset-level IAM roles to enforce access controls.

  • Define BigQuery external tables on the Parquet objects and rely on Cloud Storage bucket-level IAM to restrict access to sensitive rows.
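For reference, here is a minimal sketch of how a BigLake table plus a BigQuery row-level access policy could be wired up over existing Parquet objects. All names (project, dataset, connection, bucket, group, and the `user_id` column) are hypothetical placeholders, not details from the question, and the DDL is issued through the standard google-cloud-bigquery client:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project ID

# Define a BigLake table over the Parquet files in place (no copy or re-load),
# using a Cloud Resource connection so BigQuery governs access to the objects.
ddl_biglake = """
CREATE EXTERNAL TABLE `my-project.analytics.clickstream_biglake`
WITH CONNECTION `my-project.us.lake-connection`
OPTIONS (
  format = 'PARQUET',
  uris = ['gs://my-clickstream-bucket/events/*.parquet']
)
"""

# Attach a row-level access policy to the BigLake table; the same statement
# works unchanged on existing native fact tables, giving one security model.
ddl_policy = """
CREATE ROW ACCESS POLICY analyst_filter
ON `my-project.analytics.clickstream_biglake`
GRANT TO ('group:analysts@example.com')
FILTER USING (user_id = SESSION_USER())
"""

for ddl in (ddl_biglake, ddl_policy):
    client.query(ddl).result()  # run each DDL statement and wait for it to finish
```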
