GCP Professional Data Engineer Practice Question

Your analytics team stores raw log files in several Cloud Storage buckets and curated tables in BigQuery datasets that span four GCP projects. The team complains that discovering data is difficult, and security administrators want a single place to apply and audit access policies and to tag sensitive columns. You are asked to implement a solution that automatically keeps metadata current on a nightly schedule without moving data or deploying custom crawlers. What should you do?

  • Create a Dataplex lake spanning the four projects, add the buckets and datasets as assets in appropriate zones, enable automatic discovery with a daily schedule, and manage policy tags and IAM centrally in the Dataplex Catalog (a configuration sketch follows the options).

  • Enable Cloud Asset Inventory across the projects and export the inventory to BigQuery; schedule a Cloud Function to update column-level tags each night.

  • Build a nightly Dataflow job that reads object metadata and BigQuery INFORMATION_SCHEMA views, writes the results to a central BigQuery table, and controls access through BigQuery-level IAM.

  • Deploy an Apache Atlas cluster on GKE to crawl the buckets and datasets nightly and expose the collected metadata through its REST API.
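For reference on the Dataplex option, here is a minimal sketch of registering one of the source buckets as a Dataplex asset with nightly discovery enabled, calling the Dataplex v1 REST API with google-auth. The project, location, lake, zone, bucket, and schedule values are hypothetical, the lake and zone are assumed to already exist, and exact field formats should be confirmed against the current API reference.

import google.auth
from google.auth.transport.requests import AuthorizedSession

# Hypothetical names; the lake and zone are assumed to exist already.
PROJECT = "central-governance-project"
LOCATION = "us-central1"
LAKE = "analytics-lake"
ZONE = "raw-zone"

# Authenticate with Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
session = AuthorizedSession(credentials)

parent = f"projects/{PROJECT}/locations/{LOCATION}/lakes/{LAKE}/zones/{ZONE}"

# Register one source bucket as a Dataplex asset. Discovery crawls it in
# place on the cron schedule below; no data is copied or moved.
asset_body = {
    "resourceSpec": {
        "type": "STORAGE_BUCKET",
        # Placeholder resource name; check the docs for the exact format.
        "name": "projects/log-project-1/buckets/raw-logs-bucket",
    },
    "discoverySpec": {
        "enabled": True,
        "schedule": "0 2 * * *",  # nightly discovery run at 02:00
    },
}

response = session.post(
    f"https://dataplex.googleapis.com/v1/{parent}/assets",
    params={"assetId": "raw-logs-asset"},
    json=asset_body,
)
response.raise_for_status()
# The API returns a long-running operation describing the asset creation.
print(response.json())

The same pattern would repeat for each bucket and for each BigQuery dataset (a resourceSpec type of BIGQUERY_DATASET in a curated zone), after which the discovered tables and filesets surface in the catalog, where policy tags and IAM can be managed centrally.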
