Your analytics platform stores several years of click-stream data in Parquet files on Cloud Storage. Data scientists query the most recent partitions interactively through BigQuery, but new compliance rules require user-specific row-level filters to be enforced on both the historic data in Cloud Storage and the fact tables already ingested into BigQuery. Engineering stipulates that you must:

1. Avoid copying or re-loading the Parquet data into new BigQuery tables.
2. Maintain a single, consistent security policy that governs data in both the warehouse and the lake.
3. Preserve the ability for future Spark jobs on Dataproc to read the same Parquet files directly.

Which approach best meets these requirements while keeping operational overhead low?
Provide signed URLs protected by VPC Service Controls and instruct analysts to query the Parquet files with federated queries.
Create BigLake tables on the Parquet files and attach BigQuery row-level access policies so the same security model applies to both BigLake and existing native tables.
Load the Parquet partitions into new BigQuery managed tables and apply dataset-level IAM roles to enforce access controls.
Define BigQuery external tables on the Parquet objects and rely on Cloud Storage bucket-level IAM to restrict access to sensitive rows.
Defining BigLake tables on the Parquet objects lets BigQuery treat data in Cloud Storage as first-class tables and apply the same row-level (and column-level) security policies used for native BigQuery tables. This yields uniform fine-grained governance without duplicating data and still allows other engines, such as Dataproc Spark, to read the underlying files directly. Standard external tables on Cloud Storage do not support BigQuery row-level security, loading the data into new managed tables would violate the no-copy requirement, and Cloud Storage IAM or signed URLs cannot express SQL-based row filters.
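As a concrete sketch of the recommended approach, the DDL below defines a BigLake table over the existing Parquet objects and then attaches a row-level access policy to it, exactly as you would for a native table. All names here (project, dataset, connection, bucket, group, and the `region` column) are hypothetical placeholders, and a Cloud resource connection with access to the bucket must already exist.

```sql
-- 1. Define a BigLake table over the Parquet files in place.
--    `us.lake-conn` is an assumed Cloud resource connection whose
--    service account has read access to the bucket.
CREATE EXTERNAL TABLE analytics.clickstream_events
WITH CONNECTION `us.lake-conn`
OPTIONS (
  format = 'PARQUET',
  uris = ['gs://example-clickstream-bucket/events/*.parquet']
);

-- 2. Attach a row-level access policy; the same statement works on
--    native BigQuery tables, giving one uniform security model.
CREATE ROW ACCESS POLICY emea_analysts_filter
ON analytics.clickstream_events
GRANT TO ('group:analysts-emea@example.com')
FILTER USING (region = 'EMEA');
```

Because the policy lives in BigQuery rather than in bucket IAM, analysts querying through BigQuery see only their permitted rows, while Dataproc Spark jobs can still read the underlying `gs://` objects directly through their own Cloud Storage permissions.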
GCP Professional Data Engineer
Storing the data