A multinational retailer wants to modernize its on-premises analytics stack by moving to Google Cloud. The requirements are:
Land raw click-stream and IoT data (JSON, images) with virtually unlimited scale.
Provide sub-minute dashboards fed by a streaming pipeline.
Run monthly company-wide SQL analytics without managing infrastructure.
Enforce centralized security, data quality, and lineage controls, while letting regional business units own their datasets.
Which high-level design best satisfies these goals with minimal operational overhead?
Write raw events into Bigtable, schedule Cloud Composer DAGs to copy data into BigQuery, and use Data Catalog in each project for discovery and policy control.
Stream JSON payloads straight into a single BigQuery dataset, store images as Base64 strings, and manage access with manual dataset-level ACLs across regions.
Ingest all data directly into a global Spanner database to serve both real-time dashboards and analytical SQL queries, enforcing governance through IAM on Spanner tables.
Land raw data in Cloud Storage, register it in Dataplex raw zones, process streams with Dataflow into curated BigQuery tables that are also governed by Dataplex.
The correct design lands raw data in Cloud Storage under Dataplex raw zones and uses Dataflow to stream curated data into BigQuery. Cloud Storage provides virtually unlimited, inexpensive object storage for raw semi-structured and unstructured data. Dataplex builds a logical lake with zones on top of Cloud Storage and BigQuery, automatically catalogs assets, applies fine-grained IAM and data quality rules, and supports a federated "data-as-a-product" model without forcing all data into a single project. Dataflow, a serverless runner for Apache Beam pipelines, can transform streaming data and load it continuously into BigQuery, whose serverless architecture supports interactive SQL analytics at scale without cluster management. The alternatives fall short: Bigtable and Spanner are ill-suited to storing images and large, variable-schema files; scheduled Cloud Composer copy jobs add latency and maintenance that work against sub-minute dashboards; and per-project Data Catalog entries or manual BigQuery dataset ACLs do not meet the centralized, cross-domain governance requirement.
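To make the raw-versus-curated split more concrete, here is a minimal sketch of the per-event logic such a pipeline might apply. It is not taken from the question: the field names in `CURATED_FIELDS`, the `raw/<region>/<source>/<date>/` object layout, and both function names are assumptions chosen for illustration, and the code uses only the Python standard library so it can be run without any Google Cloud setup.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Hypothetical curated schema; these column names are assumptions.
CURATED_FIELDS = ("event_id", "user_id", "region", "event_type", "event_ts")


def raw_object_name(region: str, source: str, event_ts: datetime, event_id: str) -> str:
    """Build a date-partitioned object name for a raw-zone bucket.

    The layout is an illustrative convention, not a Dataplex requirement.
    """
    day = event_ts.astimezone(timezone.utc).strftime("%Y/%m/%d")
    return f"raw/{region}/{source}/{day}/{event_id}.json"


def to_curated_row(payload: str) -> Optional[dict]:
    """Flatten one raw JSON event into a fixed-schema row.

    Returns None for events that cannot be parsed, so the caller can
    route them to a dead-letter location instead of the curated table.
    """
    try:
        event = json.loads(payload)
        ts = datetime.fromisoformat(event["event_ts"])
    except (ValueError, KeyError, TypeError):
        return None
    if ts.tzinfo is None:
        # Assumption: timestamps without an offset are already UTC.
        ts = ts.replace(tzinfo=timezone.utc)

    # Keep only the curated columns; extra raw fields stay in Cloud Storage.
    row = {name: event.get(name) for name in CURATED_FIELDS}
    if row["event_id"] is None:
        return None
    row["event_ts"] = ts.astimezone(timezone.utc).isoformat()
    return row
```

In an Apache Beam pipeline running on Dataflow, a function like `to_curated_row` could be applied to each element with `beam.Map`, with the `None` results filtered out before the BigQuery write. The untouched payload would still be kept in the raw zone, so curated tables can be rebuilt if the schema changes.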
GCP Professional Data Engineer
Storing the data