A media company plans to build a nightly batch job that merges three existing datasets before loading results into BigQuery: (1) terabytes of raw video metadata currently stored in an object store, (2) historical analytics already persisted in the enterprise data warehouse, and (3) transactional user-profile data that must remain globally consistent across regions and support SQL queries. Which Google Cloud services should be configured as the batch sources for this pipeline?
For a daily batch workflow you should choose batch-oriented, durable storage systems that already hold the required data:
Raw video metadata in an object store maps directly to Cloud Storage, Google Cloud's highly durable object storage service.
Historical analytics data kept in the existing warehouse can be read from BigQuery, which supports batch exports and queries.
Transactional user-profile data that must stay strongly consistent across regions and support SQL queries fits Cloud Spanner, Google Cloud's multi-region, horizontally scalable relational database.
The combination of Cloud Storage, BigQuery, and Spanner therefore covers all three data sources and is fully supported as batch inputs. The other options are unsuitable:
Pub/Sub is primarily a streaming ingestion service, not intended as a batch data store.
Datastore is a NoSQL document database; it does not offer full relational SQL (for example, joins), so it does not fit the relational user-profile requirement.
Bigtable is a wide-column NoSQL store optimized for low-latency access, not relational queries, and Dataproc is a processing engine, not a data source. Hence, these alternatives do not satisfy the stated requirements.
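As a rough illustration of the answer, the three batch sources could be declared in a small configuration helper like the one below. This is only a sketch under stated assumptions: the `BatchSource` class, the `build_sources` function, the `video-metadata/` prefix, and every project, bucket, dataset, instance, and database name are hypothetical placeholders, not part of the question. Only the resource identifier formats (a `gs://` URI, a `project.dataset.table` reference, and a Spanner database path) follow Google Cloud conventions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchSource:
    """One batch input for the nightly job (hypothetical helper type)."""
    dataset: str   # which of the three datasets this source covers
    service: str   # Google Cloud service that already holds the data
    location: str  # resource identifier the job would read from


def build_sources(project, bucket, bq_dataset, bq_table, instance, database):
    """Return the three batch sources named in the explanation.

    All arguments are placeholder names supplied by the caller.
    """
    return [
        # Raw video metadata: files in an object store -> Cloud Storage.
        BatchSource(
            "raw video metadata",
            "Cloud Storage",
            f"gs://{bucket}/video-metadata/*",  # assumed object prefix
        ),
        # Historical analytics: already in the warehouse -> BigQuery.
        BatchSource(
            "historical analytics",
            "BigQuery",
            f"{project}.{bq_dataset}.{bq_table}",
        ),
        # Globally consistent relational profiles -> Cloud Spanner.
        BatchSource(
            "user profiles",
            "Cloud Spanner",
            f"projects/{project}/instances/{instance}/databases/{database}",
        ),
    ]


if __name__ == "__main__":
    for src in build_sources(
        "demo-project", "media-raw", "analytics", "history",
        "profiles-instance", "profiles-db",
    ):
        print(f"{src.dataset}: {src.service} <- {src.location}")
```

Keeping the sources in one list like this makes it easy to see that each dataset maps to exactly one storage service, while Pub/Sub, Datastore, Bigtable, and Dataproc do not appear as inputs.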
GCP Professional Data Engineer
Ingesting and processing the data