You are designing an Apache Beam pipeline to run on Cloud Dataflow that must process IoT temperature events published to Pub/Sub in real time. The business needs to calculate a rolling 5-minute average temperature per device and write the results to BigQuery every minute. Events can arrive late for up to 10 minutes and must still update the previously emitted aggregates. The same transformation logic must also be reused in a nightly back-fill job that reprocesses the last month of raw events stored in Cloud Storage. Which Beam configuration best satisfies these functional requirements while minimizing duplicated code?
Use fixed 5-minute windows with the default watermark, omit allowed lateness, and maintain two separate pipelines, one for streaming from Pub/Sub and one for batch from Cloud Storage.
Use session windows with a 10-minute inactivity gap, enable speculative processing triggers, and add custom side inputs when running the batch back-fill job.
Configure a 5-minute sliding window with a 1-minute hop, set allowed lateness to 10 minutes, and run the same Beam pipeline against Pub/Sub for streaming and Cloud Storage for batch back-fill.
Apply a global window with processing-time triggers that fire every minute and filter out records arriving later than the trigger to keep aggregates stable.
A sliding (also called hopping) window of 5 minutes with a 1-minute period starts a new window every minute, so each device gets an updated average every minute that still covers the full five minutes of history. Setting allowed lateness to 10 minutes tells the runner to retain each window's state for 10 minutes after the watermark passes the end of the window; late elements that arrive in that interval are added to the window, and the recomputed aggregate can be emitted again and written to BigQuery. Because Apache Beam uses a single programming model for bounded and unbounded data, the same windowing and aggregation transforms can run against an unbounded Pub/Sub source (streaming) or a bounded Cloud Storage source (batch), so there is no need to maintain separate code paths. Fixed windows would emit only one result per five-minute slice and, without allowed lateness, would drop late data. Session windows group events by gaps in activity, so they guarantee neither a five-minute span nor regular one-minute updates. A global window with processing-time triggers that filters out late records discards exactly the data that must update earlier results, and it does not bound the average to the last five minutes of event time, so it does not satisfy the retroactive update requirement.
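The window arithmetic behind the correct option can be sketched in plain Python. This is a simplified model of the semantics, not Beam API code: the names `sliding_windows` and `is_accepted` are hypothetical, timestamps are assumed to be whole seconds, and the exact boundary at which a real runner expires a window may differ slightly from this model.

```python
from typing import List, Tuple

WINDOW_SIZE = 300       # 5-minute window, in seconds
WINDOW_PERIOD = 60      # 1-minute hop, in seconds
ALLOWED_LATENESS = 600  # 10 minutes, in seconds


def sliding_windows(event_ts: int) -> List[Tuple[int, int]]:
    """Return every [start, end) sliding window that contains event_ts."""
    # Start of the most recent window that contains the event.
    last_start = event_ts - (event_ts % WINDOW_PERIOD)
    # Step back one period at a time while the window still covers event_ts.
    starts = range(last_start, last_start - WINDOW_SIZE, -WINDOW_PERIOD)
    return [(start, start + WINDOW_SIZE) for start in sorted(starts)]


def is_accepted(window_end: int, watermark: int) -> bool:
    """Simplified rule: a window takes elements until the watermark
    passes its end plus the allowed lateness."""
    return watermark < window_end + ALLOWED_LATENESS


if __name__ == "__main__":
    for window in sliding_windows(1000):
        print(window, is_accepted(window[1], watermark=1500))
```

Under this model an event with timestamp 1000 belongs to five overlapping windows, which is why one late reading can cause up to five previously emitted averages to be recomputed.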
GCP Professional Data Engineer
Ingesting and processing the data