During a severe-weather season, a municipal emergency-operations center wants to create a real-time dashboard that surfaces new public-safety incidents from a continuous Twitter firehose. The system must (a) trigger an alert within seconds of a collective event starting, (b) work on streaming text without any pre-labeled event examples, and (c) merge related tweets into a single incident summary so operators are not flooded with duplicate messages.
Which processing approach best satisfies all three requirements while keeping false-positive alerts low?
Fit a seasonal ARIMA model to total tweet volume and raise an alert whenever the forecasting residual exceeds a threshold.
Apply burst detection to per-keyword time series and then use graph-based clustering of co-occurring terms to build event clusters in real time.
Run a pre-trained BERT sentiment classifier on every tweet and alert when the negative-sentiment probability crosses 0.8.
Vectorize each tweet with static TF-IDF features and run k-means clustering once per night to discover trending topics.
Burst-detection algorithms monitor the arrival rate of individual keywords and raise a signal as soon as a term's frequency rises sharply above its recent baseline. Because they can run online, typically as a probabilistic automaton whose state is updated with each arrival, they can react within seconds and need no pre-labeled data. Once a bursty term is flagged, a graph-based clustering step groups tweets that share co-occurring terms or user interactions, yielding one coherent cluster per incident and filtering out isolated noise terms. Published work on event detection in social-media streams reports that this combination achieves high precision for high-impact, breaking events.
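To make the two stages concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a production detector: it swaps the probabilistic automaton for a simple mean-plus-deviation threshold over recent time windows, and all function names and parameters (`bursty_terms`, `cluster_terms`, `z`, `min_count`, `min_cooccur`) are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from statistics import mean, pstdev


def tokenize(text):
    """Crude tokenizer (assumption): lowercase, strip punctuation, keep words longer than 3 chars."""
    words = (w.strip(".,!?#@").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def bursty_terms(history, current, z=2.0, min_count=3):
    """Flag terms whose count in the current window far exceeds their recent baseline.

    history: list of Counter objects, one per past time window.
    current: Counter for the window being evaluated.
    This threshold rule is a simple stand-in for an automaton-based detector.
    """
    flagged = set()
    for term, count in current.items():
        past = [window.get(term, 0) for window in history]
        mu = mean(past) if past else 0.0
        sigma = pstdev(past) if past else 0.0
        # Floor sigma at 1.0 so rare terms need a real jump, not a single mention.
        if count >= min_count and count > mu + z * max(sigma, 1.0):
            flagged.add(term)
    return flagged


def cluster_terms(tweet_tokens, bursty, min_cooccur=2):
    """Link bursty terms that co-occur in enough tweets; each connected component is one event."""
    edge_counts = Counter()
    for tokens in tweet_tokens:
        for a, b in combinations(sorted(tokens & bursty), 2):
            edge_counts[(a, b)] += 1

    graph = defaultdict(set)
    for (a, b), n in edge_counts.items():
        if n >= min_cooccur:
            graph[a].add(b)
            graph[b].add(a)

    clusters, seen = [], set()
    for term in sorted(bursty):
        if term in seen:
            continue
        component, stack = set(), [term]
        while stack:  # depth-first search over the co-occurrence graph
            t = stack.pop()
            if t in component:
                continue
            component.add(t)
            stack.extend(graph[t] - component)
        seen |= component
        clusters.append(component)
    return clusters


def detect_events(history, tweets):
    """Run both stages on one window of raw tweet texts."""
    tweet_tokens = [tokenize(t) for t in tweets]
    current = Counter(tok for tokens in tweet_tokens for tok in tokens)
    return cluster_terms(tweet_tokens, bursty_terms(history, current))
```

Calling `detect_events(history, tweets)` once per short window (say, every few seconds) returns one set of terms per candidate incident; tweets containing those terms can then be merged into a single summary. Requiring both a minimum count and a minimum co-occurrence is one way to keep false-positive alerts down.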
Static TF-IDF vectors with nightly k-means violate the timeliness constraint and ignore newly emerging terms. Seasonal ARIMA on total tweet volume can detect spikes but cannot distinguish different incidents, and volume-only methods are prone to "alert swamping." Sentiment classification with a BERT model measures emotional tone, not whether a specific, shared event is unfolding, so it cannot reliably aggregate posts into incident clusters.