A data scientist is forecasting the daily number of support tickets for an online SaaS platform. After fitting an ARIMA(1,1,1) model without any seasonal components, she examines the residual diagnostics and finds:
The residual ACF shows statistically significant positive spikes at lags 7, 14, 21 and 28, while the remaining lags fall inside the 95% confidence bands.
The residual PACF tapers rapidly after lag 1.
A KPSS test on the residuals fails to reject the null hypothesis of level stationarity.
Which underlying data issue is most likely producing the repeating autocorrelation pattern, and what is the most effective first modelling adjustment?
A persistent linear trend; apply a first difference to remove the trend.
Non-linear variance growth; perform a Box-Cox power transformation before refitting.
High multicollinearity among exogenous predictors; compute variance-inflation factors and drop redundant features.
Weekly seasonality; apply a seasonal difference at lag 7 and include seasonal AR or MA terms (convert to a SARIMA model).
Spikes in the residual ACF at regular multiples of a fixed lag indicate that a repeating pattern of that length remains unmodelled. For daily data, significant autocorrelation at lags 7, 14, 21 and 28 is characteristic of a weekly seasonal cycle. The ARIMA(1,1,1) model already applies a first difference (d = 1), and the KPSS test does not reject level stationarity of the residuals, so the primary deficiency is the absence of a seasonal component rather than an unremoved trend. Nothing in the diagnostics points to growing variance, and the fitted model has no exogenous predictors, so a Box-Cox transformation or variance-inflation factors would not address this seasonality-driven autocorrelation. Introducing a seasonal difference at lag 7 (D = 1, m = 7) and/or adding seasonal AR or MA terms converts the model to a SARIMA specification that can capture the weekly pattern, thereby eliminating the systematic residual correlation.
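The diagnostic described above can be illustrated with a minimal sketch in plain Python. It does not fit an ARIMA or SARIMA model. It only simulates a synthetic series with weekly dependence, computes the sample ACF, and shows what a seasonal difference at lag 7 does to the spikes. The helper names `sample_acf` and `seasonal_difference`, the seasonal coefficient 0.8, the series length and the random seed are all assumptions chosen for illustration, not values from the scenario.

```python
import random


def sample_acf(x, max_lag):
    """Return the sample autocorrelations r_0 .. r_max_lag of the list x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def seasonal_difference(x, m):
    """Return the series x_t - x_{t-m}; the result is m points shorter."""
    return [x[t] - x[t - m] for t in range(m, len(x))]


# Synthetic stand-in for residuals that still carry weekly structure:
# each value depends on the value seven days earlier (assumed coefficient 0.8).
random.seed(0)
n_days = 1400
series = []
for t in range(n_days):
    previous_week = series[t - 7] if t >= 7 else 0.0
    series.append(0.8 * previous_week + random.gauss(0.0, 1.0))

# Approximate 95% band for the ACF of white noise: +/- 1.96 / sqrt(n).
band = 1.96 / n_days ** 0.5

acf_raw = sample_acf(series, 28)
acf_diff = sample_acf(seasonal_difference(series, 7), 28)

print(f"95% band: +/-{band:.3f}")
for lag in (7, 14, 21, 28):
    print(f"lag {lag:2d}: before = {acf_raw[lag]:+.3f}, "
          f"after lag-7 difference = {acf_diff[lag]:+.3f}")
```

In this synthetic setting the autocorrelations at lags 7, 14, 21 and 28 sit well above the band before differencing and drop sharply afterwards. Seasonal differencing can leave some negative autocorrelation at the seasonal lags, which is one reason to pair it with seasonal AR or MA terms. In practice the full model would be fitted with a library; for example, statsmodels' `SARIMAX` accepts `order=(1, 1, 1)` together with a `seasonal_order=(P, D, Q, 7)` tuple, where the values of P and Q would need to be chosen from the data.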