Holt Winters Forecasting with CVG Temperature Data
Introduction
Data
The goal of this exercise is to use Holt-Winters forecasting, also known as triple exponential smoothing, to predict future monthly average temperatures in the Cincinnati area. The data was taken from the National Weather Service, and contains the monthly temperature averages in Fahrenheit from January 2000 through December 2022.
I used the great e-textbook Forecasting: Principles and Practice by Hyndman and Athanasopoulos to inform my knowledge and methodology.
Holt-Winters
Exponential smoothing is a forecasting method that weights previous observations to inform its predicted values; as observations get older, their weights decay exponentially. The basic method does not account for trend or seasonality, but it was extended by Holt, and later by Winters, to capture both, hence the name Holt-Winters.
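To make the exponential decay concrete, here is a small illustration: in simple exponential smoothing the weight on the observation j steps in the past is α(1 − α)^j. The value α = 0.3 below is an arbitrary example, not a fitted parameter:

```r
# Weight on the observation j steps back: alpha * (1 - alpha)^j
alpha <- 0.3
round(alpha * (1 - alpha)^(0:5), 4)
## [1] 0.3000 0.2100 0.1470 0.1029 0.0720 0.0504
```

The most recent observation gets the largest weight, and the weights shrink geometrically from there.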
Model parameters, as explained by Hyndman and Athanasopoulos:
“The Holt-Winters seasonal method comprises the forecast equation and three smoothing equations — one for the level ℓt, one for the trend bt, and one for the seasonal component st, with corresponding smoothing parameters α, β∗ and γ.”
α, β, and γ adjust the weight placed on recent observations for the level (accounting for noise/randomness), the trend, and the seasonality, respectively.
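For reference, the additive Holt-Winters equations in Hyndman and Athanasopoulos's notation (with m the seasonal period, m = 12 for monthly data) are:

ŷt+h|t = ℓt + h·bt + st+h−m(k+1)
ℓt = α(yt − st−m) + (1 − α)(ℓt−1 + bt−1)
bt = β∗(ℓt − ℓt−1) + (1 − β∗)bt−1
st = γ(yt − ℓt−1 − bt−1) + (1 − γ)st−m

where k is the integer part of (h − 1)/m, which ensures the seasonal index used in the forecast comes from the final year of the sample.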
Setup
Reshaping data
library(reshape2)
library(dplyr)
library(lubridate)
library(data.table)
library(readxl)
library(forecast)
library(ggplot2)
library(seasonal)
Loading temperature data:
temps <- read_excel("temps.xlsx", col_names = FALSE)
head(temps)
## # A tibble: 6 × 13
## ...1 ...2 ...3 ...4 ...5 ...6 ...7 ...8 ...9 ...10 ...11 ...12 ...13
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 2 2000 27 37.5 45.9 51.4 64.9… 71.5… 72.4… 71.3 64.8 57.1 41 23.3
## 3 2001 28.7 35.29… 38.1 56.8 63.6 71.0… 74.3 75.2 64.4… 55.7 49.6 38.4
## 4 2002 35.6 35.9 41.3 54.6 58.7 73.5 77.9… 76.0… 70.7 53.5 41.4 32.4
## 5 2003 22.6 26.9 43.2 54.9 60.8 67.3 73.4… 73.5… 64.4… 52.8 47.3 33.7…
## 6 2004 24.2 31.9 43.5 52.6 66.8 70.0… 73.5… 70.5… 68 55 46.2 32.6
We need to get this into time series format. Transposing and using reshape2 to melt the data frame:
#Transpose
temps_t <- transpose(temps)
#Fix column names
names(temps_t) <- temps_t[1,]
temps_t <- temps_t[-1,]
colnames(temps_t)[1] <- "Month"
#Melt
molten <- melt(temps_t, id.vars = c("Month"))
head(molten)
## Month variable value
## 1 Jan 2000 27
## 2 Feb 2000 37.5
## 3 Mar 2000 45.9
## 4 Apr 2000 51.4
## 5 May 2000 64.900000000000006
## 6 Jun 2000 71.599999999999994
Next, converting to time series:
molten$date <- paste(molten$Month, molten$variable, sep="-")
df <- molten %>% select(-c(Month,variable))
df$date <- my(df$date)
df$value <- as.numeric(df$value)
temps.ts <- ts(df$value, frequency = 12, start = c(2000, 1))
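The my() parser from lubridate is what handles the "Month-Year" strings built above; a minimal standalone check of its behavior:

```r
library(lubridate)
# my() parses month-year strings such as those produced by paste() above
my("Jan-2000")
## [1] "2000-01-01"
```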
plot(temps.ts)
Time series analysis
Time series decomposition
We can decompose the time series to extract specific elements such as noise, trend, and seasonality. The method used is X11 decomposition.
temps.ts %>% seas(x11="") -> fit
autoplot(fit) +
ggtitle("X11 decomposition of average monthly Cincinnati temperatures")
Here, we can see an unclear trend, strong seasonality, and moderate randomness.
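As a cross-check, a loess-based STL decomposition (an alternative to X11, not part of the original analysis) should give a similar picture; this sketch assumes temps.ts and the packages loaded above:

```r
# STL decomposition: seasonal component estimated by loess
temps.ts %>% stl(s.window = "periodic") %>% autoplot()
```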
Holt-Winters implementation
Model selection
The ets() function from the forecast package will automatically select a model based on the corrected Akaike information criterion, or AICc.
fit <- ets(temps.ts)
summary(fit)
## ETS(A,N,A)
##
## Call:
## ets(y = temps.ts)
##
## Smoothing parameters:
## alpha = 0.004
## gamma = 1e-04
##
## Initial states:
## l = 53.8914
## s = -19.0083 -9.7741 1.8205 14.0466 20.593 21.6879
## 18.3794 9.9589 -0.1465 -11.2616 -21.8917 -24.4044
##
## sigma: 3.6412
##
## AIC AICc BIC
## 2280.224 2282.070 2334.530
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.115928 3.547672 2.727518 -0.6958386 6.442588 0.6933699
## ACF1
## Training set 0.09159075
From the output, we can see that ets() selected an ETS(A,N,A) model: additive errors, no trend component, and additive seasonality. Accordingly, only the α and γ smoothing parameters were estimated, with no β.
Per Hyndman and Athanasopoulos,
“The additive method is preferred when the seasonal variations are roughly constant through the series, while the multiplicative method is preferred when the seasonal variations are changing proportional to the level of the series. With the additive method, the seasonal component is expressed in absolute terms in the scale of the observed series, and in the level equation the series is seasonally adjusted by subtracting the seasonal component. Within each year, the seasonal component will add up to approximately zero. With the multiplicative method, the seasonal component is expressed in relative terms (percentages), and the series is seasonally adjusted by dividing through by the seasonal component.”
The omission of β lines up, as the decomposition did not show any clear trend.
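One way to verify that omitting the trend is justified is to force a trend component and compare AICc values. This is a sketch, assuming temps.ts and the forecast package from above, with ETS(A,A,A) as the hypothetical alternative:

```r
fit_ana <- ets(temps.ts)                 # automatic selection: ETS(A,N,A)
fit_aaa <- ets(temps.ts, model = "AAA")  # force an additive trend
c(ANA = fit_ana$aicc, AAA = fit_aaa$aicc)
# If the trend adds nothing, the forced model should show a higher (worse) AICc
```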
Forecasting
Predicting 2023’s temperatures:
fit %>% forecast(h=12) %>%
  autoplot() + xlim(2020,2025) + ylab("Temperature (F)")
model <- fit %>% forecast(h=12)
Text output of forecast:
summary(model)
##
## Forecast method: ETS(A,N,A)
##
## Model Information:
## ETS(A,N,A)
##
## Call:
## ets(y = temps.ts)
##
## Smoothing parameters:
## alpha = 0.004
## gamma = 1e-04
##
## Initial states:
## l = 53.8914
## s = -19.0083 -9.7741 1.8205 14.0466 20.593 21.6879
## 18.3794 9.9589 -0.1465 -11.2616 -21.8917 -24.4044
##
## sigma: 3.6412
##
## AIC AICc BIC
## 2280.224 2282.070 2334.530
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.115928 3.547672 2.727518 -0.6958386 6.442588 0.6933699
## ACF1
## Training set 0.09159075
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2023 29.61491 24.94850 34.28133 22.47824 36.75158
## Feb 2023 32.12756 27.46111 36.79402 24.99084 39.26429
## Mar 2023 42.75761 38.09112 47.42410 35.62083 49.89439
## Apr 2023 53.87266 49.20613 58.53919 46.73582 61.00950
## May 2023 63.97825 59.31168 68.64481 56.84135 71.11514
## Jun 2023 72.39888 67.73228 77.06548 65.26193 79.53583
## Jul 2023 75.70724 71.04060 80.37388 68.57023 82.84425
## Aug 2023 74.61246 69.94578 79.27913 67.47539 81.74952
## Sep 2023 68.06591 63.39920 72.73262 60.92879 75.20303
## Oct 2023 55.84010 51.17335 60.50685 48.70292 62.97728
## Nov 2023 44.24518 39.57840 48.91197 37.10795 51.38242
## Dec 2023 35.01068 30.34386 39.67751 27.87339 42.14798
Let’s check back in a year and see the accuracy of 2023’s forecast!
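When the 2023 observations arrive, forecast::accuracy() can score the forecast against them. The actual_2023 values below are hypothetical placeholders, not real data; substitute the published NWS monthly averages:

```r
# Placeholder values only -- replace with the NWS 2023 monthly averages
actual_2023 <- ts(c(38, 39, 47, 56, 64, 71, 76, 74, 68, 57, 45, 41),
                  frequency = 12, start = c(2023, 1))
accuracy(model, actual_2023)  # adds a "Test set" row of out-of-sample errors
```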