5.3 Testing the candidate model set

With a set of candidate models, we can prepare tables showing the forecast performance for each model.

For each model, we will do the same steps:

  1. Fit the model

  2. Create forecasts for a test data set or use cross-validation

  3. Compute forecast accuracy metrics for the forecasts

Note when you compare models, you can use both ‘training data/test data’ and use time-series cross-validation, but report the metrics in separate columns. Example, ‘RMSE from tsCV’ and ‘RMSE from test data’.


5.3.1 Fit each of our candidate models

We will define the training data as 1964 to 1987 and the test data as 1988 and 1989. The full data is 1964 to 1989.

fulldat <- window(anchovyts, 1964, 1989)
traindat <- window(anchovyts, 1964, 1987)
testdat <- window(anchovyts, 1988, 1989)

We will store our fits and forecasts in a list for easy access. fun.list is the function to pass to tsCV().

fit.list <- list()
fr.list <- list()
fun.list <- list()
n.fr <- length(testdat)

For each model, we will fit, forecast, and define a forecast function.

  • Exponential smoothing model with trend
modelname <- "ETS w trend"
fit <- ets(traindat, model="AAN")
fit.list[[modelname]] <- fit
fr.list[[modelname]] <- forecast(fit, h=n.fr)
fun.list[[modelname]] <- function(x, h){ forecast(ets(x, model="AAN"), h=h) }
  • Exponential smoothing model no trend
modelname <- "ETS no trend"
fit <- ets(traindat, model="ANN")
fit.list[[modelname]] <- fit
fr.list[[modelname]] <- forecast(fit, h=n.fr)
fun.list[[modelname]] <- function(x, h){ forecast(ets(x, model="ANN"), h=h) }
  • ARIMA(0,1,1) with drift (best)
modelname <- "ARIMA(0,1,1) w drift"
fit <- Arima(traindat, order=c(0,1,1), include.drift=TRUE)
fit.list[[modelname]] <- fit
fr.list[[modelname]] <- forecast(fit, h=n.fr)
fun.list[[modelname]] <- function(x, h){ forecast(Arima(x, order=c(0,1,1), include.drift=TRUE),h=h) }
  • ARIMA(2,1,0) with drift (within 2 AIC of best)
modelname <- "ARIMA(2,1,0) w drift"
fit <- Arima(traindat, order=c(2,1,0), include.drift=TRUE)
fit.list[[modelname]] <- fit
fr.list[[modelname]] <- forecast(fit, h=n.fr)
fun.list[[modelname]] <- function(x, h){ forecast(Arima(x, order=c(2,1,0), include.drift=TRUE),h=h) }
  • Time-varying regression with linear time
TT <- length(traindat)
#make a data.frame for lm
dat <- data.frame(log.metric.tons=traindat, t=1:TT)
modelname <- "TV linear regression"
fit <- lm(log.metric.tons ~ t, data=dat)
fit.list[[modelname]] <- fit
fr.list[[modelname]] <- forecast(fit, newdata=data.frame(t=TT+1:n.fr))
fun.list[[modelname]] <- function(x, h){ 
  TT <- length(x)
  dat <- data.frame(log.metric.tons=x, t=1:TT)
  ft <- lm(log.metric.tons ~ t, data=dat)
  forecast(ft, newdata=data.frame(t=TT+h))
  }
  • Naive no trend
modelname <- "Naive"
fit <- Arima(traindat, order=c(0,1,0))
fit.list[[modelname]] <- fit
fr.list[[modelname]] <- forecast(fit, h=n.fr)
fun.list[[modelname]] <- function(x, h){ rwf(x,h=h) }
  • Naive with trend
modelname <- "Naive w trend"
fit <- Arima(traindat, order=c(0,1,0), include.drift=TRUE)
fit.list[[modelname]] <- fit
fr.list[[modelname]] <- forecast(fit, h=n.fr)
fun.list[[modelname]] <- function(x, h){ rwf(x, drift=TRUE, h=h) }
  • Average or mean
modelname <- "Average"
fit <- Arima(traindat, order=c(0,0,0))
fit.list[[modelname]] <- fit
fr.list[[modelname]] <- forecast(fit, h=n.fr)
fun.list[[modelname]] <- function(x, h){ forecast(Arima(x, order=c(0,0,0)),h=h) }