Chapter 5 Testing forecast accuracy
Once you have found a set of candidate models, you are ready to compare their forecasts and choose one model for forecasting.
To quantify forecast performance, we need to make forecasts for data that we already have, so that we can compare the forecasts to the actual observations. There are two approaches to this.
5.0.1 Training set/test set
One approach is to ‘hold out’ some of your data as the test data and not use it at all when fitting. To measure forecast performance, you fit to your training data and test the forecast against the data in the test set. This is the approach that Stergiou and Christou used.
Stergiou and Christou used 1964-1987 as their training data and tested their forecasts against 1988 and 1989.
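A split like this can be set up with `stats::window()`. The sketch below uses a simulated stand-in for the anchovy log-landings series (the object names `anchovy`, `traindat`, and `testdat` are assumptions for illustration; substitute the real data):

```r
# Hypothetical stand-in for the anchovy log-landings series, 1964-1989;
# replace with the real ts object from the book's data
set.seed(42)
anchovy <- ts(10 + cumsum(rnorm(26, 0, 0.1)), start = 1964)

# Hold out the last two years as the test set
traindat <- window(anchovy, start = 1964, end = 1987)
testdat  <- window(anchovy, start = 1988, end = 1989)

length(traindat)  # 24 years used for fitting
length(testdat)   # 2 years held out for testing
```

Because `traindat` and `testdat` are both `ts` objects, they keep their year indexing, which makes the later forecast-versus-actual comparison straightforward.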
Forecast versus actual
We will fit to the training data and make a forecast for the test data. We can then compare the forecast to the actual values in the test data.
fit1 <- forecast::auto.arima(traindat)  # fit to the 1964-1987 training data
fr <- forecast::forecast(fit1, h=2)     # forecast 1988 and 1989
fr
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1988 10.03216 9.789577 10.27475 9.661160 10.40317
## 1989 10.09625 9.832489 10.36001 9.692861 10.49964
Plot the forecast and compare to the actual values in 1988 and 1989.
plot(fr)
points(testdat, pch=2, col="red")
legend("topleft", c("forecast","actual"), pch=c(20,2), col=c("blue","red"))
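Beyond a visual comparison, `forecast::accuracy()` summarizes the forecast errors (RMSE, MAE, MAPE, etc.) on both the training and test sets. A minimal sketch, again using a simulated stand-in for the real series:

```r
library(forecast)

# Hypothetical stand-in for the anchovy series; replace with the real data
set.seed(42)
anchovy  <- ts(10 + cumsum(rnorm(26, 0, 0.1)), start = 1964)
traindat <- window(anchovy, end = 1987)
testdat  <- window(anchovy, start = 1988)

fit1 <- auto.arima(traindat)
fr   <- forecast(fit1, h = 2)

# One row of error statistics for the training set, one for the test set
accuracy(fr, testdat)
```

The test-set row is the one that measures true out-of-sample performance; the training-set row only measures fit.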
5.0.2 Cross-Validation
An alternative approach is to use cross-validation. This approach uses windows, or shorter segments of the whole time series, to make a series of single forecasts. We can use either a variable-length or a fixed-length window.
Variable window
For the variable-length window approach applied to the anchovy time series, we would fit the model to 1964-1973 and forecast 1974, then fit to 1964-1974 and forecast 1975, then 1964-1975 and forecast 1976, and continue up to 1964-1988 and forecast 1989. This would create 16 forecasts, which we would compare to the actual landings. The window is ‘variable’ because the length of the time series used to fit the model keeps increasing by one year.
Fixed window
Another approach uses a fixed-length window, for example a 10-year window. In this case we would fit to 1964-1973 and forecast 1974, then 1965-1974 and forecast 1975, and so on: the earliest year is dropped each time the window slides forward, so every model is fit to exactly 10 years of data.
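The `window` argument of `forecast::tsCV()` gives a fixed-length fitting window. A sketch on simulated stand-in data (the series and the 10-year window length are assumptions for illustration):

```r
library(forecast)

# Hypothetical stand-in for the anchovy series, 1964-1989
set.seed(42)
anchovy <- ts(10 + cumsum(rnorm(26, 0, 0.1)), start = 1964)

# Forecast function passed to tsCV(): refit auto.arima to each window
far <- function(y, h) forecast(auto.arima(y), h = h)

# window = 10 keeps the fitting window fixed at 10 years:
# fit 1964-1973 forecast 1974, fit 1965-1974 forecast 1975, ...
e <- tsCV(anchovy, far, h = 1, window = 10)

sqrt(mean(e^2, na.rm = TRUE))  # cross-validated RMSE
```

`tsCV()` returns the forecast errors aligned with the original series, with `NA` for years where no forecast was made, so `na.rm = TRUE` is needed when summarizing.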
Cross-validation farther in future
Sometimes it makes more sense to test the performance of forecasts that are farther in the future. For example, if the data from your catch surveys take some time to process, then you might need to make forecasts that are more than 1 year from your last data point.
In that case, there is a gap between the end of your training data and your test data point.
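One way to evaluate such gapped forecasts is to cross-validate at a longer horizon. With `h = 2`, `forecast::tsCV()` returns one column of errors per horizon, and the 2-step-ahead column corresponds to forecasts with a one-year gap after the training data. A sketch on simulated stand-in data (series and settings are assumptions for illustration):

```r
library(forecast)

# Hypothetical stand-in for the anchovy series, 1964-1989
set.seed(42)
anchovy <- ts(10 + cumsum(rnorm(26, 0, 0.1)), start = 1964)

far <- function(y, h) forecast(auto.arima(y), h = h)

# h = 2: each forecast skips one year past the end of the training data;
# initial = 10 means the first fit uses the first 10 years
e <- tsCV(anchovy, far, h = 2, initial = 10)

# e has one column per horizon; column 2 holds the 2-step-ahead errors
sqrt(mean(e[, 2]^2, na.rm = TRUE))  # RMSE for 2-year-ahead forecasts
```

Scoring only the horizon that matches your real forecasting situation (here, 2 years ahead) gives a performance measure that reflects how the model would actually be used.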