5.1 Metrics
How do we quantify the difference between the forecast and the actual values in the test data set? Let's continue with our training set/test set example.
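As a reminder of that setup, a split like this can be made with window(). This is only a minimal sketch: dat.ts is a placeholder name for the series used earlier in the chapter, and the test years are taken from the forecast output below.
# Minimal sketch of the training/test split (dat.ts is a placeholder name)
traindat <- window(dat.ts, end=1987)                # training data up to 1987
testdat <- window(dat.ts, start=1988, end=1989)     # 2-year test set
fit <- forecast::auto.arima(traindat)               # model fit to the training data only
fr <- forecast::forecast(fit, h=2)                  # 2-step-ahead forecast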
The forecast errors are the difference between the test data and the forecasts.
fr.err <- testdat - fr$mean
fr.err
## Time Series:
## Start = 1988
## End = 1989
## Frequency = 1
## [1] -0.1704302 -0.4944778
5.1.1 accuracy() function
The accuracy() function in the forecast package provides many different metrics, such as mean error, root mean square error, mean absolute error, mean percentage error, and mean absolute percentage error. It requires a forecast object and a test data set of the same length.
accuracy(fr, testdat)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.00473511 0.1770653 0.1438523 -0.1102259 1.588409 0.7698386
## Test set -0.33245398 0.3698342 0.3324540 -3.4390277 3.439028 1.7791577
## ACF1 Theil's U
## Training set -0.04312022 NA
## Test set -0.50000000 1.90214
The metrics are:
ME Mean error
me <- mean(fr.err)
me
## [1] -0.332454
RMSE Root mean squared error
rmse <- sqrt(mean(fr.err^2))
rmse
## [1] 0.3698342
MAE Mean absolute error
mae <- mean(abs(fr.err))
mae
## [1] 0.332454
MPE Mean percentage error
fr.pe <- 100*fr.err/testdat
mpe <- mean(fr.pe)
mpe
## [1] -3.439028
MAPE Mean absolute percentage error
mape <- mean(abs(fr.pe))
mape
## [1] 3.439028
accuracy(fr, testdat)[,1:5]
## ME RMSE MAE MPE MAPE
## Training set -0.00473511 0.1770653 0.1438523 -0.1102259 1.588409
## Test set -0.33245398 0.3698342 0.3324540 -3.4390277 3.439028
c(me, rmse, mae, mpe, mape)
## [1] -0.3324540 0.3698342 0.3324540 -3.4390277 3.4390277
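The accuracy() output above also reports MASE (mean absolute scaled error), which scales the errors by the training-set mean absolute error of a one-step naive (random walk) forecast, along with ACF1 (the lag-1 autocorrelation of the errors) and Theil's U (which compares the forecast against a naive forecast). As a rough sketch, the non-seasonal MASE can be reproduced by hand like this; the result should match the MASE column in the accuracy() table.
# MASE: scale the test errors by the in-sample MAE of a one-step naive forecast
q <- mean(abs(diff(traindat)))   # scaling factor from the training data
mase <- mean(abs(fr.err))/q
mase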
5.1.2 Test multiple models
Now that you have some metrics for forecast accuracy, you can compute these for all the models in your candidate set.
# The model picked by auto.arima
fit1 <- forecast::Arima(traindat, order=c(0,1,1))
fr1 <- forecast::forecast(fit1, h=2)
test1 <- forecast::accuracy(fr1, testdat)[2,1:5]
# AR-1
fit2 <- forecast::Arima(traindat, order=c(1,1,0))
fr2 <- forecast::forecast(fit2, h=2)
test2 <- forecast::accuracy(fr2, testdat)[2,1:5]
# Naive model with drift; rwf() returns a forecast object directly
fr3 <- forecast::rwf(traindat, h=2, drift=TRUE)
test3 <- forecast::accuracy(fr3, testdat)[2,1:5]
Here is a summary of the test-set metrics for the three models.
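The table below can be assembled from the three test vectors, for example like this (a sketch; comp.models is just a name chosen here, and rounding to 3 digits is assumed):
# Combine the test-set metrics for the three models into one table
comp.models <- rbind(`(0,1,1)`=test1, `(1,1,0)`=test2, Naive=test3)
knitr::kable(comp.models, format="html", digits=3)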
|         | ME     | RMSE  | MAE   | MPE    | MAPE  |
|---------|--------|-------|-------|--------|-------|
| (0,1,1) | -0.293 | 0.320 | 0.293 | -3.024 | 3.024 |
| (1,1,0) | -0.309 | 0.341 | 0.309 | -3.200 | 3.200 |
| Naive   | -0.483 | 0.510 | 0.483 | -4.985 | 4.985 |
5.1.3 Cross-validation
Computing forecast errors and performance metrics with time series cross-validation is similar to the training set/test set approach.
The first step to using the tsCV() function is to define the function that returns a forecast for your model. Your function needs to take x, a time series, and h, the length of the forecast. You can also have other arguments if needed. Here is an example function for a forecast from an ARIMA model.
fun <- function(x, h, order){
  # fit the specified ARIMA model to x and return an h-step forecast
  forecast::forecast(forecast::Arima(x, order=order), h=h)
}
We pass this into the tsCV() function. tsCV() requires our data set and our forecast function. The arguments after the forecast function are those we included in our fun definition. tsCV() returns a time series of errors.
e <- forecast::tsCV(traindat, fun, h=1, order=c(0,1,1))
We can then compute performance metrics from these errors.
tscv1 <- c(ME=mean(e, na.rm=TRUE), RMSE=sqrt(mean(e^2, na.rm=TRUE)), MAE=mean(abs(e), na.rm=TRUE))
tscv1
## ME RMSE MAE
## 0.1128788 0.2261706 0.1880392
Cross-validation farther into the future
Compare the accuracy of forecasts 1 year out versus 4 years out. If h is greater than 1, the errors are returned as a matrix with a column for each h. Column 4 holds the errors for the forecasts made 4 years out.
e <- forecast::tsCV(traindat, fun, h=4, order=c(0,1,1))[,4]
# ME, RMSE and MAE for the 4-year-out forecasts
tscv4 <- c(ME=mean(e, na.rm=TRUE), RMSE=sqrt(mean(e^2, na.rm=TRUE)), MAE=mean(abs(e), na.rm=TRUE))
rbind(tscv1, tscv4)
## ME RMSE MAE
## tscv1 0.1128788 0.2261706 0.1880392
## tscv4 0.2839064 0.3812815 0.3359689
As we would expect, forecast errors are higher when we make forecasts farther into the future.
Cross-validation with a fixed window
Compare the accuracy of 1-year-out forecasts made with a fixed 10-year training window.
e <- forecast::tsCV(traindat, fun, h=1, order=c(0,1,1), window=10)
# ME, RMSE and MAE with a fixed 10-year window
tscvf1 <- c(ME=mean(e, na.rm=TRUE), RMSE=sqrt(mean(e^2, na.rm=TRUE)), MAE=mean(abs(e), na.rm=TRUE))
tscvf1
## ME RMSE MAE
## 0.1387670 0.2286572 0.1942840
All the forecast tests together
Here are all 4 types of forecast tests together. There is no one right approach. Time series cross-validation has the advantage that you test many more forecasts and use all of your data.
comp.tab <- rbind(train.test=test1[c("ME","RMSE","MAE")],
tsCV.variable1=tscv1,
tsCV.variable4=tscv4,
tsCV.fixed1=tscvf1)
knitr::kable(comp.tab, format="html")
|                | ME         | RMSE      | MAE       |
|----------------|------------|-----------|-----------|
| train.test     | -0.2925326 | 0.3201093 | 0.2925326 |
| tsCV.variable1 | 0.1128788  | 0.2261706 | 0.1880392 |
| tsCV.variable4 | 0.2839064  | 0.3812815 | 0.3359689 |
| tsCV.fixed1    | 0.1387670  | 0.2286572 | 0.1942840 |