load("landings.RData")
landings$log.metric.tons = log(landings$metric.tons)
landings = subset(landings, Year <= 1987)
Add t
to the data. This is Year minus first Year.
landings$t = landings$Year-landings$Year[1]
We will look at sardines and a 4th order polynomial. h is the forecast length. h=5 is 5 years.
val <- "Sardine"
poly.order <- 4
h <- 5
Use the poly function
\[log(Sardine catch) = \alpha + \beta t + \beta_2 t^2 + \beta_3 t^3 + \beta_4 t^4 + e_t\]
model <- lm(log.metric.tons ~ poly(t, poly.order), data=landings, subset=Species==val)
Use the predict()
function. First you must set up the newdata data.frame. This is specific to the predict()
function. Let’s predict for the next 5 t after the end of the data.
dat <- subset(landings, Species==val)
dat$t
## [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
The newdata data.frame must have the name the same as your predictor t
and have the values at the t
you want to forecast.
last.t <- max(dat$t)
nd <- data.frame(t = last.t + 1:h)
Now you can predict:
predict(model, newdata = nd)
## 1 2 3 4 5
## 9.241230 9.293302 9.387034 9.532185 9.739329
That is for years
max(dat$Year) + 1:h
## [1] 1988 1989 1990 1991 1992
You can use predict for this too. lwr
is the lower 95% prediction and upr is the upper 95% prediction.
predict(model, newdata = nd, interval="prediction")
## fit lwr upr
## 1 9.241230 8.946917 9.535543
## 2 9.293302 8.881264 9.705340
## 3 9.387034 8.802235 9.971834
## 4 9.532185 8.714969 10.349401
## 5 9.739329 8.623837 10.854821
Show in metric tons instead of log metric tons.
pr <- predict(model, newdata = nd, interval="prediction")
pr <- as.data.frame(exp(pr))
pr$Year <- max(dat$Year) + 1:h
Make table in R markdown
library(knitr)
kable(pr)
fit | lwr | upr | Year |
---|---|---|---|
10313.72 | 7684.167 | 13843.11 | 1988 |
10865.00 | 7195.879 | 16404.98 | 1989 |
11932.66 | 6649.086 | 21414.72 | 1990 |
13796.70 | 6093.444 | 31238.33 | 1991 |
16972.15 | 5562.689 | 51783.21 | 1992 |
pr <- predict(model, newdata = nd, interval="prediction")
pr <- as.data.frame(exp(pr))
ylims <- c(min(dat$metric.tons, pr$fit), max(dat$metric.tons, pr$fit))
pr$Year <- max(dat$Year) + 1:h
plot(dat$Year, dat$metric.tons, xlim=c(min(dat$Year), max(dat$Year)+h),
ylim=ylims)
lines(dat$Year, exp(fitted(model)), col="red")
lines(pr$Year, pr$fit)
Here is a function I created to make forecast plots using ggplot().
You can plot with
plotforecasttv(model)
You model fits the data nicely, but your forecasts are very uncertain as you move farther out. This is a property of very flexible models. They fit the data, but tend to overfit so forecasts are more not less uncertain.
Fit a 1st order polynomial to the sardine data. Create a forecast for 10 years into the future.
Create forecasts for one of the other species.
unique(landings$Species)
## [1] Anchovy Sardine Chub.mackerel Horse.mackerel Mackerel
## [6] Jack.Mackerel
## 6 Levels: Anchovy Sardine Chub.mackerel Horse.mackerel ... Jack.Mackerel