--- title: "Forecasting Recessions with Dynamic Logistic Regression and Dynamic Model Averaging" subtitle: "preliminary analysis" author: "Len Kiefer" date: "2017/10/31" output: xaringan::moon_reader: yolo: FALSE css: mycss.css lib_dir: libs nature: highlightStyle: github highlightLines: true countIncrementalSlides: false --- {r setup, include=FALSE,message=F,warning=F} options(htmltools.dir.version = FALSE) library(tidyverse) library(boot) library(viridis) library(cowplot) library(stargazer) library(lubridate)  # Overview * Motivation * Statistical model(s) * Empirical Excercise * Discussion --- class: inverse, center, middle # Motivation ## Uncertainty and Ambiguity --- class: left, top ### Statistical model(s) Application of *Dynamic Logistic Regression and Dynamic Model Averaging for Binary Classification* [LINK](https://www.ncbi.nlm.nih.gov/pubmed/21838812) We want to predict a binary outcome: $$y_{t} \in (0,1)$$ where 1 (0) indicates recession (no recession). $$y_{t}|L_{t}=k\sim Bernoulli(p_{t}^{(k)})$$ $$Logit(p_{t}^{(k)}) = x^{(k)'}_{t-h}\theta^{(k)}_{t-h}$$ with $k=1,2,\ldots,K$ possible models. --- class: inverse, center, middle ### Each model just your standard S curve {r 10-28-2017,eval=T,echo=F,message=F,warning=F,fig.width=10} library(tidyverse) library(boot) # has useful inv.logit function library(viridis) #extrafont::loadfonts(device="win") # needed for fonts (on windows, not sure about unix/mac) #Create spooky dark theme: theme_spooky = function(base_size = 10, base_family = "Chiller") { theme_grey(base_size = base_size, base_family = base_family) %+replace% theme( # Specify axis options axis.line = element_blank(), axis.text.x = element_text(size = base_size*0.8, color = "white", lineheight = 0.9), axis.text.y = element_text(size = base_size*0.8, color = "white", lineheight = 0.9), axis.ticks = element_line(color = "white", size = 0.2), axis.title.x = element_text(size = base_size, color = "white", margin = margin(0, 10, 0, 0)), axis.title.y = element_text(size = base_size, color = "white", angle = 90, margin = margin(0, 10, 0, 0)), axis.ticks.length = unit(0.3, "lines"), # Specify legend options legend.background = element_rect(color = NA, fill = " gray10"), legend.key = element_rect(color = "white", fill = " gray10"), legend.key.size = unit(1.2, "lines"), legend.key.height = NULL, legend.key.width = NULL, legend.text = element_text(size = base_size*0.8, color = "white"), legend.title = element_text(size = base_size*0.8, face = "bold", hjust = 0, color = "white"), legend.position = "none", legend.text.align = NULL, legend.title.align = NULL, legend.direction = "vertical", legend.box = NULL, # Specify panel options panel.background = element_rect(fill = " gray10", color = NA), #panel.border = element_rect(fill = NA, color = "white"), panel.border = element_blank(), panel.grid.major = element_line(color = "grey35"), panel.grid.minor = element_line(color = "grey20"), panel.spacing = unit(0.5, "lines"), # Specify facetting options strip.background = element_rect(fill = "grey30", color = "grey10"), strip.text.x = element_text(size = base_size*0.8, color = "white"), strip.text.y = element_text(size = base_size*0.8, color = "white",angle = -90), # Specify plot options plot.background = element_rect(color = " gray10", fill = " gray10"), plot.title = element_text(size = base_size*1.2, color = "white",hjust=0,lineheight=1.25, margin=margin(2,2,2,2)), plot.subtitle = element_text(size = base_size*1, color = "white",hjust=0, margin=margin(2,2,2,2)), plot.caption = element_text(size = base_size*0.8, color = "white",hjust=0), plot.margin = unit(rep(1, 4), "lines") ) } mytheta <- c(-2,-1,0,1,2) #Possible values of theta df.m<-expand.grid(x=seq(-3,3,.01),mytheta=mytheta) %>% mutate(y=inv.logit(mytheta*x)) ggplot(data=df.m, aes(x=x,y=y,color=factor(mytheta), group=mytheta ))+ geom_line(size=1.05)+theme_spooky(base_family="Arial",base_size=20)+ #geom_text(data=filter(df.m,x==max(df.m$x)),size=5,fontface="bold")+ #scale_color_discrete(name=expression(theta))+ scale_color_viridis(discrete=T,option="C",name=expression(theta))+ theme(legend.position="top",legend.direction="horizontal")+ labs(title=expression(paste("Probability Y=1 given different values of x and ",theta)), subtitle=expression(paste("Logit(P(Y = 1 | x)) = x'",theta)), y="P(Y = 1 | x)")  --- class: left, top ### Suppose k=2 $$Logit(P(y_{t}=1|x,L=1))= x_{1t}^{'}\theta_{t-h}^{(1)}$$ and $$Logit(P(y_{t}=1|x,L=2))=x_{2t}^{'}\theta_{t-h}^{(2)}$$. Parameters might be unstable and drift over time. A simple model of drift would be: $$\theta_{t}^{(k)}=\theta_{t-1}^{(k)}+\delta^{(k)}_{t},\ \ \ \delta^{(k)}_{t} \sim N(0,\sigma^{2})$$ Suppose we have a way to estimate each model, how do we combine the results from$K$potential models? --- class: left, top ### Doing in the hard way Specify: $$P(L_{t}=k|Y^{t-1}) = \sum_{\ell=1}^{K}p(L_{t-1}=\ell|Y^{t-1})p(L_{t}=k|L_{t-1}=\ell)$$ but$p(L_{t}=k|L_{t-1}=\ell)$requires a$K\times K$transition matrix. --- class: left, top ### Generally a hard problem to solve so forget about it - Simplify via *forgetting factors* + See [this paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2895940/) *Online Prediction Under Model Uncertainty via Dynamic Model Averaging: Application to a Cold Rolling Mill* + See [this paper](https://www.ncbi.nlm.nih.gov/pubmed/21838812) *Dynamic Logistic Regression and Dynamic Model Averaging for Binary Classification* + See [this paper](http://onlinelibrary.wiley.com/doi/10.1111/j.1468-2354.2012.00704.x/pdf) *Forecasting Inflation Using Dynamic Model Averaging* + [R](https://www.r-project.org/) package [dma](https://CRAN.R-project.org/package=dma) --- class: left, to ### Forgetting instead of this: $$p(L_{t}=k|L_{t-1}=\ell)$$ use: $$P(L_{t}=k | Y^{t-1})= \frac{P(L_{t-1}=k | Y^{t-1})^{\alpha} }{\sum_{j=1}^{K}P(L_{t-1}=j | Y^{t-1})^{\alpha}}$$ Instead of specifying a full$K \times K$transition Matrix P for model probabilities, probabilities are updated using *forgetting*, parameterized by$0<\alpha\leq1$. The parameter$\alpha$determines how much weight is put on recent models. With$\alpha=1$there is no forgetting, but$\alpha<1puts less weight on observations in the distant past. --- class: inverse, center, middle # Empirical example ### Forecasting recessions --- class: top, left ### Forecasting recessions {r 10-30-2017-plot-3,eval=T,echo=F,message=F,warning=F,fig.width=10} load("data/dma_oct2017.Rdata") # employment plot g.emp<- ggplot(data=df4, aes(x=date,y=PAYEMS))+ geom_rect(data=recessions.df, inherit.aes=FALSE, aes(xmin=Peak, xmax=Trough, ymin=-Inf, ymax=+Inf), fill='lightblue', alpha=0.5)+theme_minimal()+geom_line(color="royalblue")+ labs(x="",y="",title="Employment growth", subtitle="3-month % change", caption="@lenkiefer Source: U.S. Bureau of Labor Statistics, Federal Reserve (H.15), shaded area NBER recessions.\nretrieved from FRED, Federal Reserve Bank of St. Louis, October 26, 2017.")+ geom_hline(yintercept=0,color="black")+ theme(plot.caption=element_text(hjust=0), plot.subtitle=element_text(face="italic",size=9), plot.title=element_text(face="bold",size=14)) # yield curve plot g.slope<- ggplot(data=df4, aes(x=date,y=SLOPE))+ geom_rect(data=recessions.df, inherit.aes=FALSE, aes(xmin=Peak, xmax=Trough, ymin=-Inf, ymax=+Inf), fill='lightblue', alpha=0.5)+theme_minimal()+geom_line(color="royalblue")+ labs(x="",y="",title="Yield curve slope", subtitle="10-year minus 3-month U.S. Treasury rates", caption="")+ geom_hline(yintercept=0,color="black")+ theme(plot.caption=element_text(hjust=0), plot.subtitle=element_text(face="italic",size=9), plot.title=element_text(face="bold",size=14)) # combined plot (uses cowplot::plot_grid) plot_grid(g.slope,g.emp,ncol=1)  --- class: top, left ### Static regression {r 10-26-2017-reg-glm-1, echo=F,message=F,warning=F,results='asis'} # forecast contemporaneously (h=0) glm.h0<- glm(USREC ~ SLOPE + PAYEMS,family=binomial(link='logit'), data=df4) # forecast 12 months ahead (h=12) glm.h1<- glm(REC12 ~ SLOPE + PAYEMS,family=binomial(link='logit'), data=df4) stargazer::stargazer(glm.h0,glm.h1, title="Forecasting Recessions", type="html", dep.var.caption = "Recession probability", dep.var.labels = c("contemporaneous","forecast"), column.labels = c("(h=0)", "(h=12)"), notes.align="l", omit.stat=c("aic","ll"), notes="Dependent variable is a 0/1 recession indicator. PAYEMS is 3-month % change in payroll employment. SLOPE is the slope of the U.S. Treasury yield curve (10-year minus 3-month Treasury rates)." )  --- class: left, middle ### Dynamic regressions (h=0) {r 10-30-2017-plot-h0-1,eval=T,echo=F,message=F,warning=F,fig.width=10} # plot, but exclude first 15 years ggplot(data=filter(df.plot,date>min(df.plotdate)+years(15)), aes(x=date,y=est,color=var, fill=var))+ geom_hline(yintercept=0,color="black")+ geom_ribbon(aes(ymin=est-2*v, ymax=est+2*v),alpha=.1, color=NA)+ geom_ribbon(aes(ymin=est-1*v, ymax=est+1*v),alpha=.25, color=NA)+ geom_line(aes(y=est-2*v),linetype=2)+ geom_line(aes(y=est+2*v),linetype=2)+ geom_line()+ guides(color=F,fill=F)+ theme_minimal()+ facet_grid(var~model,scales="free_y")+ labs(x="date",y=expression(paste("Coefficient estimate",theta[t-h])), title="Dynamic Logistic Regression (h=0)", subtitle=expression(paste("Logit(P(Y = 1 | Model=k,x)) = ",x[t-h],"'",theta[t-h])), caption="@lenkiefer Dynamic logistic regression coefficients for three models.\nSolid line coefficient estimates, shaded area +/- 1 & 2 standard errors.\nModel 1: Yield curve slope (10-year minus 3-month U.S. Treasury yield)\nModel 2: Employment change (3 month %)\nModel 3: Yield curve slope and employment change\nModel fit with dma: Tyler H. McCormick, Adrian Raftery and David Madigan (2017).\ndma: Dynamic Model Averaging. R package version 1.3-0. https://CRAN.R-project.org/package=dma")+ theme(plot.caption=element_text(hjust=0), plot.subtitle=element_text(face="italic",size=9), plot.title=element_text(face="bold",size=14))  --- class: top, left ### Model probabilties (h=0) {r 10-30-2017-plot-h0-2,eval=T,echo=F,message=F,warning=F,fig.width=10} df.plot2<-data.frame(t(dma.fit0$pmp), date=df5$date) ggplot(data=gather(df.plot2,Model,p,-date), aes(x=date,y=p,color=Model))+ geom_line()+theme_minimal()+ theme(plot.caption=element_text(hjust=0),legend.position="top", plot.subtitle=element_text(face="italic",size=9), plot.title=element_text(face="bold",size=14))+ labs(x="date",y="Model Probabilties", title="Dynamic Logistic Regression (h=0)", subtitle="Model Probabilities", caption="@lenkiefer \nModel 1: Yield curve slope (10-year minus 3-month U.S. Treasury yield)\nModel 2: Employment change (3 month %)\nModel 3: Yield curve slope and employment change\nModel fit with dma: Tyler H. McCormick, Adrian Raftery and David Madigan (2017).\ndma: Dynamic Model Averaging. R package version 1.3-0. https://CRAN.R-project.org/package=dma")  --- class: left, middle ### Dynamic regressions (h=12) {r 10-30-2017-plot-h12-1,eval=T,echo=F,message=F,warning=F,fig.width=10} ggplot(data=filter(df.plot12,date>min(df.plot12$date)+years(15)), aes(x=date,y=est,color=var, fill=var))+ geom_hline(yintercept=0,color="black")+ geom_ribbon(aes(ymin=est-2*v, ymax=est+2*v),alpha=.1, color=NA)+ geom_ribbon(aes(ymin=est-1*v, ymax=est+1*v),alpha=.25, color=NA)+ geom_line(aes(y=est-2*v),linetype=2)+ geom_line(aes(y=est+2*v),linetype=2)+ geom_line()+ guides(color=F,fill=F)+ theme_minimal()+ facet_grid(var~model,scales="free_y")+ labs(x="date",y=expression(paste("Coefficient estimate",theta[t-h])), title="Dynamic Logistic Regression (h=12)", subtitle=expression(paste("Logit(P(Y = 1 | Model=k,x)) = ",x[t-h],"'",theta[t-h])), caption="@lenkiefer Dynamic logistic regression coefficients for three models.\nSolid line coefficient estimates, shaded area +/- 1 & 2 standard errors.\nModel 1: Yield curve slope (10-year minus 3-month U.S. Treasury yield)\nModel 2: Employment change (3 month %)\nModel 3: Yield curve slope and employment change\nModel fit with dma: Tyler H. McCormick, Adrian Raftery and David Madigan (2017).\ndma: Dynamic Model Averaging. R package version 1.3-0. https://CRAN.R-project.org/package=dma")+ theme(plot.caption=element_text(hjust=0), plot.subtitle=element_text(face="italic",size=9), plot.title=element_text(face="bold",size=14))  --- class: top, left ### Model probabilities (h=12) {r 10-30-2017-plot-h12-2,eval=T,echo=F,message=F,warning=F,fig.width=10} # plot model probabilities df.plot12.2<-data.frame(t(dma.fit12$pmp), date=df12\$date) ggplot(data=gather(df.plot12.2,Model,p,-date), aes(x=date,y=p,color=Model))+ geom_line()+theme_minimal()+ theme(plot.caption=element_text(hjust=0),legend.position="top", plot.subtitle=element_text(face="italic",size=9), plot.title=element_text(face="bold",size=14))+ labs(x="date",y="Model Probabilties", title="Dynamic Logistic Regression (h=12)", subtitle="Model Probabilities", caption="@lenkiefer \nModel 1: Yield curve slope (10-year minus 3-month U.S. Treasury yield)\nModel 2: Employment change (3 month %)\nModel 3: Yield curve slope and employment change\nModel fit with dma: Tyler H. McCormick, Adrian Raftery and David Madigan (2017).\ndma: Dynamic Model Averaging. R package version 1.3-0. https://CRAN.R-project.org/package=dma")  --- class: top, left ### Combining forecasts {r 10-30-2017-plot-end,eval=T,echo=F,message=F,warning=F,fig.width=10} g.12<- ggplot(data=df6, aes(x=date+months(12),y=yhat12))+ geom_rect(data=recessions.df, inherit.aes=FALSE, aes(xmin=Peak, xmax=Trough, ymin=-Inf, ymax=+Inf), fill='lightblue', alpha=0.5)+theme_minimal()+ geom_hline(yintercept=0.5,color="black",linetype=2,size=0.75)+ geom_line(color="royalblue",size=1.05)+ scale_y_continuous(labels=scales::percent)+ labs(x="",y="recession probability", title="Estimated U.S. recession probabilities given by dynamic model averaging", subtitle="12 months ahead", caption="@lenkiefer Recession probabilities based on dynamic model averaging of three models\nforecasting recession 0 and 12-months ahead.\nModel 1: Yield curve slope (10-year minus 3-month U.S. Treasury yield)\nModel 2: Employment change (3 month %)\nModel 3: Yield curve slope and employment change\nModel fit with dma: Tyler H. McCormick, Adrian Raftery and David Madigan (2017).\ndma: Dynamic Model Averaging. R package version 1.3-0. https://CRAN.R-project.org/package=dma")+ theme(plot.caption=element_text(hjust=0), plot.subtitle=element_text(face="italic",size=9), plot.title=element_text(face="bold",size=14)) g.0<- ggplot(data=df6, aes(x=date,y=yhat0))+ geom_rect(data=recessions.df, inherit.aes=FALSE, aes(xmin=Peak, xmax=Trough, ymin=-Inf, ymax=+Inf), fill='lightblue', alpha=0.5)+theme_minimal()+ geom_hline(yintercept=0.5,color="black",linetype=2,size=0.75)+ geom_line(color="red",size=1.05)+ scale_y_continuous(labels=scales::percent)+ labs(x="",y="recession probability", title="Estimated U.S. recession probabilities given by dynamic model averaging", subtitle="0 months ahead", caption="@lenkiefer Recession probabilities based on dynamic model averaging of three models\nforecasting recession 0 and 12-months ahead.\nModel 1: Yield curve slope (10-year minus 3-month U.S. Treasury yield)\nModel 2: Employment change (3 month %)\nModel 3: Yield curve slope and employment change\nModel fit with dma: Tyler H. McCormick, Adrian Raftery and David Madigan (2017).\ndma: Dynamic Model Averaging. R package version 1.3-0. https://CRAN.R-project.org/package=dma")+ theme(plot.caption=element_text(hjust=0), plot.subtitle=element_text(face="italic",size=9), plot.title=element_text(face="bold",size=14)) g.rec<- plot_grid(g.0+labs(caption=""),g.12+labs(title=""), ncol=1) g.rec  --- class: inverse, middle,center # Discussion ###How could we use this?