As an economist and all-around friend of strictly positive numbers I often use the log function. The natural logarithm of course, need I specify it? Apparently in certain spreadsheet software you do.
In this note I just wanted to write down a couple of observations about how to generate mean or median forecasts of a variable \(y\) given the model is fit in \(log(y)\). Of course, I am going to borrow heavily from Rob Hyndman’s blog, where he coverse this.

Mortgage interest rates have moved about a percentage point lower from where they were a year ago. The housing market seems to have responded favorably.
On my way into D.C. the other day to do some business, I joined a Twitter exchange originally between [at]Graykimbrough and Adam Ozimek, [at]ModeledBehavior about the effects of Federal Reserve interest policy on the housing market.
Seems unlikely housing market was slowed by trade war.

I’ve got a new working paper with Hua Kiefer (FDIC) and Diana Wei (OCC) that studies the dynamics of house prices and foreclosure rates across space and time. We estimate a model using a panel of state/quarters where nearby states influence one another.
Link to paper (pdf): What Happens in Vegas Doesn’t Always Stay in Vegas Note Updated May 17, 2019
I’m giving a talk on this paper at the American Real Estate and Urban Economics National Conference later this month.

The current economic expansion is set to enter its tenth year this summer. Assuming we make it to June, this will become the longest U.S. economic expansion in recorded history stretching back to the 19th century. But how is the housing market doing? After a decade of recovery housing market activity still has room for improvement, but trends in 2018 were negative. Home sales, housing construction and house price growth all declined in 2018.

My recent economic and housing market talks see for example here have been titled: “Will the U.S. housing market get back on track in 2019?”. My general conclusion has been cautiously optimistic. There is enough strength in the broader economy and enough of a tailwind from demographic forces to push the U.S. housing market to modest growth next year.
I still think that’s true, but as I have said in my talks, risks are weighted to the downside.

Let’s pick up where we left off yesterday and do some more exploration with text mining.
Like yesterday we’ll use the tidytext package for R. And we’ll lean heavily on Julie Silge and David Robinson’s Text Mining with R.
Data
We’ll turn again to the Federal Reserve for our text data. But today we’ll explore the Beige Book, which gathers anecdotal information on current economic conditions across the Federal Reserve Districts.

The Linear Probability Model (LPM) might be bad, but is it all bad? Let’s look at some conditions where the LPM might not be so bad. We’ll also look at some simple adjustments that might improve the performance of the LPM. We’ll also compare the LPM to some common alternatives.
Setup
Throughout most of this post, we’re going to consider a world where the LPM model is the true model.

I think a lot about predicting/forecasting binary outcomes. Will the economy head into a recession next year? What’s the likelihood of a loan defaulting over the next few years? Will my followers on social media abandon me if I tweet about my lunch?
One often maligned, but seemingly irresitable approach to modeling binary ourcomes is the Linear Probability Model (LPM). As is known going back to before I was born, the Linear Probability Model has some issues.

LAST WEEK IN THE WALL STREET JOURNAL an article LINK talked about how pundits can strategically make probabilistic forecasts. It seems 40% is a sort of magic number, where it’s high enough that if the event comes true you can claim credit as a forecaster, but if it doesn’t happen, you still gave it less than 50/50 odds.
Since I’m often asked to make forecasts I’m interested in this problem.

WE ARE LATE FOR HALLOWEEN, but let’s get out our broom and purrr as we tidy some statistical results.
Today I had occasion to be reminded of competing risks and a handy statistical result on competing risks from A.P. Basu and J.K. Ghosh published in the Journal of Multivariate Analysis in 1978. The paper Identifiability of the multinormal and other distributions under competing risks model showed an analytical result on the distribution of a variable Z which is the minimum of two Gaussian (Normal) random variables.

I PUT TOGETHER SOME SLIDES SUMMARIZING our recent work on dynamic model averaging.
See here and here for more blah blah blah.
See below for some slides.
Click here for a fullscreen version here.
Making the Preso
Let me also share with you the R code I used to generate these slides. The code below is the Rmarkdown I used to generate the slides (saved as .txt). The slides were put together using the xaringan package.

BACK WE GO INTO THE VASTY DEEP. LAST TIME we introduced the idea of using dynamic model averaging to forecast recessions. I was so excited about the new approach that I didn’t take the time to break down what was going on with it. In this post we’ll look more closely at what’s happening with the dma packaged when we try to forecast recessions.
Per usual we’ll do it with R and I’ll include code so you can follow along.

HERE THE LITERATURE IS VASTY DEEP. In this post we’ll dip our toes, every so slightly, into the dark waters of macroeconometric forecasting. I’ve been studying some techniques and want to try them out. I’m still at the learning and exploring stage, but let’s do it together.
In this post we’ll conduct an exercise in forecasting U.S. recessions using several approaches. Per usual we’ll do it with R and I’ll include code so you can follow along.

BACK IN JANUARY WE LOOKED AT HOUSING microdata from the American Community Survey Public Microdata that we collected from IPUMS. Let’s pick back up and look at these data some more. Glad you could join us.
Be sure to check out my earlier post for more discussion of the underlying data. Here we’ll pick up where we left off and make some more graphs using R.
Just a quick reminder (read the earlier post for all the details), we have a dataset that includes household level observations for the 20 largest metro areas in the United States for 2010 and 2015 (latest data available).

THIS PAST MONTH HAS BEEN BUSY. People have been traveling, I’ve been traveling, kids have been sick, and we’ve had the March Madness basketball keeping me occupied. Today I wanted to just explore a little analysis I’ve put together on resampling.
Because reasons I’ve recently been interested in sample sizes and how quickly certain estimates might converge.
There is of course, a vast literature on this topic. But armed with powerful computers maybe we can avoid too much mathy work and try to simulate our way through some problems.