The Linear Probability Model (LPM) might be bad, but is it all bad? Let’s look at some conditions where the LPM might not be so bad. We’ll also look at some simple adjustments that might improve the performance of the LPM. We’ll also compare the LPM to some common alternatives. Setup Throughout most of this post, we’re going to consider a world where the LPM model is the true model. That is:
I try not to use too much jargon (jargon monoxide can be deadly) on this blog. But I’ve got a bit of a technical term I’ve been using the describe U.S. residential construction: super-low. To be sure, housing construction has been grinding higher, but it’s been taking a while for activity to get back close to historical averages. Once you account for the larger population, which all else equal needs more housing units, the level of construction is quite low.
I think a lot about predicting/forecasting binary outcomes. Will the economy head into a recession next year? What’s the likelihood of a loan defaulting over the next few years? Will my followers on social media abandon me if I tweet about my lunch? One often maligned, but seemingly irresitable approach to modeling binary ourcomes is the Linear Probability Model (LPM). As is known going back to before I was born, the Linear Probability Model has some issues.
I am headed out west, to California to talk housing at the Western Secondary Market Conference. After my talk they might post my slides online somewhere. If they do I’ll link to them, but for now you can get a preview in this twitter thread. Like many western states, California is facing a imbalance between housing supply and housing demand. Strong economic growth has bolstered demand, but supply has not kept up.
I decided to switch over my blog theme. The Ghostwriter theme I used was nice, but it didn’t have a blog archive. As the number of posts grow a blog archive is easier to search. We still have tags you can search. I’ve adopted the Hugo Blackburn theme. This is the same theme used over at the Simply Statistics blog. If you drop by that blog, check out this essay by Roger Peng with some perspective on the evolution of R.
Let’s compare two charts. “Your chart”, or a chart that might come virtually unedited from spreadsheet software versus the chart your boss told you not to worry about: Your chart is perfectly serviceable and for a quick exploration might be perfectly fine. However, why routinely generate such charts if you have the ability to make something a bit more dynamic? Being able to produce more interesting charts might not be necessary, but it also probably doesn’t hurt.
In this post I want to share some observations on housing in the United States from 1980 to 2016, share some R code for data wrangling, and tri (no that’s not a typo, just a pun) out a visualization techniques. Let’s get to it. I’ve been carrying a running conversation with folks on Twitter regarding the U.S. housing market and its future. Much of that depends on the evolution of demographic forces.
In this post we will create some plots of house prices and incomes for the United States and individual states. We will also try out the bea.R package to get data from the U.S. Bureau of Economic Analysis. We’ll end up with something like this: Per usual we’ll do it with R and I’ll include code so you can follow along. Data We’re going to use two sources of data. First, we’ll get the FHFA house price index and then we’ll get per capita income estimates from the United States Bureau of Economic Analysis (BEA).
As an economist with a background in econometrics and forecasting I recognize that predictions are often (usually?) an exercise in futility. Forecasting, after all, is hard. While non-economists have great fun pointing this futility out, many critics miss out on why it’s so hard. There are at least two reasons why forecasting is hard. The first, the unknown future, is pretty well understood. Empirical regularities with much forecasting power in the social sciences are hard to come by and are rarely stable.
In the real world, when I give talks and use slides I am typically constrained in my aesthetic. Often I’m speaking at a work-related thing and we have a corporate template and color scheme. They serve us well and I’ve found restraint helps focus on the message. Usually I’m setting out to inform, so direct, repeatable and easy to follow are key. But I also like to explore new ideas and different themes on the side.