I think a lot about predicting/forecasting binary outcomes. Will the economy head into a recession next year? What’s the likelihood of a loan defaulting over the next few years? Will my followers on social media abandon me if I tweet about my lunch?
One often maligned, but seemingly irresitable approach to modeling binary ourcomes is the Linear Probability Model (LPM). As is known going back to before I was born, the Linear Probability Model has some issues.
I am headed out west, to California to talk housing at the Western Secondary Market Conference. After my talk they might post my slides online somewhere. If they do I’ll link to them, but for now you can get a preview in this twitter thread.
Like many western states, California is facing a imbalance between housing supply and housing demand. Strong economic growth has bolstered demand, but supply has not kept up.
I decided to switch over my blog theme. The Ghostwriter theme I used was nice, but it didn’t have a blog archive. As the number of posts grow a blog archive is easier to search. We still have tags you can search.
I’ve adopted the Hugo Blackburn theme. This is the same theme used over at the Simply Statistics blog. If you drop by that blog, check out this essay by Roger Peng with some perspective on the evolution of R.
Let’s compare two charts. “Your chart”, or a chart that might come virtually unedited from spreadsheet software versus the chart your boss told you not to worry about:
Your chart is perfectly serviceable and for a quick exploration might be perfectly fine. However, why routinely generate such charts if you have the ability to make something a bit more dynamic? Being able to produce more interesting charts might not be necessary, but it also probably doesn’t hurt.
In this post I want to share some observations on housing in the United States from 1980 to 2016, share some R code for data wrangling, and tri (no that’s not a typo, just a pun) out a visualization techniques.
Let’s get to it.
I’ve been carrying a running conversation with folks on Twitter regarding the U.S. housing market and its future. Much of that depends on the evolution of demographic forces.
In this post we will create some plots of house prices and incomes for the United States and individual states. We will also try out the bea.R package to get data from the U.S. Bureau of Economic Analysis.
We’ll end up with something like this:
Per usual we’ll do it with R and I’ll include code so you can follow along.
Data
We’re going to use two sources of data.
As an economist with a background in econometrics and forecasting I recognize that predictions are often (usually?) an exercise in futility. Forecasting, after all, is hard. While non-economists have great fun pointing this futility out, many critics miss out on why it’s so hard.
There are at least two reasons why forecasting is hard. The first, the unknown future, is pretty well understood. Empirical regularities with much forecasting power in the social sciences are hard to come by and are rarely stable.
In the real world, when I give talks and use slides I am typically constrained in my aesthetic. Often I’m speaking at a work-related thing and we have a corporate template and color scheme. They serve us well and I’ve found restraint helps focus on the message. Usually I’m setting out to inform, so direct, repeatable and easy to follow are key.
But I also like to explore new ideas and different themes on the side.
IN MY LINE OF WORK, (finance/economics) you see a lot of dual axis line charts. I am of the opinion that dual y axis charts are sort of evil. But in this post I’m going to make one. It’s for a totally legit reason though.
Like in an earlier post we’ll make a graph similar to one I saw on xenographics. Xenographics gives examples of and links to “weird, but (sometimes) useful charts”.
TODAY WAS JOBS FRIDAY. LET’s create a couple plots to show the trend in employment growth.
Each month the U.S. Bureau of Labor Statistics (BLS) releases its employment situation report. Let’s make a couple plots looking at trends in U.S. nonfarm payrolls.
Per usual, let’s make a graph with R.
Data
We can easily get the data via the Saint Louis Federal Reserve’s FRED database. If you followed my post from back in April of last year you know what we can do if we combine FRED with the quantmod package.