Maybe you are of the opinion that charts should have their y axis extend all the way down to 0, even if the data live far away from zero. I’m not sure if that’s always the right thing to do. But if you are strict about this, how can you use the space?
One thing I experimented with in my Mortgage rates in the 21st century post was filling the area under a line with progressively fainter shading.
In this post I want to share updated plots comparing house price trends around the world. Or at least part of the world. Our view will be somewhat limited, based on data, but will at least allow us to see how U.S. house prices compare to a few other countries.
These plots are explained in more detail in this post, but some of the images were lost due to my blog transition.
This tweet turned out to be popular:
👀house price trends👀 pic.twitter.com/JXB5P0H84A
— 📈 Len Kiefer 📊 (@lenkiefer) August 1, 2018
It’s a remix of a chart we made here, though it uses a different index. In the earlier post, we used the FHFA house price index, but this one used the Case-Shiller Index, which was released today.
Let me just post two gifs and then below will be the R code I used to create them.
Let’s pick up where we left off yesterday and do some more exploration with text mining.
Like yesterday, we’ll use the tidytext package for R. And we’ll lean heavily on Julia Silge and David Robinson’s Text Mining with R.
Data
We’ll turn again to the Federal Reserve for our text data. But today we’ll explore the Beige Book, which gathers anecdotal information on current economic conditions across the Federal Reserve Districts.
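As a rough sketch of the kind of first step tidytext makes easy, the snippet below tokenizes a couple of sentences and counts the words that remain after dropping stop words. The two sentences are made-up placeholders written for this illustration, not actual Beige Book text, and the object names (`reports`, `word_counts`) are hypothetical.

```r
library(dplyr)
library(tidytext)

# Placeholder sentences (NOT real Beige Book text), one row per district
reports <- tibble(
  district = c("A", "B"),
  text = c(
    "Economic activity expanded at a modest pace.",
    "Housing activity was mixed, while manufacturing activity expanded."
  )
)

word_counts <- reports %>%
  unnest_tokens(word, text) %>%          # one lowercase word per row
  anti_join(stop_words, by = "word") %>% # drop common stop words
  count(word, sort = TRUE)               # most frequent words first

word_counts
```

With real data, the same pipeline would run on the scraped report text instead of the placeholder tibble.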
Text mining is an exciting topic. There is tremendous potential to gain insights from textual analysis. See for example Gentzkow, Kelly and Taddy’s Text as Data. While text mining may be quite advanced in other fields, in finance and economics the application of these techniques is still in its infancy.
In order to take advantage of text as data, economists and financial analysts need tools to help them. Fortunately, there is a great resource: Text Mining with R by Julia Silge (blog and on Twitter @juliasilge) and David Robinson (blog and on Twitter @drob).
Indications are that U.S. housing market activity in the middle part of 2018 has moderated. Home sales estimates for both new home sales and existing home sales declined on a seasonally adjusted basis in June relative to May. House price growth has also moderated recently. Some folks have gotten animated about the recent trends.
I’m more sanguine about the recent data. Certainly a slowdown in housing market activity would be cause for concern.
I saw today, via rOpenSci, a blog post about a new package for making animated gifs with R called gifski, now available on CRAN.
Let’s adapt the code we shared last week to use the gifski package. See that post for additional details.
If we run the R code below we’ll generate this animated plot:
This plot shows the evolution of house prices in two states, California (CA) and Texas (TX) versus the United States (USA).
On Twitter Claus Wilke asks:
Dear Lazyweb: Is there an accepted name for a plot showing a two-variable time series as a path in the x-y plane? #dataviz @Elijah_Meeks @albertocairo @lenkiefer @sharoz @dataandme pic.twitter.com/N8Edmf8qii
— Claus Wilke (@ClausWilke) July 21, 2018
I call them connected scatterplots, and we’ve made a few here. See for example this post.
But we can intensify things and make a plot like this:
hey @ClausWilke why stop at a 2-d connected scatterplot* when you could go to 3-d
The Linear Probability Model (LPM) might be bad, but is it all bad? Let’s look at some conditions where the LPM might not be so bad. We’ll also look at some simple adjustments that might improve the performance of the LPM. We’ll also compare the LPM to some common alternatives.
Setup
Throughout most of this post, we’re going to consider a world where the LPM is the true model.
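To make that concrete, here is a minimal simulation sketch of a world where the LPM holds exactly: the probability of the outcome is linear in a single covariate. The intercept (0.2), slope (0.5), sample size, and the uniform covariate are assumptions chosen for illustration so that the probability stays inside [0, 1]; they are not the values used in the post.

```r
set.seed(123)

n <- 10000
x <- runif(n)                       # covariate on [0, 1] (assumed)
p <- 0.2 + 0.5 * x                  # true probability is linear in x, in [0.2, 0.7]
y <- rbinom(n, size = 1, prob = p)  # binary outcome drawn with probability p

# Ordinary least squares on the binary outcome is the LPM
fit <- lm(y ~ x)
coef(fit)
```

Because the true probability is linear here, OLS should recover coefficients close to the assumed intercept and slope, and the fitted values stay between 0 and 1.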
I try not to use too much jargon (jargon monoxide can be deadly) on this blog. But I’ve got a bit of a technical term I’ve been using to describe U.S. residential construction: super-low.
To be sure, housing construction has been grinding higher, but it’s been taking a while for activity to get back close to historical averages. Once you account for the larger population, which all else equal needs more housing units, the level of construction is quite low.