Real House Price Trends

Analyzing real house price trends with R

Let us take a look at house price trends in the United States and across states and metro areas.

Earlier this week I tweeted out a few charts on housing market trends.

In this post we’ll analyze real house prices since 1975, and per usual use R to wrangle data and make plots. This code will replicate most of the charts I shared. In some of my tweets I used Tableau for maps (it was close at hand), but we’ll do it all with R here.

Code for the plots will be hidden in the details tabs.

Setup

Let’s review what we need to do to get started.

Mapping

To make maps, we’ll use the Urban Institute’s urbnmapr package library. When we will map metro (MSA) data, we’ll need to get the counties for each MSA via Census. The Freddie Mac House Price Index hasn’t yet been changed to the latest CBSA definitions so we will need to use older definitions. Though I work at Freddie Mac, I don’t work directly on the house price index.

Data

We will use two sources of data. We will get house prices from the Freddie Mac House Price Index available through May 2019. We will deflate the price indices by the U.S. Bureau of Labor Statistics, Consumer Price Index for All Urban Consumers: All items less shelter [CUUR0000SA0L2], which we can get via FRED.

Load Libraries, Wrangle Data

#####################################################################################
## Load libraries ----
#####################################################################################
suppressPackageStartupMessages({
library(data.table)
library(tidyverse)
library(urbnmapr)
library(colorblindr)
library(geofacet)
library(readxl)
library(httr)}
)

#####################################################################################
# wrangle data ----
#####################################################################################

# download house price data
dt <- fread("http://www.freddiemac.com/fmac-resources/research/docs/fmhpi_master_file.csv")
# create date
dt[,date:=as.Date(ISOdate(Year,Month,1))]
# get cpi data (via FRED)
df_cpi <- tidyquant::tq_get("CUUR0000SA0L2",get="economic.data",from="1975-01-01")
dt <- merge(dt,df_cpi,by="date")

# crease real house price index by dividing by first observation (Jan 1975) and scaling by 100
dt[, ":="(real_hpi=100*Index_SA/first(Index_SA) / (price/first(price))),
   by=c("GEO_Name","GEO_Type") # make sure to use by
   ]

dt2 <- 
  dt[date==max(date) & GEO_Type=="State",]  %>% 
  left_join(states,by=c("GEO_Name"="state_abbv")) %>%
  data.table()

# get delineation file (use April 2018 version)
url1 <- "https://www2.census.gov/programs-surveys/metro-micro/geographies/reference-files/2018/delineation-files/list1.xls"
GET(url1, write_disk(tf <- tempfile(fileext = ".xls")))
df <- read_excel(tf,skip=2) #read in data from third row
# get rid of nasty spacesess in colnames
colnames(df) <- gsub('([[:punct:]])|\\s+','_',names(df))

# merge data 
dt[date==max(date) & GEO_Type=="CBSA",] %>% 
  left_join(df, by=c("GEO_Code"="CBSA_Code")) %>% 
  # on right assignment
  # http://www.win-vector.com/blog/2016/12/the-case-for-using-in-r/comment-page-1/
  data.table() -> dt3  

Other Packages

I’ll also use the recently updated cowplot package and the colorblindr package for the scale_color_OkabeIto and scale_fill_OkabeIto functions. Both of these packages are via Claus Wilke, the author of the excellent Fundamentals of Data Visualization.

If you have trouble with installing colorblindr, you can drop the scale_color_OkabeIto and replace them with something else and the rest of the code should work.

Custom Color Scales

Finally, we’ll create a custom color scale as described by [at]drsimonj in this post Creating corporate colour palettes for ggplot2.

Code for custom color scales

# Function for colors ----
# adapted from https://drsimonj.svbtle.com/creating-corporate-colour-palettes-for-ggplot2
#####################################################################################
## Make Color Scale ----  ##
#####################################################################################
my_colors <- c(
  "green"      = rgb(103,180,75, maxColorValue = 256),
  "green2"      = rgb(147,198,44, maxColorValue = 256),
  "lightblue"  =  rgb(9, 177,240, maxColorValue = 256),
  "lightblue2" = rgb(173,216,230, maxColorValue = 256),
  'blue'       = "#00aedb",
  'red'        = "#d11141",
  'orange'     = "#f37735",
  'yellow'     = "#ffc425",
  'gold'       = "#FFD700",
  'light grey' = "#cccccc",
  'purple'     = "#551A8B",
  'dark grey'  = "#8c8c8c")


my_cols <- function(...) {
  cols <- c(...)
  if (is.null(cols))
    return (my_colors)
  my_colors[cols]
}


my_palettes <- list(
  `main`  = my_cols("blue", "green", "yellow"),
  `cool`  = my_cols("blue", "green"),
  `cool2hot` = my_cols("lightblue2","lightblue", "blue","green", "green2","yellow","gold", "orange", "red"),
  `hot`   = my_cols("yellow", "orange", "red"),
  `mixed` = my_cols("lightblue", "green", "yellow", "orange", "red"),
  `mixed2` = my_cols("lightblue2","lightblue", "green", "green2","yellow","gold", "orange", "red"),
  `mixed3` = my_cols("lightblue2","lightblue", "green", "yellow","gold", "orange", "red"),
  `mixed4` = my_cols("lightblue2","lightblue", "green", "green2","yellow","gold", "orange", "red","purple"),
  `mixed5` = my_cols("lightblue","green", "green2","yellow","gold", "orange", "red","purple","blue"),
  `mixed6` = my_cols("green", "gold", "orange", "red","purple","blue"),
  `grey`  = my_cols("light grey", "dark grey")
)


my_pal <- function(palette = "main", reverse = FALSE, ...) {
  pal <- my_palettes[[palette]]
  
  if (reverse) pal <- rev(pal)
  
  colorRampPalette(pal, ...)
}


scale_color_mycol <- function(palette = "main", discrete = TRUE, reverse = FALSE, ...) {
  pal <- my_pal(palette = palette, reverse = reverse)
  
  if (discrete) {
    discrete_scale("colour", paste0("my_", palette), palette = pal, ...)
  } else {
    scale_color_gradientn(colours = pal(256), ...)
  }
}



scale_fill_mycol <- function(palette = "main", discrete = TRUE, reverse = FALSE, ...) {
  pal <- my_pal(palette = palette, reverse = reverse)
  
  if (discrete) {
    discrete_scale("fill", paste0("my_", palette), palette = pal, ...)
  } else {
    scale_fill_gradientn(colours = pal(256), ...)
  }
}

Discussion

The trends in real house prices are persistent and reflect long-term trends in economic growth. Housing markets in inelastically supplied areas (like the Bay Area) are even more sensitive to economic growth. Areas with strong income growth but inelastic housing supply experience rapid increases in real house prices as more competition over a relativley fixed supply of housing bids up values.

 Share!