LET’S TAKE A LOOK AT RECENT TRENDS IN CONSUMER PRICES AND HOUSEHOLD DEBT. Along the way we’ll refresh some visualizations of consumer prices (see here) and household debt (see here) we made last year, as well as think up some new ones.
As usual we’ll use R to generate the plots and I’ll share the code below. But before we get into the details of constructing the charts, let’s just look at two plots to help set the stage.
Two possibly related charts
First, let’s examine trends in U.S. consumer prices since the year 2000. We’ll get into the data details later, but these data are from the U.S. Bureau of Labor Statistics BLS.gov. The Consumer Price Index (CPI) LINK tracks “changes in the prices paid by urban consumers for a representative basket of goods and services”.
These data are normalized so that January 2000 is equal to 100. The trend lines compare how the seasonally-adjusted price indices have evolved since then. In case you didn’t notice, I’ve made the education line thicker to call attention to it. From January 2000 to April 2017 (latest data available) the CPI for all items was up 44 percent (144 in the chart) vs 129 percent (229 in the chart) for the education index.
There are important niceties see again the BLS about how the index is constructed and what it really measures. The important point is education costs are up a lot more than overall prices since 2000.
On a possibly related note, consider this chart showing household debt trends using data from the New York Federal Reserve Bank/Equifax’s consumer credit panel LINK.
Got it? Price up, debt up. Let’s start building some more visualizations of these data and explore these trends in more detail.
Step 1: get the data
The data for these plots are conveniently made available by the BLS and the New York Fed.
A quick note on data manipulation. I have been cycling between using a data.table based approach for data wrangling and a dplyr/tidyverse based approach. Both seem to work. For the CPI data I’ll mostly use data.table and for the household debt data I’ll mostly use dplyr.
Consumer price data
First, we’ll follow my CPI post to get the Consumer Price data.
Household debt data
The household debt data is available in an Excel spreadsheet. We’ll have to use readxl (see this post for more on wrangling Excel data with R). Let’s start by assuming that you have downloaded the latest credit panel summary data (available here .xlsx file) and saved it as HHD_C_Report_2017Q1.xlsx in a data directory.
Now in order to get this data useable, we have to use readxl to reach into the files and extract the useful information. I had originally been doing some manual manipulation in Excel, but ewww gross, let’s try to do it all with R.
It’s going to involve understanding the structure of the Excel file and using some of readxl’s nice features.
This spreadsheet is-sigh-well, set up like spreadsheets are wont to be. The data we want is in the “Page 3 Data” sheet, organized with variables in rows and data in columns. The row titles we want are not contiguous with the data (there’s a gap of a couple columns). We’ll have to import the data and try to beat it into tidy submission so we can actually work with it.
Hey, not so bad. Let’s make a table looking at the last quarter of data (2017 Q1).
Aggregate Household Debt ($ Trillions, 2017 Q1)
Source: FRBNY Consumer Credit Panel/Equifax
Step 2: make some awesome charts
Now that we’ve got our data, we can start making some awesome charts. Or maybe I should say continue making awesome charts, because we never really stop do we?
Let’s get to it.
Venerable line charts
We’ll get crazy I promise but let’s start simple with the venerable line chart. The code below will recreate the plots we started with.
CPI line chart
Household debt line chart
Inflation line chart
The CPI line chart compares price changes over a long horizon. Let’s focus on more recent changes. The code below generates a plot of monthly price inflation by category.
Wow, that’s pretty ugly, transportation is volatile and it’s hard to see what’s going on in the plot. Let’s use faceting to make a small multiple (also, let’s drop volatile transportation/commodities and recreation so we have 6 categories). Let’s also zoom in on trends since 2010.
Getting better. Still quite a bit of noise in some of the monthly measures. Let’s try looking at year-over-year percent changes to smooth out some of that noise.
Now we have a clearer picture of recent trends. Overall inflation has been rising over the past year or so (thanks largely to recovering energy prices). Both Medical Care and Education are rising at a level well above general inflation, but seem to be trending down. Housing, which measures rents and owner’s equivalent rent not house prices (see for example question 2 here), is on the upswing.
Let’s build some horizon charts (see my horizon chart post for everything you want, everything you need) to analyze inflation trends. Making horizon charts was only possible due to code from @timelyportfolio, lots of cool stuff over at his page.
These widgets are cool, but I’m not doing much with them here. We’ll do more later.
Lollipop charts have been under fire recently, but I still like them. See here for a more spirited defense of the chart type from Andy Cotgreave.
We’ll just go ahead and make one. If you don’t like them, the nice thing about ggplot2 is it’s pretty easy to just swap in a bar.
Let’s make one.
This plot compares how much prices have risen relative to 2000. Essentially, this is just the endpoint of the line plot.
State debt trends
Let’s move over to the household debt data and look at state debt trends. In the same data file (different) worksheet we got from the New York Fed we can get a time series of per capita debt levels for the U.S. and some large states.
Debt levels are starting to rise, but remain (on a per capita basis) below pre-Great Recession levels.
State delinquency trends
Let’s look at consumer delinquency trends by state. These data are in another worksheet in the same spreadsheet. Our mastery of a couple readxl tricks is paying off bigtime!
After rising very high during the Great Recession and its aftermath, delinquency rates are coming down. Let’s try to rework this line chart as a dot chart.
State delinquency dot chart
First, we’ll compute the max and min delinquency rate for each state, and then plot the 2017Q1 value as a dot with markers to indicate the min/max. We’ll also add the U.S. state="ALL"" in our data as another reference. I’ve done these calculations before, usually the data.table way, but we’ll do it the dplyr way here.
Improving the plot with reordering
Ooh, I learned something else I should share. The data below are ordered according alphabetically, which isn’t the best way. Let’s reorder the data based on the current value. We can do it easily thanks to the forcatsfct_reorder function .
In this plot we can compare more easily than with the small multiple lines. Notice that the range of variation is quite different depending on which state you are looking at. Nevada (NV) had delinquency rates reach over 20 percent. Ohio (OH) on the other hand, never saw delinquency rates over 10 percent. We can also see that while the level of delinquency has declined, it’s not at a minimum for any of the states tracked.
These data lend themself to animations.
Here’s some animated gifs of some of the plots:
Time to wrap up for today. We refreshed some old charts and introduced a couple new ones. We did some data wrangling with data.table() and dplyr. We used readxl to reach into Excel files and get our data. And we tried out forcats to reorder labels in our chart. All this enabled us wrangle our data and get it ready to use ggplot2 to visualize trends in consumer prices and household debt.
Hope you found this useful. How could these code snippets and data visualizations work for you?