Charting Jobs Friday with R

2018/03/11

economy / dataviz / R

LAST FRIDAY WAS JOBS FRIDAY, the day when the U.S. Bureau of Labor Statistics (BLS) releases its monthly employment situation report. This report is blanketed with media coverage and economist and financial analysts all over the world pay close attention to the report. The employment situation gives a read on trends in the world’s largest economy’s labor market. It also provides a clue about how monetary policy might unfold, affecting bond yields around the world.

In this post I want to share some code to plot trends from the report with R. Note that true Jobs Friday aficionados will have to modify this code, because the data is only updated with a lag. With so many folks dissecting the reports, speed might be desired and so you’d have to modify this approach to have the freshest charts ready to go in time to join the resonance room.

Data

We’ll use data form two sources. As we have in recent posts (see for example here), we’ll gather data via FRED. See here for more on using the quantmod and tidyquant packages to work with FRED data.

But we’ll also go straight to the BLS webpage BLS.gov and grab data directly from some flat files they provide. That will be similar to what we did in this post on the JOLTS data. Certain data we want isn’t available in FRED (at least not that I could find). Fortunately the text files from BLS are easy to work with.

Get data

The code below will get our data from FRED. This is a straightforward application of the tidyquant approach, see the posts above for more details.

#####################################################################################
## Step 1: Load Libraries ###
#####################################################################################
library(tidyverse)
library(tidyquant)
library(scales)
library(tibbletime)
library(data.table)
library(cowplot)



#####################################################################################
## Step 2: go get data ###
#####################################################################################

# Set up tickers
tickers<- c("PAYEMS", # nonfarm payroll employment
            "UNRATE", # unemployment rate
            "CIVPART", # civilian labor force pariticipation rate
            "EMRATIO", # employment-to-population ratio
            "NROU"  ) # estimate of natural rate of unemployment from U.S. Congressional Budget Office
mynames <- c("Nonfarm Payroll Employment",
            "Unemploymen Rate",
            "Labor Force Participation Rate",
            "Employment-to-Population Ratio",
            "Natural Rate of Unemployment")

mytickers<- data.frame(symbol=tickers,varname=mynames,  stringsAsFactors =FALSE)


# download data via FRED 
df<-tq_get(tickers,                         # get selected symbols
            get="economic.data",             # use FRED
            from="1948-01-01")               # go from 1954 forward

df <- left_join(df, mytickers, by="symbol")


#####################################################################################
## Step 3: get data ready for analysis ###
#####################################################################################

df %>% select(-varname) %>% 
  spread(symbol,price) -> df2


# Convert quarterly naturla rate (NROU) data to monthly data by "filling down" using na.locf

df2 %>% 
  mutate(NROU2=na.locf(NROU,na.rm=F)) %>% 
  mutate(UGAP2=UNRATE-NROU2,
         dj=c(NA,diff(PAYEMS)),
         # create indicators for shaded plot
         up=ifelse(UNRATE>NROU2,UNRATE,NROU2),
         down=ifelse(UNRATE<NROU2,UNRATE,NROU2)) -> df2

# Set up recession indicators

recessions.df = read.table(textConnection(
  "Peak, Trough
  1948-11-01, 1949-10-01
  1953-07-01, 1954-05-01
  1957-08-01, 1958-04-01
  1960-04-01, 1961-02-01
  1969-12-01, 1970-11-01
  1973-11-01, 1975-03-01
  1980-01-01, 1980-07-01
  1981-07-01, 1982-11-01
  1990-07-01, 1991-03-01
  2001-03-01, 2001-11-01
  2007-12-01, 2009-06-01"), sep=',',
  colClasses=c('Date', 'Date'), header=TRUE)

Now we can make some plots.

Employment growth

Let’s start by looking at monthly nonfarm job gains.

#####################################################################################
## Step 4: make some plots ###
#####################################################################################


ggplot(data=filter(df2,year(date)>1949), 
       aes(x=date,y=dj,
           color=ifelse(dj>0,"up m/m", 
                        ifelse(dj==0,"no change", "down m/m")),
           fill=ifelse(dj>0,"up m/m", 
                       ifelse(dj==0,"no change", "down m/m"))))+
  geom_col(alpha=0.85,color=NA)+
  #eom_rug()+
  scale_y_continuous(labels=scales::comma,sec.axis=dup_axis())+
  theme_minimal()+
  scale_color_manual(values=c("#d73027","#4575b4"),
                     name="Monthly change")+
  scale_fill_manual(values=c("#d73027","#4575b4"),
                    name="Monthly change")+
  geom_rug(sides="b")+
  scale_x_date(lim=as.Date(c("1950-01-01","2018-12-31")),date_breaks="5 years",date_labels="%Y")+
  #scale_x_date(date_labels="%Y",date_breaks="1 year")+
  theme(legend.position="none",
        plot.caption=element_text(hjust=0),
        plot.subtitle=element_text(size=14,face="italic",color="darkgray"),
        plot.title=element_text(size=14,face="bold",color="black"))+
  labs(x="",y="",
       title="89 consecutive months of positive month-over-month job growth",
       subtitle="Monthly change in U.S. nonfarm employment (1000s, SA)",
       caption="@lenkiefer Source: U.S. Bureau of Labor Statistics")

Unemployment rate

Now let’s chart the unemployment rate relative to the Congressional Budget Office’s estimate of the long-term natural rate of unemployment.

ggplot(data=filter(df2,!is.na(NROU2)),aes(x=date,y=UNRATE))+
  geom_rect(data=recessions.df, inherit.aes=F, aes(xmin=Peak, xmax=Trough, ymin=-Inf, ymax=+Inf), fill='darkgray', alpha=0.5) +
  geom_line(color="black")+
  geom_line(linetype=2,aes(y=NROU2))+
  geom_ribbon(aes(ymin=UNRATE,ymax=down),fill="#d73027",alpha=0.5)+
  geom_ribbon(aes(ymin=UNRATE,ymax=up),fill="#4575b4",alpha=0.5)  +
  scale_x_date(date_breaks="5 years",date_labels="%Y")+
  scale_y_continuous(sec.axis=dup_axis())+
  theme_minimal(base_size=8)+
  theme(legend.position="top",
        plot.caption=element_text(hjust=0),
        plot.subtitle=element_text(face="italic"),
        plot.title=element_text(size=16,face="bold"))+
  labs(x="",y="Percent",
       title="U.S. Unemployment Rate vs Natural Rate of Unemployment",
       subtitle="Solid line Unemployment Rate, dotted line Long-term Natural Rate of Unemployment",
       caption="@lenkiefer Data Source: U.S. Bureau of Labor Statistics, U.S. Congressional Budget Office,shaded bars NBER Recessions\nNatural Rate of Unemployment (Long-Term) retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/NROU, March 10, 2018")+
  geom_rug(aes(color=ifelse(UNRATE<=NROU2,"Below or Equal","Above")),sides="b")+
  scale_color_manual(values=c("#d73027","#4575b4"),name="Unemployment Rate Above/Below Natural Rate ")

The unemployment rate is below the CBO’s estimate natural rate of unemployment, but wage growth has been tepid. What’s going on? Let’s look at the labor force participation rate and the employment-to-population ratio

ggplot(data=df2, aes(x=date,y=CIVPART,label=CIVPART))+
  geom_rect(data=recessions.df, inherit.aes=F, aes(xmin=Peak, xmax=Trough, ymin=-Inf, ymax=+Inf), fill='darkgray', alpha=0.5) +
  geom_line(size=1.05)+theme_minimal()+
  geom_point(data=filter(df2,date==max(dfp$date)),size=2,alpha=0.75)+
  geom_text(data=filter(df2,date==max(dfp$date)),fontface="bold",size=4,nudge_y=.15)+
  scale_x_date(date_breaks="1 years",date_labels="%Y")+
  scale_y_continuous(sec.axis=dup_axis())+
  theme(legend.position="none",
        plot.caption=element_text(hjust=0),
        plot.subtitle=element_text(face="italic"),
        plot.title=element_text(size=16,face="bold"))+
  labs(x="",y="",title="Labor Force Participation Rate",
       subtitle="in percentage points (seasonally adjusted)",
       caption="@lenkiefer Source: U.S. Bureau of Labor Statistics, shaded bars NBER Recessions\nCivilian Labor Force Participation Rate [CIVPART], retrieved from FRED, Federal Reserve Bank of St. Louis; \nhttps://fred.stlouisfed.org/series/CIVPART, March 11, 2018.")+
  coord_cartesian(xlim=as.Date(c("2000-01-01","2018-03-01")),ylim=c(62,68))

ggplot(data=df2, aes(x=date,y=EMRATIO,label=EMRATIO))+
  geom_rect(data=recessions.df, inherit.aes=F, aes(xmin=Peak, xmax=Trough, ymin=-Inf, ymax=+Inf), fill='darkgray', alpha=0.5) +
  geom_line(size=1.05)+theme_minimal()+
  geom_point(data=filter(df2,date==max(dfp$date)),size=2,alpha=0.75)+
  geom_text(data=filter(df2,date==max(dfp$date)),fontface="bold",size=4,nudge_y=.25)+
  scale_x_date(date_breaks="1 years",date_labels="%Y")+
  scale_y_continuous(sec.axis=dup_axis())+
  theme(legend.position="none",
        plot.caption=element_text(hjust=0),
        plot.subtitle=element_text(face="italic"),
        plot.title=element_text(size=16,face="bold"))+
  labs(x="",y="",title="Employment-to-Population Ratio ",
       subtitle="in percentage points (seasonally adjusted)",
       caption="@lenkiefer Source: U.S. Bureau of Labor Statistics, shaded bars NBER Recessions\nEmployment-Population Ratio [EMRATIO], retrieved from FRED, Federal Reserve Bank of St. Louis;\nhttps://fred.stlouisfed.org/series/EMRATIO, March 11, 2018.")+
  coord_cartesian(xlim=as.Date(c("2000-01-01","2018-03-01")),ylim=c(58,65))

The participation ratio is partially affected by an aging population. It might better to restrict our attention to the prime working age population, those ages 25 to 54. We can get that data from FRED but I want to compare Men and Women. I couldn’t find that data in FRED, but the BLS has it. Let’s go get it from a flat file.

There is a file containing all the series_id for all the variables the BLS provides. It’s a large file, but I just want to find Participation Rates, so I’ll use a regular expression and some variable codes

#####################################################################################
## Step 4: get data direct from BLS ###
#####################################################################################

dfs<-fread("https://download.bls.gov/pub/time.series/ln/ln.series")

codes<-dfs[grepl("Participation Rate", series_title) &   # use regular expression
             ages_code==33 &                             # only ags 25 to 54
             periodicity_code =="M" &                    # only monthly frequence
             seasonal=="S"                               # only Seasonally adjusted
           ]
codes$var <- c("All","Men","Women")
codes <- select(codes, series_id, series_title, var)

# get all data (large file)
df.all<-fread("https://download.bls.gov/pub/time.series/ln/ln.data.1.AllData")

# filter data
dfp<-df.all[series_id %in% codes$series_id,]
#create date variable
dfp[,month:=as.numeric(substr(dfp$period,2,3))]
dfp$date<- as.Date(ISOdate(dfp$year,dfp$month,1) ) #set up date variable

dfp$v<-as.numeric(dfp$value)
# join on variable names, drop unused variables, convert to data.table
left_join(dfp, select(codes, series_id,series_title,var), by="series_id") %>%  select(series_id,series_title,var,date,v) %>% data.table() -> dfp

Let’s take a look at our cleaned up data.

str(dfp)

## Classes 'data.table' and 'data.frame':   2526 obs. of  5 variables:
##  $ series_id   : chr  "LNS11300060" "LNS11300060" "LNS11300060" "LNS11300060" ...
##  $ series_title: chr  "(Seas) Labor Force Participation Rate - 25-54 yrs." "(Seas) Labor Force Participation Rate - 25-54 yrs." "(Seas) Labor Force Participation Rate - 25-54 yrs." "(Seas) Labor Force Participation Rate - 25-54 yrs." ...
##  $ var         : chr  "All" "All" "All" "All" ...
##  $ date        : Date, format: "1948-01-01" "1948-02-01" ...
##  $ v           : num  64.2 64.6 64.3 64.8 64.3 65 65.4 65 65.4 65 ...
##  - attr(*, ".internal.selfref")=<externalptr>

Now we can plot it. Both recently…

ggplot(data=dfp, aes(x=date,y=v,color=var, label=var))+
  geom_rect(data=recessions.df, inherit.aes=F, aes(xmin=Peak, xmax=Trough, ymin=-Inf, ymax=+Inf), fill='darkgray', alpha=0.5) +
  geom_line(size=1.05)+theme_minimal()+
  geom_point(data=filter(dfp,date==max(dfp$date)),size=2,alpha=0.75)+
  geom_text(data=filter(dfp,date==max(dfp$date)),fontface="bold",size=4,nudge_y=1)+
  scale_x_date(date_breaks="1 years",date_labels="%Y")+
  scale_y_continuous(sec.axis=dup_axis())+
  theme(legend.position="none",
        plot.caption=element_text(hjust=0),
        plot.subtitle=element_text(face="italic"),
        plot.title=element_text(size=16,face="bold"))+
  labs(x="",y="",title="Labor Force Participation Rate: Prime Working Age (25-54)",
       subtitle="in percentage points (seasonally adjusted)",
       caption="@lenkiefer Source: U.S. Bureau of Labor Statistics, shaded bars NBER Recessions")+
  coord_cartesian(xlim=as.Date(c("2000-01-01","2018-03-01")),ylim=c(70,95))

…and over history.

ggplot(data=dfp, aes(x=date,y=v,color=var, label=var))+
  geom_rect(data=recessions.df, inherit.aes=F, aes(xmin=Peak, xmax=Trough, ymin=-Inf, ymax=+Inf), fill='darkgray', alpha=0.5) +
  geom_line(size=1.05)+theme_minimal()+
  geom_point(data=filter(dfp,date==max(dfp$date)),size=2,alpha=0.75)+
  geom_text(data=filter(dfp,date==max(dfp$date)),fontface="bold",size=4,nudge_y=2)+
  scale_x_date(date_breaks="5 years",date_labels="%Y")+
  scale_y_continuous(sec.axis=dup_axis())+
  #scale_color_mycol(palette="main",discrete=T,name="Labor force participation rate (%)")+
  theme(legend.position="none",
        plot.caption=element_text(hjust=0),
        plot.subtitle=element_text(face="italic"),
        plot.title=element_text(size=16,face="bold"))+
  labs(x="",y="",title="Labor Force Participation Rate: Prime Working Age (25-54)",
       subtitle="in percentage points (seasonally adjusted)",
       caption="@lenkiefer Source: U.S. Bureau of Labor Statistics, shaded bars NBER Recessions")

Quite a lot of stories there.

Charting Jobs Friday with R

Chart Jobs Friday using R.

Data

Get data

Employment growth

Unemployment rate

Share!