WE ARE ON OUR WAY TOWARDS BUILDING a tidy PowerPoint workflow. In this post I want to build on my earlier posts (see here for an introduction and here for a more sophisticated approach) for building a PowerPoint presentation with R and try to make it even purrrtier.

I saw that somebody shared my posts on reddit and I thought I would take a look at the comments. Folks on the internet are known for kindness and offering helpful advice, right?

The comments on the post were pretty constructive. One issue brought up was that the slides we made suffered because “The figures look all stretched out and not all that professional.” And indeed they were. I was focusing on the workflow and grabbed some images I had handy that weren’t even sized properly for the slides I was using.

We can fix that by using some other tools to export high quality images into PowerPoint.

Using officer with rvg

We can combine officer with the rvg package to export R graphics as high quality vector graphics into our PowerPoint. These images have the added advantage that we can edit them within PowerPoint, should we choose to do so.

Making Purrrtier PowerPoint

Here’s the plan: We’ll go grab some data, then make some nice plots and then we’ll insert them into PowerPoint as vector graphics.

Get some data

For this example, we’ll plot U.S. house price trends at the state level. We’ll go to the FHFA webpage and download purchase-only house price indices for all 50 states plus the District of Columbia and plot the 4-quarter percent change in prices.

#####################################################################################
## Load libraries ##
#####################################################################################
library(data.table)
library(ggridges)
library(tidyverse)
library(tidyquant)
library(officer)
library(rvg)
library(viridis)
library(scales)

#####################################################################################
## Get house price data
#####################################################################################
#read in data available as a text file from the FHFA website:
fhfa.data<-fread("http://www.fhfa.gov/DataTools/Downloads/Documents/HPI/HPI_PO_state.txt")
fhfa.data[,date:=as.Date(ISOdate(yr,qtr*3,1))] #make a date 
df.hpi<-fhfa.data %>% 
  group_by(state) %>% 
  mutate(hpa=Delt(index_nsa,k=4)) %>%   # create 4-quarter % change in index
  ungroup()

# Let's take a look
knitr::kable(tail(df.hpi %>% arrange(date,state) %>% select(date,state,hpa) %>% map_if(is.numeric,percent) %>% data.frame(),10), row.names = FALSE)
date        state  hpa
2017-06-01  SD     5.3%
2017-06-01  TN     7.6%
2017-06-01  TX     8.2%
2017-06-01  UT     9.2%
2017-06-01  VA     4.4%
2017-06-01  VT     3.6%
2017-06-01  WA     12.4%
2017-06-01  WI     6.2%
2017-06-01  WV     -1.1%
2017-06-01  WY     0.9%
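
To make the 4-quarter percent change concrete, here is a minimal sketch on made-up numbers. It uses dplyr::lag() in place of Delt(), which is a swapped-in technique rather than what the code above does; the toy_index data frame and its values are invented for illustration, and only the column names (state, yr, qtr, index_nsa) follow the FHFA file.

```r
library(dplyr)

# Hypothetical toy data: two made-up "states", six quarters each
toy_index <- data.frame(
  state     = rep(c("AA", "BB"), each = 6),
  yr        = rep(c(2016, 2016, 2016, 2016, 2017, 2017), times = 2),
  qtr       = rep(c(1, 2, 3, 4, 1, 2), times = 2),
  index_nsa = c(100, 101, 102, 103, 105, 106.05,
                200, 200, 200, 200, 190, 210)
)

toy_hpa <- toy_index %>%
  # same date construction as above: last month of the quarter, day 1
  mutate(date = as.Date(ISOdate(yr, qtr * 3, 1))) %>%
  group_by(state) %>%
  arrange(date, .by_group = TRUE) %>%
  # index now divided by index four quarters earlier, minus one
  mutate(hpa = index_nsa / lag(index_nsa, 4) - 1) %>%
  ungroup()

print(toy_hpa)
```

The first four quarters of each state come out as NA because there is no observation four quarters earlier to compare against. Grouping by state before lagging matters: without it, the lag would reach across state boundaries.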

Make a deck

Let’s imagine that our goal is to make a 51-page slide deck with one chart on each page, and we’d like the resolution on the charts to be good. No problem, we can use our strategy from before along with the rvg package to get the job done.

Make some functions

To make our life easier, it will be helpful to make some simple functions. We’re going to create two functions. One simple function will filter our data given a state. The second function will take an input dataset and plot it.

# Get a list of states, will be useful later
s.list<-unique(df.hpi$state)

#####################################################################################
## Filter dataset function
#####################################################################################

f.state<-function(s="CA"){
  filter(df.hpi,state==s) %>% map_if(is.character,as.factor) %>% data.frame()
}

#####################################################################################
## Make plot function##
#####################################################################################

plotf<-function(in.df=f.state("CA")){
  ggplot(data=in.df, aes(x=date,y=hpa))+
    geom_line(data=df.hpi, aes(group=state),alpha=0)+
    theme_minimal()+
    geom_ridgeline_gradient(aes(y=0,height=hpa,fill=hpa),min_height=-1)+
    scale_fill_viridis(option="C",limit=c(-.35,.35))+
    #geom_area(fill="royalblue",alpha=0.35)+
    geom_line(color="black",size=1.1)+
    labs(x="",y="",
         title=paste0("House price index for ",head(in.df,1)$state), 
         subtitle="4-quarter percent change in index, not seasonally adjusted",
         caption="@lenkiefer Source U.S. Federal Housing Finance Agency,  Purchase-only house price index by state")+
    theme(legend.position="none",plot.caption=element_text(hjust=0))+
    scale_y_continuous(labels=scales::percent,sec.axis=dup_axis(), breaks=seq(-1,1,.05))+
    scale_x_date(date_breaks="5 years",date_labels="%Y")
}

Now we can try it out via:

plotf()
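
One detail in plotf() is worth a closer look: the first geom_line() draws every state with alpha=0. As far as I can tell, its job is to keep the axes the same on every slide, since ggplot2 trains its scales on all layers, visible or not. Here is a small sketch of that effect on made-up data (the toy data frame and its values are invented for illustration):

```r
library(ggplot2)

# Hypothetical toy data: "AA" moves in a narrow band, "BB" in a wide one
toy <- data.frame(
  state = rep(c("AA", "BB"), each = 4),
  date  = rep(seq(as.Date("2016-03-01"), by = "quarter", length.out = 4), times = 2),
  hpa   = c(0.01, 0.02, 0.03, 0.04, -0.10, 0.00, 0.10, 0.20)
)
one_state <- toy[toy$state == "AA", ]

# Plot of one state only: the y axis fits that state's data
p_free <- ggplot(one_state, aes(x = date, y = hpa)) +
  geom_line()

# Same plot plus a fully transparent layer with all states:
# nothing extra is visible, but the y scale now spans the whole data set
p_fixed <- ggplot(one_state, aes(x = date, y = hpa)) +
  geom_line(data = toy, aes(group = state), alpha = 0) +
  geom_line()

# Data range seen by the y scale in each plot
range_free  <- layer_scales(p_free)$y$range$range
range_fixed <- layer_scales(p_fixed)$y$range$range
print(range_free)
print(range_fixed)
```

Setting limits explicitly with scale_y_continuous(limits = ...) would be another way to get the same result; the invisible layer just saves you from computing the range yourself.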

Write the deck

Now we proceed as before but instead of plugging in an image file with ph_with_img we’ll plug in a vector graphic with rvg::ph_with_vg_at. Recall that we’re loading a blank PowerPoint template first.

# Load blank.pptx, an empty powerpoint that serves as a template
my_pres<-read_pptx("data/blank.pptx")

# function for adding slides
# updated 11/3/2017 to fix function references (was mp2, should be mp)
myp<- function(i){
  my_pres %>% 
    add_slide(layout = "Blank", master = "Office Theme") %>%
    ph_with_vg_at( code=print(plotf(f.state(s.list[i]))) , 0.1, 0.1, 9.8,7.3) ->
    my_pres
}

# use purrr::walk() to add one slide per state
# (N was never defined above; loop over the indices of s.list instead)
walk(seq_along(s.list),myp)

# save the .pptx file
my_pres %>%
  print( target = "hpi.pptx") %>% 
  invisible()
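
If you are on a newer release of officer and rvg, note that ph_with_vg_at has since been superseded: the vector graphic is wrapped in rvg::dml() and placed with officer::ph_with() and ph_location(). Below is a rough sketch of the same idea with that interface, not a drop-in replacement for the code above. It makes a few assumptions: the four numbers above are read as left, top, width and height in inches; officer's built-in default template stands in for data/blank.pptx; and the toy data, toy_plot() and add_state_slide() are hypothetical names made up for this example. It also uses purrr::reduce() in place of walk(), so the deck is passed along explicitly from one slide to the next.

```r
library(officer)
library(rvg)
library(ggplot2)
library(purrr)

# Hypothetical toy data standing in for df.hpi (values are made up)
toy <- data.frame(
  state = rep(c("AA", "BB", "CC"), each = 8),
  date  = rep(seq(as.Date("2015-03-01"), by = "quarter", length.out = 8), times = 3),
  hpa   = c(seq(0.01, 0.08, length.out = 8),
            seq(0.05, -0.02, length.out = 8),
            rep(0.03, 8))
)
toy_states <- as.character(unique(toy$state))

# One simple line chart per state (a stand-in for plotf())
toy_plot <- function(s) {
  ggplot(toy[toy$state == s, ], aes(x = date, y = hpa)) +
    geom_line() +
    labs(title = paste0("House price index for ", s), x = "", y = "")
}

# Add a blank slide, drop in the plot as an editable vector graphic,
# and return the updated presentation object
add_state_slide <- function(pres, s) {
  pres <- add_slide(pres, layout = "Blank", master = "Office Theme")
  ph_with(pres,
          value = dml(ggobj = toy_plot(s)),
          location = ph_location(left = 0.1, top = 0.1, width = 9.8, height = 7.3))
}

# reduce() feeds the deck returned by one call into the next call
deck <- reduce(toy_states, add_state_slide, .init = read_pptx())

# Write to a temporary file so the sketch leaves no clutter behind
out_file <- file.path(tempdir(), "toy_hpi.pptx")
print(deck, target = out_file)
```

Swapping toy_states for s.list and toy_plot(s) for plotf(f.state(s)) should get you back to the full 51-slide deck, assuming your template has a "Blank" layout under the "Office Theme" master.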

You can download the deck here. Below I’ve embedded a pdf version of the slidedeck.

How can we use this?

If you open up the PowerPoint file you’ll notice that you can edit the images, for good or for evil. You could do some additional editing within PowerPoint. Or you could work with transitions and make a little animation.