22 January 2017

INTERACTIVE DASHBOARDS CAN BE AN EFFECTIVE WAY to explore and present data. Recently, I have been using flexdashboards created with R. Over January 2017 I’ve posted the following examples:

For each of these, you can get the code by clicking the source link in the upper right corner of the visualization at its respective link. I tried to include helpful comments in the code, but it might still be hard to build your own from scratch. The documentation for flexdashboards is good and there are several examples in the gallery you can learn from, but I thought I’d take some time to walk through the construction of a new flexdashboard.

The Plan

We’ll build an interactive flexdashboard to explore trends in house prices across several areas. In this example I’m going to try to show you the following:

  • How to set up a multipage dashboard and use a storyboard on one page
  • How to use plotly to create an interactive chart
  • How to combine plotly with crosstalk to add more interactions
  • How to animate a plotly chart

The data

For this project I’m going to revisit the house price data we used in our house price meditations. These data vary over both space and time, and they have interesting hierarchies (metro, state, Census region) that we will explore.

While data wrangling is an important subject (see for example, this post on wrangling house price data), I don’t want to distract from the dashboard. For this post, we’ll begin with our data compiled.

The data structure is fairly simple. We have columns corresponding to date, metro name, primary state for the metro area (the state of the metro’s principal city), Census region of the primary state (based on Census definitions), the house price index, and the latitude and longitude of the principal city for the metro area. We’ve also computed the 12-month percent change in the house price index, named hpa12.
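The post does not show how hpa12 was computed, so the following is only a sketch of one plausible approach: divide each month's index by its value 12 months earlier, within each metro, and subtract one. The toy data and the metro name "Metro A" are made up for illustration; only the column names geo, date, hpi and hpa12 come from the post.

```r
library(data.table)

# Toy monthly index for one hypothetical metro: 24 months growing 0.5% a month.
toy <- data.table(
  geo  = "Metro A",
  date = seq(as.Date("2015-01-01"), by = "month", length.out = 24),
  hpi  = 100 * 1.005^(0:23)
)

# Sort so that "12 rows back" really means "12 months back" within each metro.
setorder(toy, geo, date)

# 12-month percent change as a fraction (0.06 means 6%), computed per metro.
# The first 12 rows of each metro have no year-ago value and are NA.
toy[, hpa12 := hpi / shift(hpi, 12) - 1, by = geo]
```

Storing the change as a fraction rather than a percentage matches the values in the table (for example 0.06) and lets a percent formatter handle the display.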

We’re using the Freddie Mac House Price Index.

I’ve arranged these data and saved them as a simple csv file.

Here’s how these data look (examining the metros):

House Price Data

| row | date | geo | statename | region | hpi | lat | long | hpa12 |
|---|---|---|---|---|---|---|---|---|
| 1 | Sep 01,2016 | Phoenix-Mesa-Scottsdale, AZ | Arizona | West Region | 177.31 | 33.54 | -112.07 | 0.06 |
| 2 | Sep 01,2016 | Los Angeles-Long Beach-Anaheim, CA | California | West Region | 242.37 | 34.11 | -118.41 | 0.07 |
| 3 | Sep 01,2016 | Riverside-San Bernardino-Ontario, CA | California | West Region | 203.1 | 33.94 | -117.4 | 0.06 |
| 4 | Sep 01,2016 | Sacramento--Roseville--Arden-Arcade, CA | California | West Region | 184.01 | 38.57 | -121.47 | 0.09 |
| 5 | Sep 01,2016 | San Diego-Carlsbad, CA | California | West Region | 207.68 | 32.81 | -117.14 | 0.08 |
| 6 | Sep 01,2016 | San Francisco-Oakland-Hayward, CA | California | West Region | 202.34 | 37.77 | -122.45 | 0.09 |
| 7 | Sep 01,2016 | Denver-Aurora-Lakewood, CO | Colorado | West Region | 181.02 | 39.77 | -104.87 | 0.11 |
| 8 | Sep 01,2016 | Washington-Arlington-Alexandria, DC-VA-MD-WV | District of Columbia | South Region | 205.34 | 38.91 | -77.02 | 0.03 |
| 9 | Sep 01,2016 | Miami-Fort Lauderdale-West Palm Beach, FL | Florida | South Region | 215.97 | 25.78 | -80.21 | 0.1 |
| 10 | Sep 01,2016 | Orlando-Kissimmee-Sanford, FL | Florida | South Region | 170.62 | 28.5 | -81.37 | 0.1 |
| 11 | Sep 01,2016 | Tampa-St. Petersburg-Clearwater, FL | Florida | South Region | 183.43 | 27.96 | -82.48 | 0.11 |
| 12 | Sep 01,2016 | Atlanta-Sandy Springs-Roswell, GA | Georgia | South Region | 134.06 | 33.76 | -84.42 | 0.08 |
| 13 | Sep 01,2016 | Chicago-Naperville-Elgin, IL-IN-WI | Illinois | Midwest Region | 128.3 | 41.84 | -87.68 | 0.05 |
| 14 | Sep 01,2016 | Boston-Cambridge-Newton, MA-NH | Massachusetts | Northeast Region | 162.05 | 42.34 | -71.02 | 0.06 |
| 15 | Sep 01,2016 | Baltimore-Columbia-Towson, MD | Maryland | South Region | 178.69 | 39.3 | -76.61 | 0.03 |
| 16 | Sep 01,2016 | Detroit-Warren-Dearborn, MI | Michigan | Midwest Region | 103.23 | 42.38 | -83.1 | 0.07 |
| 17 | Sep 01,2016 | Minneapolis-St. Paul-Bloomington, MN-WI | Minnesota | Midwest Region | 143.04 | 44.96 | -93.27 | 0.05 |
| 18 | Sep 01,2016 | Kansas City, MO-KS | Missouri | Midwest Region | 137.99 | 39.12 | -94.55 | 0.08 |
| 19 | Sep 01,2016 | St. Louis, MO-IL | Missouri | Midwest Region | 135.58 | 38.64 | -90.24 | 0.05 |
| 20 | Sep 01,2016 | Charlotte-Concord-Gastonia, NC-SC | North Carolina | South Region | 146.26 | 35.2 | -80.83 | 0.08 |
| 21 | Sep 01,2016 | Las Vegas-Henderson-Paradise, NV | Nevada | West Region | 150.66 | 36.21 | -115.22 | 0.09 |
| 22 | Sep 01,2016 | New York-Newark-Jersey City, NY-NJ-PA | New York | Northeast Region | 168.58 | 40.67 | -73.94 | 0.04 |
| 23 | Sep 01,2016 | Cincinnati, OH-KY-IN | Ohio | Midwest Region | 122.06 | 39.14 | -84.51 | 0.06 |
| 24 | Sep 01,2016 | Portland-Vancouver-Hillsboro, OR-WA | Oregon | West Region | 212.83 | 45.54 | -122.66 | 0.13 |
| 25 | Sep 01,2016 | Philadelphia-Camden-Wilmington, PA-NJ-DE-MD | Pennsylvania | Northeast Region | 166.12 | 40.01 | -75.13 | 0.03 |
| 26 | Sep 01,2016 | Pittsburgh, PA | Pennsylvania | Northeast Region | 157.07 | 40.44 | -79.98 | 0.04 |
| 27 | Sep 01,2016 | Dallas-Fort Worth-Arlington, TX | Texas | South Region | 176.13 | 32.79 | -96.77 | 0.11 |
| 28 | Sep 01,2016 | Houston-The Woodlands-Sugar Land, TX | Texas | South Region | 186.68 | 29.77 | -95.39 | 0.04 |
| 29 | Sep 01,2016 | San Antonio-New Braunfels, TX | Texas | South Region | 190.4 | 29.46 | -98.51 | 0.07 |
| 30 | Sep 01,2016 | Seattle-Tacoma-Bellevue, WA | Washington | West Region | 203.79 | 47.62 | -122.35 | 0.13 |

Source: Freddie Mac House Price Index

For tractability I restricted the number of metro areas to 30, roughly corresponding to the 30 largest metro areas based on population.

Here’s a map view of the metros in our data:

[Figure: map of the metro areas in our data, from chunk jan222017-map1]

Building a dashboard

The idea behind this dashboard is to compare housing market conditions across areas and across time. These data automatically lend themselves to these comparisons. Indeed, the very nature of a house price index is to compare trends in average quality-adjusted house prices over time.

Flexdashboards are a powerful tool for visualizing data. We will combine multiple interactive plots together into a single self-contained webpage.

Getting started - Data

First we need to load our data. As we’re going to use crosstalk to enable our widgets to talk to each other, we’ll also need to set up some Shared Data. The shared data can act like a data frame in compatible HTML widgets, but it also responds to selections and filters.

In the code below we load the data with metro house prices and create the Shared Data:

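library(data.table)  # fread() and year()
library(dplyr)       # group_by()
library(crosstalk)   # SharedData

df<-fread("data/hpimetro.csv")
df$date<-as.Date(df$date, format="%m/%d/%Y")
# Set up metro data for cross talk:
df.metro<-group_by(df[year(date)>1999,],geo)
sd.metro <- SharedData$new(df.metro, ~geo)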
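To get a feel for what a Shared Data object holds, here is a small sketch on made-up data. The toy data frame, the metro names and the group name "toy-metros" are assumptions for illustration; the methods used (key, origData, groupName) are part of crosstalk's SharedData class.

```r
library(crosstalk)

# Toy data: two hypothetical metros with a few index values each.
toy <- data.frame(
  geo = c("Metro A", "Metro A", "Metro B"),
  hpi = c(100, 101, 120)
)

# Wrap the data frame; the key formula says which column identifies the
# rows that should be selected or filtered together (here, the metro).
sd.toy <- SharedData$new(toy, key = ~geo, group = "toy-metros")

keys  <- sd.toy$key()        # one key value per row
plain <- sd.toy$origData()   # the underlying data frame, unchanged
grp   <- sd.toy$groupName()  # widgets sharing this group name are linked
```

Keying by metro rather than by row means that selecting a metro in one widget affects every row for that metro in the linked widgets, which is the behavior the dashboard relies on.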

Layout

Our dashboard is going to have several pages. To explore the different layout options we’ll create four pages:

  1. A general information/about page
  2. A storyboard page
  3. A page with an interactive widget we can filter
  4. A page with an animated chart

Getting started

Let’s walk through the construction of each of these individual pages, starting with the landing page.

About

The about page is quite important as it is where our new visitors will land. We want to include a brief description along with some hints at what else is in the dashboard. But we want to do it without overwhelming visitors. For this page we’ll include:

  1. Short introductory text
  2. A map (same as above) showing the metros in our data.

About {data-navmenu="Explore"}
===================================== 

Column {data-width=200}
-------------------------------------

### About this flexdashboard

This dashboard allows you to explore trends in house prices across 30 large metro areas. The metro areas covered are depicted in the nearby map.  The map is colored  according to Census regions.  We picked the 30 largest metro areas based on population. Explore the different data visualizations above.

Column {data-width=800}
-------------------------------------

### Areas covered

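# Run this to create the map.
# Note: df.state (state-level data) and states_map (state polygons) are used below
# but their construction is not shown in this post.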
#```{r jan222017-ex1-map,echo=F}
g.map<-
  ggplot(df[date=="2016-09-01" ], aes(x = long, y = lat,label=geo)) +
  geom_map(data=df.state[date=="2016-09-01",],aes(fill = region,map_id=tolower(statename)), map = states_map,alpha=0.25)+
  borders("state",  colour = "grey70",alpha=0.4)+
  theme_void()+
  scale_fill_viridis(name="Census Region",discrete=T,option="C")+
  theme(legend.position="top",
        plot.title=element_text(face="bold",size=18))+
  geom_point(alpha=0.85,color="black",size=2)+
  geom_text(hjust=0,size=1.75,nudge_y=-0.7)+
  labs(title="Metro areas in our data",
       subtitle="30 large metro areas",
       caption="@lenkiefer Metro population based on U.S. Census: http://www.census.gov/programs-surveys/popest.html")+
  theme(plot.caption=element_text(hjust=0,size=7))
# ```

Some things to note about this page. We want the navigation to be collapsed. The default is for each page to get its own link on the top navigation, but by declaring About {data-navmenu="Explore"} we force this page to fall under the “Explore” link at the top. We also want the map to take up most of the space, so we set {data-width=200} for the first column and {data-width=800} for the second. Because these widths are relative, the map gets 800/(200+800) = 80% of the available width.

Storyboard

Now we can move on to the second page, which uses a storyboard. Consider the code below:

Storyboard {.storyboard data-navmenu="Explore"}
=========================================

### Map of areas we plot

#```{r}
g.map
#```

### Small multiple, House Price Index

#```{r sm-1-jan22-2017,fig.width=10}
g1<-ggplot(data=df,aes(x=date,y=hpi))+geom_line()+facet_wrap(~geo)+
  theme_minimal()+labs(x="",y="",title="House price index by metro",
                       caption="@lenkiefer Source: Freddie Mac House Price Index")+
  theme(plot.caption=element_text(hjust=0,size=7),
        plot.title=element_text(size=10),
        strip.text=element_text(size=4),
        axis.text.x=element_text(size=4),
        axis.text.y=element_text(size=5))
g1
#```

### Small multiple, Annual house price appreciation

#```{r sm-2-jan22-2017,fig.width=10}
g2<-ggplot(data=df,aes(x=date,y=hpa12))+geom_line()+facet_wrap(~geo)+
    theme_minimal()+labs(x="",y="",title="Annual percent change in house price index by metro",
                       caption="@lenkiefer Source: Freddie Mac House Price Index")+
  scale_y_continuous(labels=scales::percent)+
  theme(plot.caption=element_text(hjust=0,size=7),
        plot.title=element_text(size=10),
        strip.text=element_text(size=4),
        axis.text.x=element_text(size=4),
        axis.text.y=element_text(size=5))
g2
#```

We start the storyboard page by declaring that this page has a storyboard structure with Storyboard {.storyboard data-navmenu="Explore"}. Note we also force this page to belong under the “Explore” navigation. Adding .storyboard tells the flexdashboard to arrange the page’s subsections on separate storyboard panes.

In the code above I included the first three panes (corresponding to the map g.map and graphs g1 and g2). In the full dashboard I actually include 7 panes. The header text (the lines starting with ###) becomes each pane’s caption in the storyboard’s navigation filmstrip.

Interactive chart

So far, the elements we have included are standard and well described in the flexdashboard documentation. These next two pages are more complex. The first, an interactive chart, uses crosstalk and plotly to create a dynamic interactive chart in a static webpage.

Crosstalk allows htmlwidgets to talk to one another on a static webpage. We are going to create three plotly graphs, link them via crosstalk, and add a filter box.

First we need to create the widgets, which are individual plotly charts:

g.map<-
  ggplot(sd.metro, aes(x = long, y = lat)) +
  borders("state",  colour = "grey70",fill="lightgray",alpha=0.5)+
  theme_void()+
  theme(legend.position="none",
        plot.title=element_text(face="bold",size=18))+
  geom_point(alpha=0.82,color="black",size=3)+
  labs(title="Selected Metro(s)",
       subtitle=head(df,1)$geo,
       caption="@lenkiefer Source: Freddie Mac House Price Index through September 2016")+
  theme(plot.caption=element_text(hjust=0))

p0<-
   plot_ly(data=sd.metro,x = ~date, y = ~hpi, height=750) %>% 
    add_lines(name="Index",colors="gray",alpha=0.7) %>% 
    add_lines(name="All metros",data=df,x=~date,y=~hpi,
              colors="black",color=~geo,alpha=0.1,showlegend=F,hoverinfo="none") %>%
     layout(title = "House Price Trends by Metro",xaxis = list(title="Date"), yaxis = list(title="House Price Index"))

p1<-
   plot_ly(data=sd.metro,x = ~date, y = ~hpa12, height=750) %>% 
    add_lines(name="Annual % change",colors="gray",alpha=0.7) %>% 
    add_lines(name="All metros",data=df,x=~date,y=~hpa12,
              colors="black",color=~geo,alpha=0.1,showlegend=F,hoverinfo="none") %>%
     layout(title = "House Price Trends by Metro",xaxis = list(title="Date"), yaxis = list(title="Annual % Change in House Price Index"))

g.map is a ggplot2 graph while p0 and p1 are plotly graphs. We will apply ggplotly to convert our ggplot map into a plotly widget.
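As a small illustration of that conversion, the sketch below builds a stripped-down ggplot2 scatter and passes it to ggplotly. The two points use the Phoenix and Los Angeles coordinates from the table above; the object names toy, g.toy and w.toy are made up for this example.

```r
library(ggplot2)
library(plotly)

# Two metros from the table above, with shortened labels.
toy <- data.frame(
  geo  = c("Phoenix", "Los Angeles"),
  long = c(-112.07, -118.41),
  lat  = c(33.54, 34.11)
)

# An ordinary static ggplot2 graph.
g.toy <- ggplot(toy, aes(x = long, y = lat, label = geo)) +
  geom_point(size = 3) +
  theme_void()

# ggplotly() translates the ggplot object into a plotly htmlwidget,
# which can then sit next to other widgets on the page.
w.toy <- ggplotly(g.toy)
```

The result is an htmlwidget like p0 and p1, so it can be passed to the same layout functions.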

Once we have the graphs, we can combine them using the crosstalk function bscols and include a filter_select to filter the charts. The code is not very long:

bscols(widths=c(2,6,4),
  list(filter_select("metro", "Select metro to highlight for plot", sd.metro, ~geo,multiple=FALSE)),
  subplot(p0,p1,nrows=2,titleY=T),
  ggplotly(g.map)
  )

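The bscols function first allocates our graphs across the page with widths, which are expressed in units of a 12-column grid: here the filter gets 2 columns, the stacked line charts 6, and the map 4. Next, we include a filter_select that uses the SharedData sd.metro (discussed above). We set multiple equal to FALSE so that only one metro can be selected at a time.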
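The same pattern can be tried on a much smaller scale. This is a minimal sketch on made-up data, not the dashboard's code: the toy data frame, the metro names and the input id "toy_metro" are hypothetical. It links one filter box to one plotly line chart through a shared data object.

```r
library(crosstalk)
library(plotly)

# Toy data: three months of index values for two hypothetical metros.
toy <- data.frame(
  geo  = rep(c("Metro A", "Metro B"), each = 3),
  date = rep(as.Date(c("2016-07-01", "2016-08-01", "2016-09-01")), times = 2),
  hpi  = c(100, 101, 102, 120, 119, 121)
)

# Both the filter and the chart read from the same SharedData object.
sd.toy <- SharedData$new(toy, ~geo)

# Widths are in 12-column grid units: 3 for the filter box, 9 for the chart.
widget <- bscols(
  widths = c(3, 9),
  filter_select("toy_metro", "Select metro", sd.toy, ~geo, multiple = FALSE),
  plot_ly(sd.toy, x = ~date, y = ~hpi) %>% add_lines()
)
```

Printing widget in an interactive session, or leaving it as the last expression of a dashboard chunk, displays the filter box beside the chart.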

Animation

Our final page is an animated chart. At the time of writing (January 2017), animations require the development version of plotly for R. Install via:

devtools::install_github("ropensci/plotly")

The animation is pretty straightforward. Once again, we link the data through SharedData. In our plots, we include a frame argument inside of aes. Then we instruct plotly to animate the graphs:

g.map2<-
  ggplot(sd.metro, aes(x = long, y = lat,frame=geo,label=paste("\n\n  ",geo),color=geo)) +
  borders("state",  colour = "grey70",fill="lightgray",alpha=0.5)+
  theme_void()+
  theme(legend.position="none",
        plot.title=element_text(face="bold",size=18))+
  geom_point(alpha=0.5,size=1)+
  geom_text(hjust=0)+
  labs(title="House price trends around the U.S.")+
  theme(plot.caption=element_text(hjust=0))

p2<-
  ggplot(data=sd.metro,aes(x=date,y=hpa12))+
  #geom_point()+geom_segment(aes(xend=date,yend=0))+
  geom_line(aes(frame=geo,ids=date,label=geo,color=geo))+
  scale_y_continuous(labels=scales::percent)+
  geom_line(data=df.us,color="gray",linetype=2)+  # df.us: U.S.-level series; its construction is not shown in this post
  #scale_fill_viridis()+  scale_color_viridis()+
  theme_minimal()+labs(x="",y="Annual House Price Appreciation y/y %",title="Annual Price Growth")
#+    geom_text(data=d3.m[date==median(d3.m$date)],fontface="bold",y=16,size=8)

p3<-
  ggplot(data=sd.metro,aes(x=date,y=hpi))+
  geom_line(aes(frame=geo,ids=date,label=geo,color=geo))+
  theme_minimal()+labs(x="",y="House Price Index",title="House Price Index")+
  geom_line(data=df.us,color="gray",linetype=2)


subplot(subplot(ggplotly(p3),ggplotly(p2),nrows=2,titleY=T), ggplotly(g.map2),  nrows = 1, widths = c(0.35, 0.65), titleX = TRUE,titleY=T) %>%
  hide_legend() %>%
  animation_opts(2000,transition=500) %>% layout(title="House price tour, solid line metro, dotted line U.S.")

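To arrange the graphs, I use a nested call to plotly’s subplot function: the inner call stacks the two line charts (p3 and p2) in two rows, and the outer call places that stack beside the map, giving the two columns 35% and 65% of the width.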
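The frame idea is easier to see in isolation. The sketch below uses plot_ly directly rather than ggplot2 with ggplotly, so it is a simplified variant of the approach above, not the dashboard's code. The toy data and metro names are made up; the timing values mirror the animation_opts call above.

```r
library(plotly)

# Toy data: one short series per hypothetical metro.
toy <- data.frame(
  geo   = rep(c("Metro A", "Metro B", "Metro C"), each = 4),
  month = rep(1:4, times = 3),
  hpi   = c(100, 101, 103, 104,
            120, 119, 121, 124,
            90,  92,  95,  99)
)

# Each distinct value of the frame variable becomes one animation frame,
# so the chart steps through the metros one at a time.
p.toy <- plot_ly(toy, x = ~month, y = ~hpi, frame = ~geo) %>%
  add_lines() %>%
  animation_opts(frame = 2000, transition = 500)  # 2 s per frame, 0.5 s transition
```

Printing p.toy shows the chart with a play button and a slider for moving between metros.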

Putting it all together

Combining all these steps we can create the following dashboard:

You can see a fullscreen version here.

Next Steps

I’ve found flexdashboards to be a fun way to interact with data. Hopefully more htmlwidgets will be made compatible with crosstalk. But because plotly allows you to translate most ggplot graphs into widgets, there’s already a huge potential with what’s available.

Perhaps you will find flexdashboards to be something you would like to explore. Hopefully this guide helps by giving you a working example of several features.