In this post I’m going to generate some simple random variables. I’m also going to use some purrr tricks I picked up from Jenny Bryan’s excellent purrr tutorial to help manage our simulated data.
For this example, I’m just going to simulate random variables from a small number (N = 10) of distributions. From each of the N distributions I’m going to take 100 pseudorandom draws. Each distribution will be a normal distribution with a different mean and variance. We’ll draw the means from a standard normal distribution and the standard deviations from a uniform distribution. This will give us some “Normal” looking (see what I did there?) random variables with just enough variation to be visually interesting.
First, let’s load our libraries and draw some data.
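Here’s a minimal sketch of this step. The seed value and the U(0, 1) range for the standard deviations are my assumptions (the description above only says the standard deviations come from a uniform distribution), and I load only the one package the sketch needs.

```r
library(tibble)

set.seed(20170715)  # arbitrary seed (an assumption), only for reproducibility

N <- 10   # number of distributions
S <- 100  # draws per distribution

# One row of metadata per distribution
df <- tibble(
  id   = factor(1:N),  # label for each distribution
  S    = S,            # sample size, recycled to every row
  mean = rnorm(N),     # means drawn from a standard normal
  sd   = runif(N)      # sds drawn from U(0, 1); the range is an assumption
)

df
```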
Now we have some metadata, but what we’d like to do is generate samples for each distribution described in the table above.
This is a purrrfect time to use purrr and its powerful map() functions. We can generate a whole bunch of data by running the bit below:
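A minimal sketch of that step might look like the following. It rebuilds the metadata so it runs on its own; the seed, the U(0, 1) range for `sd`, and the column name `x` for the draws are illustrative assumptions.

```r
library(dplyr)
library(purrr)
library(tidyr)

set.seed(20170715)  # arbitrary seed, only for reproducibility

# Metadata: one row per distribution
df <- tibble::tibble(id = factor(1:10), S = 100, mean = rnorm(10), sd = runif(10))

df2 <- df %>%
  # pmap() walks the three columns in parallel and calls
  # rnorm(S, mean, sd) once per row, storing each result in a list-column
  mutate(x = pmap(list(S, mean, sd), rnorm)) %>%
  # unnest() unpacks the list-column into one row per draw
  unnest(cols = x)

dim(df2)
```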
In the code above I used the pmap() function with three arguments: S for the sample size, mean for the population mean, and sd for the population standard deviation. Before I called unnest(), the simulated draws were nested inside a single column of my data frame, one element per distribution. Using unnest() allowed me to unpack them into one row per draw. I ended up with 1,000 observations, corresponding to 100 draws from each of the 10 distributions.
Now we’re ready for some visualization.
Let’s make a joyplot using gradient shading for our distributions.
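Here’s a rough sketch of a gradient-shaded joyplot. I’m assuming the ggridges package (the successor to ggjoy) and its `geom_density_ridges_gradient()`; the simulated data are rebuilt in base R so the chunk stands on its own, and the `scale`, `rel_min_height`, and palette choices are just illustrative.

```r
library(ggplot2)
library(ggridges)  # assumption: ggridges rather than the older ggjoy

set.seed(20170715)  # arbitrary seed
mu    <- rnorm(10)  # true means
sigma <- runif(10)  # true standard deviations, U(0, 1) assumed

# 100 draws from each of the 10 distributions, in long format
df2 <- data.frame(
  id = factor(rep(1:10, each = 100)),
  x  = rnorm(1000, mean = rep(mu, each = 100), sd = rep(sigma, each = 100))
)

# fill = after_stat(x) shades each ridge by its position on the x axis
p <- ggplot(df2, aes(x = x, y = id, fill = after_stat(x))) +
  geom_density_ridges_gradient(scale = 2, rel_min_height = 0.01) +
  scale_fill_viridis_c(name = "x", option = "C") +
  theme_minimal() +
  labs(x = "Simulated value", y = "Distribution id")

p
```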
A note on forcats
It might be desirable to reorder these variables by something other than id. For example, we have the true mean saved in our data frame. We can sort the id factor levels by that mean using the forcats::fct_reorder() function (see the forcats tidyverse page). I’ve found it useful, so I’m posting this bit here (you’re welcome, future Len).
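As a small sketch of the reordering (the data are rebuilt here so the chunk runs on its own, and the new column name `id_sorted` is hypothetical):

```r
library(forcats)

set.seed(20170715)  # arbitrary seed
mu    <- rnorm(10)
sigma <- runif(10)

df2 <- data.frame(
  id   = factor(rep(1:10, each = 100)),
  mean = rep(mu, each = 100),  # true mean carried along with every draw
  x    = rnorm(1000, mean = rep(mu, each = 100), sd = rep(sigma, each = 100))
)

# Reorder the levels of id by the true mean, smallest first.
# fct_reorder() summarises with the median by default; mean is constant
# within each id, so the summary is just that id's true mean.
df2$id_sorted <- fct_reorder(df2$id, df2$mean)

levels(df2$id)         # original order: "1" to "10"
levels(df2$id_sorted)  # ordered by true mean
```

Mapping `y = id_sorted` instead of `y = id` in a plot then stacks the distributions from lowest to highest true mean.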
The joyplots are really cool, but there are other ways to show distributions, such as [beeswarm](https://github.com/eclarke/ggbeeswarm) plots. Let’s make one using our data.
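A minimal beeswarm sketch, assuming the ggbeeswarm package and its `geom_quasirandom()` layer. The data are rebuilt so the chunk is self-contained, and the styling is only illustrative.

```r
library(ggplot2)
library(ggbeeswarm)

set.seed(20170715)  # arbitrary seed
mu    <- rnorm(10)
sigma <- runif(10)

df2 <- data.frame(
  id = factor(rep(1:10, each = 100)),
  x  = rnorm(1000, mean = rep(mu, each = 100), sd = rep(sigma, each = 100))
)

# One swarm per distribution; coord_flip() puts the ids on the vertical
# axis so the layout matches the joyplot
p <- ggplot(df2, aes(x = id, y = x, color = id)) +
  geom_quasirandom(alpha = 0.6, size = 1) +
  coord_flip() +
  theme_minimal() +
  theme(legend.position = "none") +
  labs(x = "Distribution id", y = "Simulated value")

p
```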
But wait. If joyplots are cool, and beeswarm plots are hot, what do we get if we put them together? Something pretty awesome I think. And it’s super easy.
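One way to sketch the combination is to layer a beeswarm on top of the ridgelines. This assumes ggridges for the ridges and ggbeeswarm for the points, and it relies on `geom_quasirandom()` working out on its own that the categorical variable is on the y axis, so treat it as a starting point rather than the exact recipe.

```r
library(ggplot2)
library(ggridges)    # assumption: ggridges rather than the older ggjoy
library(ggbeeswarm)

set.seed(20170715)  # arbitrary seed
mu    <- rnorm(10)
sigma <- runif(10)

df2 <- data.frame(
  id = factor(rep(1:10, each = 100)),
  x  = rnorm(1000, mean = rep(mu, each = 100), sd = rep(sigma, each = 100))
)

p <- ggplot(df2, aes(x = x, y = id)) +
  # semi-transparent ridgelines, scaled so neighbours barely overlap
  geom_density_ridges(alpha = 0.5, scale = 0.9) +
  # swarm of the raw draws, spread around each ridge's baseline
  geom_quasirandom(alpha = 0.4, size = 0.8, width = 0.2) +
  theme_minimal() +
  labs(x = "Simulated value", y = "Distribution id")

p
```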
And of course, they’re kind of fun to animate. Like so:
Joyswarms in the wild
These could be useful out in the wild. I’ve been experimenting with some real world data (check my Twitter feed). And in the future, maybe I’ll share some examples here.
That’s a great candidate for a joyswarm plot. Let’s make one.
We’ll use the approach I outlined here to get the data. Let’s download the data and plot a time series.
These data are not seasonally adjusted, so you can see the pronounced seasonal variation. You can also see large volatility around the Great Recession. Let’s first create a joyplot using month on the Y axis. We’ll also add dots showing the 2017 values.
Here we can see that the 2017 values are quite low by historical standards.
Let’s try a joyswarm:
It’s hard to beat good ol’ boxplots
Though I like joyplots, beeswarm plots, and joyswarm plots a lot, when it comes to clarity it’s hard to beat good ol’ boxplots (see below). However, these plots do create some visual interest and help stimulate thinking about the data.
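For comparison, here’s a minimal boxplot sketch. It uses the simulated distributions from the first half of the post rather than the real-world series, and the seed and styling are assumptions.

```r
library(ggplot2)

set.seed(20170715)  # arbitrary seed
mu    <- rnorm(10)
sigma <- runif(10)

df2 <- data.frame(
  id = factor(rep(1:10, each = 100)),
  x  = rnorm(1000, mean = rep(mu, each = 100), sd = rep(sigma, each = 100))
)

# One box per distribution, flipped to match the joyplot layout
p <- ggplot(df2, aes(x = id, y = x)) +
  geom_boxplot(fill = "grey85") +
  coord_flip() +
  theme_minimal() +
  labs(x = "Distribution id", y = "Simulated value")

p
```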