In the last few weeks I have been quite immersed in data visualization, trying to understand how it can be used to turn data into insights. As part of my immersion, I have played with Fusion Tables and Google Analytics, and also other ideas that will come to light in the future... As I wrote in the Fusion Tables article, I think everyone secretly wishes to do crazy visualizations with Google Analytics data sometimes, either because it can be very insightful or just because it is incredibly fun :-)
And here I am again, with another custom visualization! But this time I decided to use the R programming language, which is considered to be one of the best options when it comes to statistical data visualization.
As I looked deeper into R, I tried to understand what kind of visualizations would complement Google Analytics (GA), i.e. what we can get out of R that we can't currently get out of GA. My first idea was to create a visualization that would allow me to look at my top 5 US states by number of visits (or countries if you wish) and see how they are performing side by side. In addition, I wanted to see how Christmas and a TV campaign affected the behavior across US states. While this is possible to understand using Google Analytics, I believe it would not be possible to visualize it in such a way.
Once I found this interesting use case, I decided to take my artistic capabilities out of the rusty box and sketch the output I was looking for... and here is what I got.
With this objective in mind, I rolled up my sleeves and started working... Below is a step-by-step guide on how to build a very similar visualization using your own Google Analytics data.
If you know your way around R, you can simply download this commented txt file.
Important: please note that while I try to describe the process in as much detail as possible, an introduction to R is highly recommended. If you have some time to invest, try the Computing for Data Analysis Coursera course, or just watch the YouTube playlist Intro to R. I am also providing a list of helpful books at the end of the article.
Installing R, the Google Analytics package and others
If you are completely new to R, you will first need to download R and follow the instructions to install it. After you do that, I recommend you also install R Studio, a great tool for you to write and visualize R code.
Next, download the Google Analytics package into your R workspace (below I am using version 1.4). If you don't know where your workspace is, just type the line below into your console.
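The line referred to here did not survive in this copy of the article. In base R, the current working directory can be printed with getwd(); treating "workspace" and "working directory" as the same thing is my assumption, and the variable name below is only for illustration.

```r
# Ask R for the current working directory. This is presumably the folder
# where the downloaded package file should be saved (an assumption).
workspace <- getwd()
print(workspace)
```

If the folder is not the one you expected, setwd() changes it for the current session.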
Enter the following lines into R to install and load the respective packages; they are necessary for this visualization.
install.packages(c("RCurl", "rjson", "RGoogleAnalytics", "ggplot2", "plyr", "gridExtra", "reshape"))
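Installing a package does not load it, so each one still has to be attached in every new session. Here is a minimal sketch of that loading step, assuming the installation above went through. The loop and the variable names are mine and not part of the original script.

```r
# Packages used in this article, as listed in the install step.
pkgs <- c("RCurl", "rjson", "RGoogleAnalytics", "ggplot2",
          "plyr", "gridExtra", "reshape")

# require() attaches a package and returns TRUE, or returns FALSE
# (instead of stopping) when the package is not installed.
loaded <- sapply(pkgs, function(p) {
  suppressWarnings(suppressPackageStartupMessages(
    require(p, character.only = TRUE, quietly = TRUE)
  ))
})

# Report anything that still needs to be installed.
missing_pkgs <- names(loaded)[!loaded]
if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
}
```

Checking the result this way tells you up front which package is missing, rather than finding out halfway through the later steps.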
Getting the data and preparing it for visualization
Step 1. Authorize your account and paste the access token. You will be asked to paste it in the console after you run the second line below.
Step 2. Initialize the configuration object - execute one line at a time.
Step 3. Check the ga.account and ga.webProperty lists above and populate the numbers inside [ ] (i.e., substitute 9 and 287) below with the account and profile index you want (the index is the first number in each line of the R console). Then, get the webProfile index from the list below and use it to populate the first line of step 5.
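If indexing with [ ] is new to you, this small sketch shows the idea on a made-up table. The accounts data frame, its values and the name chosen_id are all hypothetical; the real lists come from the API.

```r
# Made-up account list standing in for the one returned by the API.
accounts <- data.frame(
  id = c("1111", "2222", "3333"),
  name = c("Blog", "Store", "Test site"),
  stringsAsFactors = FALSE
)

# When printed, the number at the start of each line is the row index.
print(accounts)

# Putting that index inside [ ] picks the matching id, here the second row.
chosen_id <- accounts$id[2]
print(chosen_id)
```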
Step 4. Create a new Google Analytics API object.
Step 5. Set up the input parameters - here you should think deeply about your analysis time range, the dimensions (note that in order to do a line chart for a time series you must add the "ga:date" dimension), metrics, filters, segments, how the data is sorted and the number of results.
Step 6. Build the query string and select the profile by setting its index value.
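query$Init(start.date = "2013-12-08",
           end.date = "2014-02-15",
           dimensions = "ga:date, ga:region",
           metrics = "ga:visits, ga:avgTimeOnSite, ga:transactions",
           sort = "ga:date, -ga:visits",
           max.results = 10000,
           table.id = paste("ga:", ga.webProfile$id, sep = "", collapse = ","),
           # The call is cut off at this point in this copy of the article.
           # Closing it with the token from Step 1 is an assumption, and
           # access_token is a hypothetical variable name.
           access_token = access_token)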
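A quick sanity check on the parameters in the query above: with both "ga:date" and "ga:region" as dimensions, the result has at most one row per day and region, so it is worth checking how that compares with max.results. The sketch below uses base R only and the variable names are mine.

```r
# Values copied from the query above.
start_date  <- as.Date("2013-12-08")
end_date    <- as.Date("2014-02-15")
max_results <- 10000

# Number of days in the range, counting both endpoints.
n_days <- as.integer(end_date - start_date) + 1

# How many regions per day fit under the row cap.
regions_per_day <- floor(max_results / n_days)

print(n_days)           # 70
print(regions_per_day)  # 142
```

If your site gets visits from more regions per day than that, some rows will be cut off and the time range or the cap needs adjusting.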
Step 7. Make a request to get the data from the API.
Step 8. Check your data - head() will return the first few lines of the table.
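Here is a small sketch of what this check can look like, using a made-up data frame in place of the real result. The name ga.data, the column names and the YYYYMMDD date strings are assumptions about what the query returns.

```r
# Made-up stand-in for the API result (ga.data is a hypothetical name).
ga.data <- data.frame(
  date = c("20131208", "20131208", "20131209", "20131209"),
  region = c("California", "Texas", "California", "Texas"),
  visits = c(120, 95, 130, 90),
  stringsAsFactors = FALSE
)

# First rows of the table.
print(head(ga.data, 3))

# The dates are text here, so convert them before plotting a time series.
ga.data$date <- as.Date(ga.data$date, format = "%Y%m%d")
print(class(ga.data$date))
```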
Step 9. Clean the data - removing all (not set) rows.
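One way to do this cleaning in base R is sketched below on made-up data. It is not necessarily the approach used in the original script, and the names ga.data and ga.clean are hypothetical.

```r
# Made-up stand-in for the API result, including two "(not set)" rows.
ga.data <- data.frame(
  date = rep(c("20131208", "20131209"), each = 3),
  region = rep(c("California", "New York", "(not set)"), times = 2),
  visits = c(120, 95, 10, 130, 90, 12),
  stringsAsFactors = FALSE
)

# Keep every row whose region is not the literal text "(not set)".
# A plain comparison avoids the parentheses being read as a regex group,
# which is what would happen with grep() unless fixed = TRUE is set.
ga.clean <- ga.data[ga.data$region != "(not set)", ]

print(ga.clean)
```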
Step 10. Choose your data - get the data for the specific states (or countries) that you want to analyze. Notice that I am using only the top 5 states, as I think more than that would be a bit too much to visualize, but it is up to you.
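If you would rather let R find the top 5 than pick them by hand, the sketch below totals visits per region and keeps the five largest. The data is made up, the names ga.clean, totals and top5 are hypothetical, and this is one possible approach rather than the original script's.

```r
# Made-up cleaned data with several rows per region.
ga.clean <- data.frame(
  region = c("California", "Texas", "New York", "Florida",
             "Illinois", "Ohio", "California", "Texas"),
  visits = c(500, 300, 450, 200, 150, 100, 520, 310),
  stringsAsFactors = FALSE
)

# Total visits per region over the whole period.
totals <- aggregate(visits ~ region, data = ga.clean, FUN = sum)

# Sort from most to least visits and keep the first five region names.
totals <- totals[order(-totals$visits), ]
top5 <- as.character(head(totals$region, 5))

print(top5)
```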
Step 11. Build the final table containing only the states (or countries) you want.
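A minimal sketch of this filtering step, again on made-up data and with hypothetical names (top5, ga.clean, final). Turning the region into a factor ordered by rank is my own addition; it makes legends and panels follow the ranking instead of the alphabet.

```r
# The regions chosen in the previous step (hard-coded here for the sketch).
top5 <- c("California", "Texas", "New York", "Florida", "Illinois")

# Made-up cleaned data that also contains regions outside the top 5.
ga.clean <- data.frame(
  region = c("Texas", "Ohio", "California", "Florida",
             "Nevada", "New York", "Illinois"),
  visits = c(300, 100, 500, 200, 80, 450, 150),
  stringsAsFactors = FALSE
)

# Keep only the rows for the chosen regions.
final <- ga.clean[ga.clean$region %in% top5, ]

# Order the factor levels by rank so charts list the regions in that order.
final$region <- factor(final$region, levels = top5)

print(final[order(final$region), ])
```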
Building the visualization: legends and line charts
Step 12. Build the special campaign bars and legend (in this case Christmas and Campaign)
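Before any bar can be drawn, each event needs a start and an end date. The sketch below covers only that data side, in base R. The event dates are hypothetical placeholders, not the ones used for the chart in this article, and the names events, days and label are mine.

```r
# Hypothetical event periods; replace the dates with your own.
events <- data.frame(
  event = c("Christmas", "Campaign"),
  start = as.Date(c("2013-12-24", "2014-01-20")),
  end   = as.Date(c("2013-12-26", "2014-01-27")),
  stringsAsFactors = FALSE
)

# Every day in the analysis range used by the query.
days <- seq(as.Date("2013-12-08"), as.Date("2014-02-15"), by = "day")

# Label each day with the event it falls in, or NA if it falls in none.
label <- rep(NA_character_, length(days))
for (i in seq_len(nrow(events))) {
  in_period <- days >= events$start[i] & days <= events$end[i]
  label[in_period] <- events$event[i]
}

# Number of days covered by each event.
print(table(label))
```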
Step 13. Build the chart legend and axis.
Step 14. Build the charts!
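To give an idea of what such a chart involves, here is a sketch of a single panel in ggplot2: one line per state, with shaded bars for the events behind it. The data is randomly generated, the event dates are hypothetical, and the styling is mine, so treat it as a starting point rather than the code behind the chart in this article.

```r
library(ggplot2)

# Randomly generated daily visits for two states (not real data).
set.seed(1)
days <- seq(as.Date("2013-12-08"), as.Date("2014-02-15"), by = "day")
final <- data.frame(
  date = rep(days, times = 2),
  region = rep(c("California", "Texas"), each = length(days)),
  visits = c(round(500 + rnorm(length(days), sd = 40)),
             round(300 + rnorm(length(days), sd = 30)))
)

# Hypothetical event periods, drawn as shaded bars behind the lines.
events <- data.frame(
  event = c("Christmas", "Campaign"),
  start = as.Date(c("2013-12-24", "2014-01-20")),
  end   = as.Date(c("2013-12-26", "2014-01-27"))
)

p <- ggplot() +
  # Bars span the full height of the panel for each event period.
  geom_rect(data = events,
            aes(xmin = start, xmax = end, ymin = -Inf, ymax = Inf,
                fill = event),
            alpha = 0.3) +
  # One line per state.
  geom_line(data = final, aes(x = date, y = visits, colour = region)) +
  labs(x = "Date", y = "Visits", colour = "State", fill = "Event") +
  theme_minimal()

print(p)
```

Repeating the same pattern for each metric gives one panel per metric, which avoids forcing metrics with very different scales onto a single axis.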
Phew! Here is the chart you should get!
The chart is not exactly what I initially had in mind, because having all metrics in one chart was a bit problematic: the scales were too different and we would barely see the transactions line. But I like it this way :-)
If you already use R to analyze and visualize Google Analytics data, send us an email; we would love to publish other examples.
Books to learn R
- Learning R
- R Graphics Cookbook
- ggplot2: Elegant Graphics for Data Analysis
- Discovering Statistics Using R