A Case Study In The Application Of Statistics & Probability In Web Analytics

Explaining what happened in the past is a common task for most web analytics professionals. While some will argue that web analytics is like driving by watching the rear-view mirror, understanding where problems lie is a key step to driving improvements. The presentation, the Case of the Missing Ring, is a case study that focuses on explaining the slow and gradual decline in traffic to the contemporary jewellery website DefiniteStyle, and in doing so outlines a number of critical concepts that web analytics professionals must integrate into their day-to-day practices in order to be effective. These include:
- A basic understanding of probability and statistical concepts.
- Gathering data from multiple sources.
- Applying an iterative approach to analysis.
- Trying to prove yourself wrong in order to get closer to the truth.
Understanding Probability and Statistical Concepts

While I don't have a large enough sample size to assert that statistical literacy is lacking in the web analytics profession, my anecdotal experience is that few professionals working in web or digital analytics have sufficient statistical skills to do justice to our profession.
Challenge #1: How do you answer when asked "what is the average bounce rate?" or "is a 2% conversion rate good?" In my experience, these are very common questions from our clients and prospects.

Challenge #2: If you have ever used the term "statistically significant", can you simply explain the concepts of confidence level and confidence interval? Think carefully before reading on.

The web/digital analytics industry is still young, very young in the scheme of things. WebTrends was founded in 1995 and a few tools predate this by a couple of years, but compared with the discipline of statistics, with its hundreds of years of development, there is so much that web analytics professionals can learn. Statistics helps us to understand the past and to separate what may be important from what is not. From this we can make forecasts and estimate the probability that a change we make will have a positive or negative impact.

In the first challenge above, both questions are significantly flawed. Firstly, to address any question that involves an average, we must understand how that average is constituted, including its variance and composition, best described through the concepts of the standard deviation and probability distribution. Secondly, both questions imply further questions about the population we are studying. The average bounce rate for visitors looking for contact details would be far higher than for those looking for product information. There are many more subtleties that influence how an analyst should respond to these questions. The sad reality is that, more often than not, the answer given is something like "35% bounce rate is good" or "2% conversion is acceptable", without any significant thought about the actual question that needs to be answered, and therefore about the accuracy of the answer.
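To make both challenges concrete, here is a minimal Python sketch. All visit counts are invented for illustration, and the function names `blended_rate` and `rate_confidence_interval` are hypothetical helpers, not part of any analytics tool. It shows how a single "average" bounce rate can hide very different segments, and how the confidence interval around the same 2% conversion rate depends on how many visits sit behind it. The interval uses the simple normal approximation, which is only a rough guide for small samples or very low rates.

```python
import math


def blended_rate(segments):
    """Overall rate across segments given as {name: (visits, events)}."""
    total_visits = sum(visits for visits, _ in segments.values())
    total_events = sum(events for _, events in segments.values())
    return total_events / total_visits


def rate_confidence_interval(events, visits, z=1.96):
    """Normal-approximation interval for a proportion.

    z=1.96 corresponds to a confidence level of roughly 95%.
    """
    p = events / visits
    margin = z * math.sqrt(p * (1 - p) / visits)
    return max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical bounce data: (visits, bounces) per visitor intent.
bounce_segments = {
    "contact_details": (2_000, 1_500),  # 75% bounce within this segment
    "product_info": (8_000, 2_000),     # 25% bounce within this segment
}
overall_bounce = blended_rate(bounce_segments)

# The same 2% conversion rate observed on two hypothetical sample sizes.
small_low, small_high = rate_confidence_interval(events=10, visits=500)
large_low, large_high = rate_confidence_interval(events=1_000, visits=50_000)

print(f"Overall bounce rate: {overall_bounce:.1%}")
print(f"2% on 500 visits:    {small_low:.2%} to {small_high:.2%}")
print(f"2% on 50,000 visits: {large_low:.2%} to {large_high:.2%}")
```

With these made-up numbers the blended bounce rate comes out at 35%, even though neither segment behaves anything like 35%. The interval around 2% is also far wider for 500 visits than for 50,000, which is why "is 2% good?" cannot be answered without knowing the population and the sample behind the number.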