Are log scales appropriate for COVID-19 charts?

You may have noticed that many media outlets are illustrating the tragic course of the coronavirus (COVID-19) pandemic by employing charts with log scales. These log values refer, of course, to the mathematical concept of logarithms. This is something that most of us learn about in school when we are taught that a logarithm is the power to which a number must be raised in order to get some other number. If, for example, we raise the value 10 to the power of 2, we get 100. Therefore, the ‘base 10’ log of 100 is 2. If this all seems rather […]
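As a minimal sketch of the logarithm definition in this excerpt, the Python snippet below checks that the base 10 log of 100 is 2 and then takes logs of a short series. The `counts` list is invented purely for illustration and is not real COVID-19 data.

```python
import math

# 10 raised to the power of 2 gives 100, so the base 10 log of 100 is 2.
value = 10 ** 2
log_value = math.log10(value)

# Hypothetical series (an assumption, not real data) that grows
# tenfold at each step.
counts = [10, 100, 1000, 10000]
log_counts = [math.log10(c) for c in counts]

# On a log scale, each tenfold increase becomes one equal-sized step.
steps = [b - a for a, b in zip(log_counts, log_counts[1:])]

print(log_value)
print(log_counts)
print(steps)
```

This is why a log-scaled axis shows constant multiplicative growth as a straight line rather than a steepening curve.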

An overview of the four main approaches to predictive analytics

This infographic provides an overview of the four main families of approaches to predictive analytics. Prediction encompasses applications that aim to estimate or predict the values of a key target field. Segmentation refers to techniques such as cluster analysis which attempt to find the most ‘naturally occurring’ groups within a dataset. Association modelling discovers groups of categories that are likely to co-occur, such as items in a shopping basket. Forecasting refers to methods that extrapolate time trends, such as sales of products, so businesses can anticipate future demand. Each family is further detailed with colour-coded examples of the popular […]

Statistics in court: the story of a dataset

Like a lot of consultants working in the analytics industry, I’ve built up an extensive portfolio of materials to illustrate different kinds of applications and approaches. Some of these consist of files and slide decks used to explain quite esoteric procedures such as TURF analysis or Partial Least Squares. However, there are certain materials that can be used to demonstrate such a wide range of statistical and predictive analytics techniques that I’ve found myself reaching for them again and again over the years. One of these is the SPSS Statistics sample dataset ‘Employee data.sav’. Most statistical software programs come […]

7 things you need to know about key driver analysis (KDA)

In most businesses it’s not enough to simply be measuring outcomes like customer satisfaction, sales, customer churn rates, subscription renewals, customer loyalty, cancellation rates and so on. To gain competitive advantage you also need to know what’s driving those outcomes. Which aspects of the service you provide most influence how likely someone is to renew their subscription at the end of the year? Which factors most drive recommendations? Key driver analysis (KDA) can help you to answer these kinds of questions. KDA enables you to look for relationships between aspects of customers’ attitudes, needs and behaviours that you’re interested in, […]

What do we mean when we talk about data modelling? An overview of different types of models

The real world, whether the physical world (for example, machines) or the natural world (for example, human and animal behaviour), is very complex, with many factors, some unknown, determining its behaviour and responses to interventions. Even if every contributory factor to a phenomenon is known, it is unrealistic to expect that the unique contribution of each factor to the phenomenon can be isolated and quantified. Thus, mathematical models are simplified representations of reality, but to be useful they must give realistic results and reveal meaningful insights. In his 1976 paper ‘Science and Statistics’ in the Journal of the […]

What is a chi-squared test and when would you use it?

Take a look at the table below. It describes a relatively common situation in business analytics. Two offers have been made to a sample of 40,000 prospective readers of a magazine. As an experiment, half of the prospects have been offered a 25% discount for the first year and the other half have been offered an extended subscription of 15 months (rather than the normal 12 months). The table seems to indicate a slight increase in the response rate (a mere 0.4%) for those offered the extended subscription. The business analysts want to know how probable it is that this […]
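The table referred to in this excerpt is not reproduced here, so the sketch below uses invented counts that merely match the description: 20,000 prospects per offer and a 0.4 percentage point gap in response rate. It computes Pearson's chi-squared statistic for a 2x2 table by hand, without a continuity correction, using only the Python standard library. All counts and variable names are assumptions for illustration.

```python
import math

# Hypothetical counts (NOT from the original table):
# rows = offer, columns = (responded, did not respond).
observed = [
    [600, 19400],  # 25% discount: 3.0% response (assumed)
    [680, 19320],  # 15-month subscription: 3.4% response (assumed)
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson's chi-squared: sum of (observed - expected)^2 / expected,
# where expected counts assume response is independent of the offer.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# A 2x2 table has 1 degree of freedom; for that case the upper-tail
# probability of the chi-squared distribution is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-squared = {chi2:.3f}, p = {p_value:.4f}")
```

A small p-value would suggest that a gap of this size is unlikely to arise by chance alone if the two offers really performed equally well. With different counts, the conclusion could easily differ.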

What is correlation and why is it useful?

What is correlation? Correlation is a term that we employ in everyday speech to denote things that appear to have a mutual relationship. In the world of analytics, correlations are specific values that are calculated in order to quantify the relationships between variables. This kind of analysis is powerful because it allows us to measure the association between factors such as advertising spend and website hits, product sales and competitor pricing, Net Promoter Score and customer discount, ambient temperature and component part failure. Not only can we measure this relationship but we can also use one variable to predict the other. For example, […]
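To make the idea of a calculated correlation value concrete, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The excerpt does not say which correlation measure the post goes on to use, so Pearson is an assumption, and the advertising spend and website hit figures are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (an assumption, not from the article).
ad_spend = [10, 20, 30, 40, 50]
web_hits = [120, 190, 310, 405, 480]

r = pearson_r(ad_spend, web_hits)
print(f"r = {r:.3f}")
```

Values of r close to +1 or -1 indicate a strong linear relationship, while values near 0 indicate little or no linear association.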

What do your customers care about most? Using key driver analysis to find out.

After basic significance tests (T-tests, Z-tests and so on), key driver analysis (KDA) is probably the second most popular statistically-based technique in market research. Given an outcome of interest, a KDA gives us a measure of the relative importance of a set of attributes (potential drivers). Typical outcomes of interest in research are: satisfaction (customer, employee etc.); purchase intent (how likely a customer is to make a purchase); how likely a respondent is to recommend a product or service (sometimes known as a Net Promoter Score); and intent to switch or cancel. This isn’t an exhaustive list and outcomes can be stated or observed (through other data sources like transactional databases) where […]

What makes for good data visualisation?

Data visualisation is a hot topic at the moment. And with good reason: a picture paints a thousand words … and the better ones can convey clearer meaning than a similar volume of numbers. There is also an ever-growing list of charts and infographics, both in the public domain and in research deliverables. They are not totally new, of course, as the Guardian newspaper demonstrated a while back with some historical examples. However, not all of today’s visualisations achieve their analytical/informational objectives. In order for data visualisation to be effective, it is important that we keep sight of some long-held principles about the […]