# Statistical techniques

## Working with R in SPSS Part 1: The Basics

In this four-part blog series, we’re going to take a look at working with R in SPSS. We’ll begin by exploring the basics of running R procedures in SPSS syntax and end up showing you how to create your own SPSS custom dialogs based on R code. We’re able to run R code in SPSS …

## Working with R in SPSS Part 2 – Working with R Packages

In the previous blog post, we looked at the basics of running R procedures in SPSS syntax. In this post, we’re going to explore how to work with R packages. Packages are collections of functions and pre-built compiled code that enable R users to carry out a vast range of analytical and data manipulation tasks. …

## Working with R in SPSS Part 3 – Creating Custom Dialogs

Previously, we explored how to install and load an R package to generate correlograms by executing R code within SPSS syntax. In this post, we’re going to look at how we can build a custom dialog in SPSS Statistics that encapsulates much of our previously created code and allows us to share the procedure with …

## Working with R in SPSS Part 4 – Adding options to Custom Dialogs

In the previous blog post, we created a basic custom dialog using R code in SPSS Statistics that allowed us to generate a visual representation of a correlation matrix i.e., a correlogram. In this final posting, we will see how to add a number of additional elements that control the styling and appearance of the …

## What is correlation and when is it useful?

Correlation is a term that we employ in everyday speech to denote things that appear to have a mutual relationship. In the world of analytics, correlations are specific values that are calculated in order to quantify the relationships between variables. This kind of analysis is powerful because it allows us to measure the association between factors such as …

## What is a Chi Square test and when would you use it?

The chi-squared test is used to examine differences between categorical fields and to determine whether those differences are statistically significant.

## What is the difference between the various types of statistical models?

The real world, whether it be the physical world (for example, machines) or the natural world (for example, human and animal behaviour), is very complex, with many factors, some unknown, determining behaviour and responses to interventions. Even if every contributory factor to a phenomenon is known, it is unrealistic to expect that the unique …

## How do I choose the correct statistical test?

When you’re conducting any kind of statistical analysis, it’s vital that you select the correct tests to perform, given the characteristics of your data and the analytics outcomes that you’re hoping for. If you don’t choose the right tests then the results you generate can be meaningless and this can lead to business decisions being …

## Understanding correlation

This is the latest in our ‘eat your greens’ series – a back to basics look at core statistical concepts that are often misunderstood or misapplied. In everyday speech the term ‘correlation’ refers to a mutual connection or relationship between two things. In statistics correlations are specific measures or values that attempt to quantify the …

## The gateway to inference – the standard error and confidence intervals

This is the fifth post in our ‘eat your greens’ series – a back to basics look at some of the core concepts of statistics and analytics that, in our experience, are frequently misunderstood or misapplied. In this post we’ll look in more depth at the concept of the standard error and confidence intervals. In …

## Finding normality – why is the normal distribution so important when we so rarely encounter it in real life?

This is the fourth post in our ‘eat your greens’ series – a back to basics look at some of the core concepts of statistics and analytics that, in our experience, are frequently misunderstood or misapplied. In this post we’ll look in more depth at the concept of the normal distribution. One of the first …

## Are algorithms evil?

None of us can have failed to notice the recent debacle over Ofqual’s (the Office of Qualifications and Examinations Regulation) use of an algorithm to predict pupil grades. Once again, ‘algorithm fever’ has generated a flurry of news articles questioning whether we are sleepwalking into a dystopian future where human expert decision-making is replaced with …

## What’s ‘standard’ about a standard deviation?

This is the third post in our ‘eat your greens’ series – a back to basics look at some of the core concepts of statistics and analytics that, in our experience, are frequently misunderstood or misapplied. In this post we’ll look in more depth at the concept of the standard deviation. I’m often struck by …

## Testing versus inferring

This is the second post in our ‘eat your greens’ series – a back to basics look at some of the core concepts of statistics and analytics that, in our experience, are frequently misunderstood or misapplied. In this post we’ll look in more depth at the concept of testing versus inferring. One of the most daunting …

## Are Log scales appropriate for COVID-19 Charts?

You may have noticed that many media outlets are illustrating the tragic course of the coronavirus (COVID-19) pandemic by employing charts with log scales. These log values refer of course to the mathematical concept of logarithms. This is something that most of us learn about in school when we are taught that a logarithm is …

## An overview of the four main approaches to predictive analytics

This infographic provides an overview of the four main families of approaches to predictive analytics. Prediction encompasses applications that aim to estimate or predict the values of a key target field. Segmentation refers to techniques such as cluster analysis which attempt to find the most ‘naturally occurring’ groups within a dataset. Association modelling discovers groups …

## Statistics in court: the story of a dataset

Like a lot of consultants working in the analytics industry, I’ve built up an extensive portfolio of materials to illustrate different kinds of applications and approaches. Some of these consist of files and slide decks used to explain quite esoteric procedures such as TURF analysis or Partial Least Squares. However, there are certain materials that …

## 7 things you need to know about key driver analysis (KDA)

In most businesses it’s not enough to simply be measuring outcomes like customer satisfaction, sales, customer churn rates, subscription renewals, customer loyalty, cancellation rates and so on. To gain competitive advantage you also need to know what’s driving those outcomes. Which aspects of the service you provide most influence how likely someone is to renew …

## What do we mean when we talk about data modelling? An overview of different types of models

The real world, whether it be the physical world (for example, machines) or the natural world (for example, human and animal behaviour), is very complex, with many factors, some unknown, determining behaviour and responses to interventions. Even if every contributory factor to a phenomenon is known, it is unrealistic to expect that the unique …

## What is a chi-squared test and when would you use it?

An in-depth guide to using the chi-squared test to determine statistical significance.

## What is correlation and why is it useful?

What is correlation? Correlation is a term that we employ in everyday speech to denote things that appear to have a mutual relationship. In the world of analytics, correlations are specific values that are calculated in order to quantify the relationships between variables. This kind of analysis is powerful because it allows us to measure the association between …

## What do your customers care about most? Using key driver analysis to find out.

After basic significance tests, T-tests, Z-tests and so on, key driver analysis (KDA) is probably the second most popular statistically based technique in market research. Given an outcome of interest, a KDA gives us a measure of the relative importance of a set of attributes (potential drivers). Typical outcomes of interest in research are: Satisfaction – customer, employee etc. Purchase intent – how …

## What makes for good data visualisation?

Data visualisation is a hot topic at the moment. And with good reason: a picture paints a thousand words … and the better ones can convey clearer meaning than a similar volume of numbers. There is also an ever-growing list of charts and infographics, both in the public domain and in research deliverables. They are not totally new …