Statistics in court: the story of a dataset

Like a lot of consultants working in the analytics industry, I’ve built up an extensive portfolio of materials to illustrate different kinds of applications and approaches. Some of these consist of files and slide decks used to explain quite esoteric procedures such as TURF analysis or Partial Least Squares. However, there are certain materials that can be used to demonstrate such a wide number of statistical and predictive analytics techniques, that I’ve found myself immediately reaching for them again and again over the years. One of these is the SPSS Statistics sample dataset ‘Employee data.sav’. Most statistical software programs come […]

6 secrets of building better models part one: bootstrap aggregation

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part two: boosting

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part three: feature engineering

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part four: ensemble modelling

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part five: meta models

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part six: split models

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

What’s new in IBM SPSS Statistics v26?

In April of this year, IBM released the latest version of SPSS Statistics. Version 26 introduces a number of additional analysis procedures as well as new command enhancements. If you’re an existing SPSS user and you’d like to upgrade to v26 there’s more information about how to do that here. If you’re interested in trying SPSS Statistics for the first time then do please get in touch – we’ll be happy to help. New analytical procedures: Quantile Regression. In standard ‘least squares’ regression the model predictions are based on a single regression line. This line can be used to estimate the […]

Expert insight: Clifford Budge, Head of Ecommerce Data Science, major toy manufacturer

Lorna: Please could you start just by talking a little bit about your own background and how you’ve come to be working in data?
Cliff: I went to university to do a mapping science degree, during which time I had a year out in industry. I worked as a surveyor for a company that was laying the cables for the first optical fibre telecommunications network in the UK. Within about six weeks of starting the job, I got access to the marketing data and thought I could use my mapping skills to create a visual representation of the sales and […]

I work for an analytics company, but I don’t want to talk about analytics

Here is an interesting confession. I work for an analytics company but I would really rather not think about or talk about analytics. Let me explain. I once worked with an operations director. He had a great team of analysts. He valued their work and was interested in the information they gave him but the mention of algorithms or spreadsheets made his eyes glaze over. When I met him for the first time he happened to be in the room where we were meeting his team. The technically-minded contingent quickly got caught up in fast-paced conversation. There was a feeling […]

Regular Expressions for IBM SPSS Modeler: performance comparison

The Regular Expressions for IBM SPSS Modeler node pack provides 4 nodes that integrate the power and flexibility of regular expression pattern matching into SPSS Modeler. However, some of these capabilities can be supported using the extension nodes built into SPSS Modeler and that begs the question – why buy the Regular Expression nodes? One obvious answer is ease of use. The extension nodes built into SPSS Modeler require expertise in either R or Python programming languages since they are general “code” nodes. Although many data scientists may already have that expertise, most people use SPSS Modeler because of its […]

Don’t overlook the value of quick wins in your analytics project

I was talking to a potential client recently who was frustrated by the lack of ‘real analysis’ happening in his organisation. He had a team of analysts working for him, most of whom were using Excel when he really wanted them to be using Python. Although we don’t generally recommend Excel for advanced analytics – see my earlier blog on this topic – it’s still a useful tool that can generate real insight and is heavily in use in many organisations. Quick wins are important. This idea that ‘real analytics’ only happens if you’re using what are, for many people, […]

A first look at SPSS Modeler v18.2

In this video Jarlath Quinn takes a first look at SPSS Modeler v18.2 and demonstrates some of the new functionality that’s included within this release. IBM® SPSS® Modeler adds the following features in this release. New look and feel. A new modern interface theme is available via Tools > User Options > Display. For instructions on switching to the new theme. New data views. You can now right-click a data node and select View Data to examine and refine your data in new ways with advanced data visualizations. IBM Data Warehouse. Database modeling with IBM Netezza Analytics now supports IBM Data Warehouse. Gaussian Mixture node. A new Gaussian Mixture node is available on […]

How to get the most out of your SPSS Statistics free trial

Smart Vision offer a free copy of SPSS Statistics that will give you access to the software for a trial period of two weeks. This video will talk you through how to download and install the software before providing a series of examples of procedures that you might want to try out yourself in order to explore SPSS. You should be aware that the trial software includes all of the SPSS Statistics optional add-on modules and you can find out more about these modules and see a two-minute overview of each one elsewhere on this site. Once installed […]

Expert insights: Major Lester, founder of SPSS UK talks about fifty years of SPSS

This year SPSS is 50 years old. Development started in 1965 by a team of political scientists, frustrated at how much time they had to spend on manual data cleaning before they could start their analysis. The first release was in 1968 and by the end of the 1960s SPSS was in use in over 60 universities. SPSS expanded into the UK in 1985 and Major Lester was its first employee. We spoke to him about the history of SPSS, why it’s been so successful and how he sees its future. Lorna: Could you start off just by talking to […]

Three questions to ask when reading articles about artificial intelligence

You may have noticed by now that there seem to be a couple of recurring themes in the plethora of articles and news programmes about artificial intelligence (AI). These themes can be summed up as a) “The dangers of AI” and b) “The limitations of AI”. Articles addressing the dangers of AI tend to focus on issues such as the threat of widespread job losses to AI, the possibility of inherent bias (such as racism and sexism), the lack of transparency in decisions made by AI systems and, as a result, the inability to plead your case with AI (“Computer […]

Thinking of using spreadsheets for advanced analytics? Think again.

When we’re talking to potential clients about advanced analytics we often ask them what tools they’re currently using. More often than not they say they’re using spreadsheets. Spreadsheets are one of the most widely used tools for statistical analysis and of course, most businesses couldn’t run without them. However, when it comes to advanced analytics spreadsheets have some very significant limitations. Use them beyond their capabilities and the potential cost can be significant. As with anything, it’s important to use the right tool for the job. So, what are the things you need to consider when it comes to using […]

Expert insight – Paul Jackson, Head of Advanced Analytics, Bonamy Finch

Can you tell us a bit about yourself, your background and how you came to be where you are career-wise? I started my career in 2001 at Research International’s Marketing Science Centre, following a degree in Sociology and Social Policy. Back then I was working mainly on analysis of market research surveys. Over time I built up my expertise in branding and segmentation and then joined Bonamy Finch, providers of advanced analytics services to global clients around the world, in 2007. Back then we were mainly serving clients and agencies with statistical analysis and support on surveys, mostly Segmentation, but […]

Expert insight – John Gill, Head of Insight and Analytics, Betfred

Can you start off by telling us a bit about your background, and how you came to be working in the analytics field? I came to analytics via a somewhat circuitous route. My first degree was in psychology, then I did a postgraduate qualification in newspaper journalism. Obviously psychology is all about the science of behaviour which is very relevant to the field I’m now in – analytics and gaming. Newspaper journalism has also been very useful as it’s given me an ability to communicate facts in a way that’s compelling, and that people can understand. Professionally, I started out […]

How to change the appearance of your output in SPSS Statistics

We’re often asked how you can change the appearance of the tables that SPSS generates as output. In this video Jarlath Quinn demonstrates two different ways to do this, either by choosing a different table look in the edit / options function, or by editing the table properties directly yourself.

How to merge files in SPSS Statistics

In this video Jarlath Quinn demonstrates how to merge data files within SPSS Statistics using each of the two main methods, either adding cases (combining files with the same fields but additional rows) or adding variables (combining files by joining variables to a target file using something like an ID field as a ‘keyed variable’).
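The two kinds of merge can be sketched outside SPSS as well. The following is a minimal illustration in plain Python, not the SPSS procedure shown in the video; the sample records, the field names (`id`, `age`, `salary`) and the helper functions are all hypothetical.

```python
# Hypothetical sample data: two files with the same fields, plus a lookup keyed on id.
wave_1 = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
wave_2 = [{"id": 3, "age": 27}]
salaries = {1: 42000, 3: 38500}


def add_cases(target, extra):
    """Stack the rows of two files that share the same fields."""
    return [dict(row) for row in target + extra]


def add_variables(target, lookup, key, new_field):
    """Join a new field onto each target row using a key field.

    Rows with no match in the lookup receive None (an assumption about
    how unmatched cases should be treated).
    """
    return [{**row, new_field: lookup.get(row[key])} for row in target]


combined = add_cases(wave_1, wave_2)
merged = add_variables(combined, salaries, key="id", new_field="salary")

for row in merged:
    print(row)
```

Adding cases makes the file longer, while adding variables makes it wider; the key field only matters for the second kind of merge.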

How to create grouped or banded variables in SPSS Statistics

SPSS users often want to create grouped or banded data from continuous fields, for example age groups or income bands. In this video Jarlath Quinn demonstrates how to use the visual binning procedure within SPSS Statistics to do this, including how to control the proportion of cases that fall into each band and how to automatically create value labels.
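As a rough illustration of the idea behind banding with equal proportions, here is a minimal sketch in plain Python. It is not the visual binning procedure itself: the ages are made-up data, the helper names are hypothetical, and the choice of the "inclusive" quantile method is an assumption.

```python
from bisect import bisect_left
from statistics import quantiles

# Hypothetical continuous field.
ages = [19, 23, 31, 35, 42, 47, 53, 58, 64, 71]


def equal_proportion_cutpoints(values, n_bands):
    """Return n_bands - 1 cut points that split the values into roughly equal groups."""
    return quantiles(values, n=n_bands, method="inclusive")


def band(value, cutpoints):
    """Return a 1-based band number; a value equal to a cut point goes in the lower band."""
    return bisect_left(cutpoints, value) + 1


def band_labels(cutpoints):
    """Build one readable label per band from the cut points."""
    edges = [f"{c:g}" for c in cutpoints]
    labels = [f"<= {edges[0]}"]
    labels += [f"> {low} to {high}" for low, high in zip(edges, edges[1:])]
    labels.append(f"> {edges[-1]}")
    return labels


cutpoints = equal_proportion_cutpoints(ages, n_bands=4)
labels = band_labels(cutpoints)
banded = [band(age, cutpoints) for age in ages]

for age, b in zip(ages, banded):
    print(age, b, labels[b - 1])
```

Changing `n_bands` changes the proportion of cases in each band, which is the same lever the excerpt describes.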

How to recode your data in SPSS Statistics

Recoding your data means changing the values of a variable so that they represent something else. Within SPSS Statistics there is more than one type of recode that can be performed. In this video Jarlath Quinn demonstrates how to: recode into the same variables, overwriting an existing variable; recode into different variables, creating a new variable in addition to your existing variables; automatically recode, a procedure designed to change string codes into numeric codes; and use visual binning, visualising a distribution in the form of a histogram and slicing it into ranged categories.

How to check your data for normality in SPSS Statistics

When you’re deciding which tests to run on your data it’s important to understand whether your data is normally distributed or not, as a lot of standard parametric tests assume a normal distribution whereas non-parametric tests are designed to be run on data which is not normally distributed. A normal distribution has a number of characteristics: it is symmetrical; it is bell-shaped; its mean, median and mode all appear at the same place. Normal distributions can be divided up into the same proportions by the standard deviations, so 95% of the area under the curve lies within roughly plus […]

How to calculate with dates in SPSS Statistics

In this video Jarlath Quinn demonstrates how to work with date and time variables in SPSS using the SPSS date and time wizard. This enables you to: calculate time units between two dates; add or subtract time units to or from dates; extract part of a date or a time, such as days of the week or months of the year; and create date or time variables from variables holding part of dates or times.

How to select cases in SPSS Statistics

In this video Jarlath Quinn demonstrates how to use SPSS Statistics to define data filters in order to select particular cases for analysis. This can be done either to create a temporary selection or to create a permanent new file with only a subsection of cases included within it. The video demonstrates how to do this with string variables too, as well as how to combine conditions from multiple variables in your selection.

How to reverse a scale in SPSS Statistics

In this video Jarlath Quinn demonstrates how to reverse the values of a rating scale (such as an agreement scale or a satisfaction scale) in SPSS Statistics, so that the highest value becomes the lowest value and vice versa. Jarlath shows two methods of doing this – one using the compute procedure and the other using the recode procedure.
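The logic behind the two methods can be sketched in a few lines of plain Python. This is an illustration only, not SPSS syntax: the 1 to 5 scale, the sample responses and the function names are assumptions.

```python
# Hypothetical 5-point agreement scale; None stands in for a missing response.
SCALE_MIN, SCALE_MAX = 1, 5
responses = [1, 2, 5, 4, 3, None]


def reverse_by_arithmetic(value):
    """Compute-style reversal: subtract the value from (minimum + maximum)."""
    if value is None:
        return None
    return SCALE_MIN + SCALE_MAX - value


# Recode-style reversal: spell out each old value and its replacement.
RECODE_MAP = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}


def reverse_by_lookup(value):
    """Look the value up in the recode table; anything not listed becomes None."""
    return RECODE_MAP.get(value)


computed = [reverse_by_arithmetic(v) for v in responses]
recoded = [reverse_by_lookup(v) for v in responses]
print(computed)
print(recoded)
```

The arithmetic version adapts to any scale by changing the two constants, whereas the lookup version makes every mapping explicit and easy to audit.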

How to combine variables in SPSS Statistics

SPSS users often want to know how they can combine variables together. In this video Jarlath Quinn demonstrates how to use the compute procedure to calculate the mean of a number of variables to create one combined variable, and also how to use the count values procedure to count how many times a particular value occurs across a series of variables in order to create an overall count.
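Both calculations are simple to express in code. The sketch below uses plain Python rather than SPSS; the question fields (`q1` to `q3`), the sample rows and the decision to skip missing values when averaging are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical survey rows; None stands in for a missing answer.
rows = [
    {"q1": 4, "q2": 5, "q3": None},
    {"q1": 2, "q2": 2, "q3": 5},
]
FIELDS = ["q1", "q2", "q3"]


def row_mean(row, fields):
    """Average the non-missing values across the given fields (None if all are missing)."""
    valid = [row[f] for f in fields if row[f] is not None]
    return mean(valid) if valid else None


def count_value(row, fields, target):
    """Count how many of the given fields hold the target value."""
    return sum(1 for f in fields if row[f] == target)


for row in rows:
    row["q_mean"] = row_mean(row, FIELDS)
    row["n_fives"] = count_value(row, FIELDS, 5)
    print(row)
```

The mean gives one combined score per respondent, while the count gives a tally of how often a particular answer was chosen across the same set of questions.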

Expert insight – Nick Di Paulo, Lead Customer Researcher, Hyde Housing Group

Can you start off just by talking a little bit about your background and how you came to be working in analytics in the first place?  Yes, I did research methods at university and was experienced at using spreadsheets and things like that, and that’s where I picked up SPSS, and then I got into survey design and have been working with it for seven or eight years now. In terms of job roles, I’ve moved around quite a bit but in similar kinds of areas with a focus on not for profit, and I’m now in a Customer Insight […]

5 key trends affecting technical training provision in predictive analytics

I have been in the business of delivering software applications and solutions that have advanced and predictive analytics at their core, in one form or another, for the last 25 years. In a recent conversation with colleagues we were discussing how the wider market has changed and, more specifically, how this has affected the way that people who analyse data, create predictive models and help their organisations use them go about developing their skills and keeping them up to date. It feels to us that the dramatic developments in the availability and use of technology have had an equivalent impact on […]
