What’s new in IBM SPSS Statistics v27?

In June of this year, IBM released the latest version of SPSS Statistics. Version 27 introduces several additional analysis procedures as well as new system enhancements. In this report we take a tour of some of the most valuable improvements that have been made. There’s a video of this tour here as well. Bootstrapping and Data Preparation are now standard functionality. One of the biggest changes in this release is that the Bootstrapping and Data Preparation modules are now included with SPSS Statistics base, meaning that they are now part of the standard functionality of the package. Bootstrapping is a […]

Getting started with SPSS Syntax part one

SPSS Syntax has long been exploited by expert analysts due to its flexibility, power and ease of learning. Syntax vastly increases users’ productivity by making it easier to: fix problems without re-doing everything; carry out repetitive tasks more efficiently; allow analysts to share their work with others; provide a recorded audit trail for better governance; and automatically generate results and create scheduled batch jobs. In this series of videos we show you how to get to grips with SPSS syntax, covering: the basics of SPSS Syntax; easy ways to find help about how syntax works; useful tips and little-known procedures; automating […]

Getting started with SPSS syntax part two

SPSS Syntax has long been exploited by expert analysts due to its flexibility, power and ease of learning. Syntax vastly increases users’ productivity by making it easier to: fix problems without re-doing everything; carry out repetitive tasks more efficiently; allow analysts to share their work with others; provide a recorded audit trail for better governance; and automatically generate results and create scheduled batch jobs. In this series of videos we show you how to get to grips with SPSS syntax, covering: the basics of SPSS Syntax; easy ways to find help about how syntax works; useful tips and little-known procedures; automating […]

Getting started with SPSS syntax part three

SPSS Syntax has long been exploited by expert analysts due to its flexibility, power and ease of learning. Syntax vastly increases users’ productivity by making it easier to: fix problems without re-doing everything; carry out repetitive tasks more efficiently; allow analysts to share their work with others; provide a recorded audit trail for better governance; and automatically generate results and create scheduled batch jobs. In this series of videos we show you how to get to grips with SPSS syntax, covering: the basics of SPSS Syntax; easy ways to find help about how syntax works; useful tips and little-known procedures; automating […]

Getting started with SPSS syntax part four

SPSS Syntax has long been exploited by expert analysts due to its flexibility, power and ease of learning. Syntax vastly increases users’ productivity by making it easier to: fix problems without re-doing everything; carry out repetitive tasks more efficiently; allow analysts to share their work with others; provide a recorded audit trail for better governance; and automatically generate results and create scheduled batch jobs. In this series of videos we show you how to get to grips with SPSS syntax, covering: the basics of SPSS Syntax; easy ways to find help about how syntax works; useful tips and little-known procedures; automating […]

Getting started with SPSS syntax part five

SPSS Syntax has long been exploited by expert analysts due to its flexibility, power and ease of learning. Syntax vastly increases users’ productivity by making it easier to: fix problems without re-doing everything; carry out repetitive tasks more efficiently; allow analysts to share their work with others; provide a recorded audit trail for better governance; and automatically generate results and create scheduled batch jobs. In this series of videos we show you how to get to grips with SPSS syntax, covering: the basics of SPSS Syntax; easy ways to find help about how syntax works; useful tips and little-known procedures; automating […]

PS Imago Pro – an overview of additional charting capabilities not available in IBM SPSS Statistics

PS Imago Pro is a statistical analysis and reporting solution based on IBM SPSS Statistics. Indeed, apart from the inclusion of an additional menu, users of SPSS Statistics may find that the data analysis module of PS Imago Pro looks almost identical to its SPSS counterpart. However, as Figure 1 shows, this additional menu contains an extensive array of charting capabilities that are not available in SPSS Statistics. Here we explore examples of these graphing enhancements that aren’t available through the standard SPSS Statistics offering. Combining Tabulation and Charting. Each of the charts within PS Imago Pro can […]

Are Log scales appropriate for COVID-19 Charts?

You may have noticed that many media outlets are illustrating the tragic course of the coronavirus (COVID-19) pandemic by employing charts with log scales. These log values refer of course to the mathematical concept of logarithms. This is something that most of us learn about in school when we are taught that a logarithm is the power to which a number must be raised in order to get some other number. If, for example, we raise the value 10 to the power of 2, we get 100. Therefore the ‘base 10’ log of 100 is 2. If this all seems rather […]
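As a minimal sketch of the definition above, the following Python snippet checks that the base 10 log of 100 is 2, and shows how a log scale spaces tenfold increases evenly. It uses only the standard `math` module; the variable names and the counts are made up for illustration and are not taken from the article.

```python
import math

# The base 10 log of 100 is the power to which 10 must be raised to get 100.
log_of_100 = math.log10(100)

# Illustrative (made-up) counts that grow tenfold at each step.
counts = [10, 100, 1000, 10000]
log_counts = [math.log10(c) for c in counts]

# On a log scale, each tenfold increase covers the same distance.
steps = [later - earlier for earlier, later in zip(log_counts, log_counts[1:])]

print(log_of_100)
print(steps)
```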

An overview of the four main approaches to predictive analytics

This infographic provides an overview of the four main families of approaches to predictive analytics. Prediction encompasses applications that aim to estimate or predict the values of a key target field. Segmentation refers to techniques such as cluster analysis which attempt to find the most ‘naturally occurring’ groups within a dataset. Association modelling discovers groups of categories that are likely to co-occur, such as items in a shopping basket. Forecasting refers to methods that extrapolate time trends, such as sales of products, so businesses can anticipate future demand. Each family is further detailed with colour-coded examples of the popular […]

Choosing a predictive analytics project

At Smart Vision we’re in a pretty strong position to talk authoritatively about the reality of predictive analytics. That’s because we’re a team of veteran practitioners with decades of experience, and we’ve all witnessed plenty of success stories but also one or two ‘data science’ train wrecks. Moreover, like anyone else, we’re exposed to the seemingly constant torrent of stories about the latest developments in machine learning, data science or AI. But we’re often struck by the fact that there seems to be such a focus on emphasising the power of analytics or on explaining how machine learning […]

Statistics in court: the story of a dataset

Like a lot of consultants working in the analytics industry, I’ve built up an extensive portfolio of materials to illustrate different kinds of applications and approaches. Some of these consist of files and slide decks used to explain quite esoteric procedures such as TURF analysis or Partial Least Squares. However, there are certain materials that can be used to demonstrate such a wide range of statistical and predictive analytics techniques that I’ve found myself reaching for them again and again over the years. One of these is the SPSS Statistics sample dataset ‘Employee data.sav’. Most statistical software programs come […]

6 secrets of building better models part one: bootstrap aggregation

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part two: boosting

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part three: feature engineering

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part four: ensemble modelling

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part five: meta models

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

6 secrets of building better models part six: split models

Many analysts who are interested in building predictive models invest a lot of their time and effort in trying to understand how to best tune the parameters of the specific technique that they are using, whether that technique be logistic regression or a neural network, and they are doing this in order to achieve the best accuracy of the resultant model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and which may lead to better predictive accuracy. In all of our examples […]

What’s new in IBM SPSS Statistics v26?

In April of this year, IBM released the latest version of SPSS Statistics. Version 26 introduces a number of additional analysis procedures as well as new command enhancements. If you’re an existing SPSS user and you’d like to upgrade to v26 there’s more information about how to do that here. If you’re interested in trying SPSS Statistics for the first time then do please get in touch – we’ll be happy to help. New analytical procedures: Quantile Regression. In standard ‘least squares’ regression the model predictions are based on a single regression line. This line can be used to estimate the […]

Regular Expressions for IBM SPSS Modeler: performance comparison

The Regular Expressions for IBM SPSS Modeler node pack provides four nodes that integrate the power and flexibility of regular expression pattern matching into SPSS Modeler. However, some of these capabilities can be supported using the extension nodes built into SPSS Modeler, and that raises the question – why buy the Regular Expression nodes? One obvious answer is ease of use. The extension nodes built into SPSS Modeler require expertise in either the R or Python programming language since they are general “code” nodes. Although many data scientists may already have that expertise, most people use SPSS Modeler because of its […]

A first look at SPSS Modeler v18.2

In this video Jarlath Quinn takes a first look at SPSS Modeler v18.2 and demonstrates some of the new functionality that’s included within this release. IBM® SPSS® Modeler adds the following features in this release. New look and feel. A new modern interface theme is available via Tools > User Options > Display. For instructions on switching to the new theme. New data views. You can now right-click a data node and select View Data to examine and refine your data in new ways with advanced data visualizations. IBM Data Warehouse. Database modeling with IBM Netezza Analytics now supports IBM Data Warehouse. Gaussian Mixture node. A new Gaussian Mixture node is available on […]

Three questions to ask when reading articles about artificial intelligence

You may have noticed by now that there seem to be a couple of recurring themes in the plethora of articles and news programmes about artificial intelligence (AI). These themes can be summed up as a) “The dangers of AI” and b) “The limitations of AI”. Articles addressing the dangers of AI tend to focus on issues such as the threat of widespread job losses to AI, the possibility of inherent bias (such as racism and sexism), the lack of transparency in decisions made by AI systems and, as a result, the inability to plead your case with AI (“Computer […]

How to change the appearance of your output in SPSS Statistics

We’re often asked how you can change the appearance of the tables that SPSS generates as output. In this video Jarlath Quinn demonstrates two different ways to do this, either by choosing a different table look in the edit / options function, or by editing the table properties directly yourself.

How to merge files in SPSS Statistics

In this video Jarlath Quinn demonstrates how to merge data files within SPSS Statistics using each of the two main methods, either adding cases (combining files with the same fields but additional rows) or adding variables (combining files by joining variables to a target file using something like an ID field as a ‘keyed variable’).
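The two merge methods can be sketched outside SPSS as follows. This is only an illustration of the logic in plain Python, not the SPSS procedure shown in the video; the datasets, the field names (`id`, `age`, `income`) and the values are all hypothetical.

```python
# Hypothetical example data; every field name and value here is an assumption.
wave_1 = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
wave_2 = [{"id": 3, "age": 27}]
incomes = [{"id": 1, "income": 28000}, {"id": 3, "income": 41000}]

# Adding cases: both files share the same fields, so the rows are stacked.
all_cases = wave_1 + wave_2

# Adding variables: look up each case's ID (the keyed variable) in the second
# file and attach the new field. Cases with no match get None.
income_by_id = {row["id"]: row["income"] for row in incomes}
merged = [{**case, "income": income_by_id.get(case["id"])} for case in all_cases]

for row in merged:
    print(row)
```

Here the case with `id` 2 has no matching income record, so its `income` is left as `None` rather than being dropped.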

How to create grouped or banded variables in SPSS Statistics

SPSS users often want to create grouped or banded data from continuous fields, for example age groups or income bands. In this video Jarlath Quinn demonstrates how to use the visual binning procedure within SPSS Statistics to do this, including how to control the proportion of cases that fall into each band and how to automatically create value labels.
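The idea of banding a continuous field can be sketched in a few lines of Python. This is not the visual binning procedure itself; the cut points, the labels and the choice to treat each cut point as an inclusive upper bound are assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical cut points, each treated as the inclusive upper bound of a band.
cut_points = [25, 45, 65]
# One label per band; the final band is open-ended.
labels = ["25 or under", "26-45", "46-65", "66 or over"]


def age_band(age):
    """Return the label of the band that a given age falls into."""
    # bisect_left finds the first cut point that is >= age.
    return labels[bisect_left(cut_points, age)]


ages = [18, 25, 26, 45, 70]
bands = [age_band(a) for a in ages]
print(bands)
```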

How to recode your data in SPSS Statistics

Recoding your data means changing the values of a variable so that they represent something else. Within SPSS Statistics there is more than one type of recode that can be performed. In this video Jarlath Quinn demonstrates how to: recode into the same variables, overwriting an existing variable; recode into different variables, creating a new variable in addition to your existing variables; automatically recode, a particular procedure designed to change string codes into numeric codes; and use visual binning, visualising a distribution in the form of a histogram and slicing it into ranged categories.
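To make two of these recode types concrete, here is a small Python sketch. It illustrates the logic only and is not SPSS syntax; the variable names and values are hypothetical, and numbering the string codes in alphabetical order is an assumption of this example.

```python
# Hypothetical data; names and values are assumptions for illustration.
satisfaction = [1, 2, 3, 4, 5, 3]
regions = ["North", "South", "East", "North", "West"]

# Recode into a different variable: the original list is kept and a new,
# coarser variable is created alongside it (1-2 -> 1, 3 -> 2, 4-5 -> 3).
recode_map = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}
satisfaction_grouped = [recode_map[value] for value in satisfaction]

# Automatic recode: give each distinct string code a consecutive integer,
# here numbered in alphabetical order of the string values.
numeric_codes = {
    value: code for code, value in enumerate(sorted(set(regions)), start=1)
}
region_codes = [numeric_codes[value] for value in regions]

print(satisfaction_grouped)
print(numeric_codes)
print(region_codes)
```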

How to check your data for normality in SPSS Statistics

When you’re deciding which tests to run on your data it’s important to understand whether your data is normally distributed or not, as a lot of standard parametric tests assume a normal distribution whereas non-parametric tests are designed for data that is not normally distributed. A normal distribution has a number of characteristics: it is symmetrical; it is bell-shaped; its mean, median and mode all appear at the same place. Normal distributions can be divided up into the same proportions by the standard deviations, so 95% of the area under the curve lies within roughly plus […]
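The characteristics listed above can be checked numerically. The sketch below uses Python's standard `statistics.NormalDist` and the textbook figure of 1.96 standard deviations for the central 95%; that figure is supplied here as a well-known result, not quoted from the truncated excerpt.

```python
from statistics import NormalDist

# A standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Mean, median and mode all sit at the same place.
centre = (standard_normal.mean, standard_normal.median, standard_normal.mode)

# Share of the area under the curve within 1.96 standard deviations of the mean.
within_1_96 = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

# Symmetry: the area below -1 equals the area above +1.
lower_tail = standard_normal.cdf(-1)
upper_tail = 1 - standard_normal.cdf(1)

print(centre)
print(round(within_1_96, 3))
```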

How to calculate with dates in SPSS Statistics

In this video Jarlath Quinn demonstrates how to work with date and time variables in SPSS using the SPSS date and time wizard. This enables you to: calculate time units between two dates; add or subtract time units to or from dates; extract part of a date or a time, such as days of the week or months of the year; and create date or time variables from variables holding part of dates or times.
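The four kinds of date calculation listed above look like this in Python's standard `datetime` module. This is a sketch of the operations only, not the SPSS date and time wizard; the dates and variable names are made up.

```python
from datetime import date, timedelta

# Hypothetical dates for illustration.
start = date(2020, 1, 15)
end = date(2020, 3, 1)

# Calculate time units between two dates.
days_between = (end - start).days

# Add time units to a date.
follow_up = start + timedelta(days=30)

# Extract part of a date: the month, and the day of the week (Monday is 0).
start_month = start.month
start_weekday = start.weekday()

# Create a date from variables holding its separate parts.
year, month, day = 2020, 6, 30
built = date(year, month, day)

print(days_between, follow_up, start_month, start_weekday, built)
```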

How to select cases in SPSS Statistics

In this video Jarlath Quinn demonstrates how to use SPSS Statistics to define data filters in order to select particular cases for analysis. This can be done either to create a temporary selection or to create a permanent new file with only a subsection of cases included within it. The video demonstrates how to do this with string variables too, as well as how to combine conditions from multiple variables in your selection.
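The difference between a temporary selection and a permanent subset can be sketched in Python as follows. The cases, field names and conditions are hypothetical, and this mirrors the logic rather than the SPSS dialog shown in the video.

```python
# Hypothetical cases; field names and values are assumptions.
cases = [
    {"id": 1, "region": "North", "age": 34},
    {"id": 2, "region": "South", "age": 51},
    {"id": 3, "region": "North", "age": 67},
]


def is_selected(case):
    """Combine a condition on a string variable with one on a numeric variable."""
    return case["region"] == "North" and case["age"] >= 40


# Temporary selection: flag each case but keep the full dataset intact.
filter_flags = [is_selected(case) for case in cases]

# Permanent selection: build a new dataset holding only the selected cases.
subset = [case for case in cases if is_selected(case)]

print(filter_flags)
print(subset)
```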

How to reverse a scale in SPSS Statistics

In this video Jarlath Quinn demonstrates how to reverse the values of a rating scale (such as an agreement scale or a satisfaction scale) in SPSS Statistics, so that the highest value becomes the lowest value and vice versa. Jarlath shows two methods of doing this – one using the compute procedure and the other using the recode procedure.
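Both approaches reduce to simple arithmetic or a lookup. The Python sketch below assumes a hypothetical 1 to 5 rating scale; it shows the idea behind each method rather than the SPSS procedures themselves.

```python
# Hypothetical 1-5 rating scale and responses.
scale_min, scale_max = 1, 5
ratings = [1, 2, 5, 4, 3]

# Compute-style reversal: subtract each rating from (minimum + maximum).
reversed_by_compute = [scale_min + scale_max - rating for rating in ratings]

# Recode-style reversal: map each old value to its new value explicitly.
recode_map = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}
reversed_by_recode = [recode_map[rating] for rating in ratings]

print(reversed_by_compute)
print(reversed_by_recode)
```

The compute-style version works for any scale once the minimum and maximum are known, while the explicit mapping makes every old-to-new pairing visible.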

How to combine variables in SPSS Statistics

SPSS users often want to know how they can combine variables together. In this video Jarlath Quinn demonstrates how to use the compute procedure to calculate the mean of a number of variables to create one combined variable, and also how to use the count values procedure to count how many times a particular value occurs across a series of variables in order to create an overall count.
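As a rough illustration of the two calculations described, the Python sketch below computes a mean across several variables and counts how often a particular value occurs across them. The respondents, the item names (`q1` to `q3`) and the counted value are hypothetical.

```python
from statistics import mean

# Hypothetical survey responses; item names and values are assumptions.
respondents = [
    {"q1": 4, "q2": 5, "q3": 3},
    {"q1": 2, "q2": 5, "q3": 5},
]
items = ["q1", "q2", "q3"]

for respondent in respondents:
    # Mean of the items, giving one combined variable per respondent.
    respondent["q_mean"] = mean(respondent[item] for item in items)
    # Count how many of the items hold the value 5.
    respondent["n_fives"] = sum(1 for item in items if respondent[item] == 5)

for respondent in respondents:
    print(respondent)
```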