Finding normality – why is the normal distribution so important when we so rarely encounter it in real life?

This is the fourth post in our ‘eat your greens’ series – a back to basics look at some of the core concepts of statistics and analytics that, in our experience, are frequently misunderstood or misapplied. In this post we’ll look in more depth at the concept of the normal distribution. One of the first …

Are algorithms evil?

None of us can have failed to notice the recent debacle over the use of an algorithm by Ofqual (the Office of Qualifications and Examinations Regulation) to predict pupil grades. Once again, ‘algorithm fever’ has generated a flurry of news articles questioning whether we are sleepwalking into a dystopian future where human expert decision-making is replaced with …

Testing versus inferring

This is the second post in our ‘eat your greens’ series – a back to basics look at some of the core concepts of statistics and analytics that, in our experience, are frequently misunderstood or misapplied. In this post we’ll look in more depth at the concept of testing versus inferring. One of the most daunting …

PS Imago Pro – an overview of additional charting capabilities not available in IBM SPSS Statistics

PS Imago Pro is a statistical analysis and reporting solution based on IBM SPSS Statistics. Indeed, apart from the inclusion of an additional menu, users of SPSS Statistics may find that the data analysis module of PS Imago Pro looks almost identical to its SPSS counterpart. However, as Figure 1 shows, this additional menu contains …

Using the findAll() function to search for nodes

Most SPSS Modeler scripts include code that locates an existing node, e.g.:

    stream = modeler.script.stream()
    typenode = stream.findByType("type", None)

However, some scripts need to search for all nodes – maybe by node type but also matching some other criteria. The Modeler scripting API documentation (PDF) mentions a findAll() function:

    d.findAll(filter, recursive): Collection
    filter (NodeFilter) : the node filter
    recursive …

An overview of the four main approaches to predictive analytics

This infographic provides an overview of the four main families of approaches to predictive analytics. Prediction encompasses applications that aim to estimate or predict the values of a key target field. Segmentation refers to techniques such as cluster analysis which attempt to find the most ‘naturally occurring’ groups within a dataset. Association modelling discovers groups …

Using SPSS Modeler’s cache_compression setting to speed up your modelling

There are a number of configuration settings associated with IBM SPSS Modeler Server that control its behaviour. The default settings aim to ensure that stream execution will complete successfully even if the host machine is being used by a number of other applications, i.e. Modeler Server is trying to be a “good citizen”. However, if …

Expert insight: Inna Yordanova, Senior Researcher at IPSE

Inna Yordanova is Senior Researcher at IPSE – the Association of Independent Professionals and the Self-Employed. IPSE is the largest organisation representing freelancers and self-employed people. Its 78,000 members are freelancers, contractors, consultants and other self-employed people from all sectors of the economy. Lorna: Can you start off then just by telling me a little …

Expert insight – Adam O’Shaughnessy, SBD Automotive

Through independent research, evaluation and strategic consulting support, SBD Automotive helps vehicle manufacturers and their partners create autonomous, more secure and better connected cars. Adam O’Shaughnessy is a Senior Specialist – Connected Car. Lorna: Can you tell me a little bit about your background and how you’ve come to be working with data and statistics …

Expert insight: Rob Woods, Analytics Solution Architect/Data Scientist, Watson FSS

Rob Woods has worked in the analytics industry for over 20 years. He’s currently an Analytics Solutions Architect working on IBM’s Watson FSS suite of products. All views here are his own and not IBM’s. Describe your own background and how you came to be working with statistics / data / analytics I started as …

6 secrets of building better models part three: feature engineering

Feature Engineering is really just a fancy term for creating new data. Very often we can help an algorithm build better models by preparing the input data in a way that allows it to detect a clearer signal in the often noisy data. In machine learning variables are often referred to as ‘features’, so feature engineering refers to the transformation of variables or the creation of new ones.
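
As a rough illustration of the idea, the sketch below derives two new fields from a raw customer record. The field names (total_spend, orders, last_purchase, as_of) and the values are hypothetical and are not taken from the original post.

```python
from datetime import date


def engineer_features(record):
    """Return a copy of the record with two derived fields added."""
    enriched = dict(record)
    # Ratio feature: average spend per order (guarding against zero orders).
    orders = record["orders"]
    enriched["avg_order_value"] = record["total_spend"] / orders if orders else 0.0
    # Recency feature: days between the last purchase and a reference date.
    enriched["days_since_last_purchase"] = (
        record["as_of"] - record["last_purchase"]
    ).days
    return enriched


# Hypothetical raw record, for illustration only.
customer = {
    "total_spend": 240.0,
    "orders": 4,
    "last_purchase": date(2020, 1, 1),
    "as_of": date(2020, 1, 31),
}
features = engineer_features(customer)
```

Neither derived field adds information that was not already in the data, but each presents it in a form that a modelling algorithm may find easier to use.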

6 secrets of building better models part five: meta models

The idea of meta modelling is to build a predictive model using the predictions or scores generated by another model. By adding the predictive scores generated by an initial modelling algorithm to an existing pool of predictor fields, a second algorithm can then exploit these scores to build a final, more accurate model.
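
The data flow can be sketched in a few lines. This is a toy illustration under stated assumptions, not the method from the original post: the first-stage "model" is simply the mean target value per group, the second stage is a single-split rule, and the field names (region, tenure, churned, stage1_score) and data are hypothetical. In practice the first-stage scores would normally be generated on data the first model was not trained on.

```python
def fit_group_means(rows, group_field, target_field):
    """First-stage 'model': mean target value observed for each group."""
    totals = {}
    for row in rows:
        running_sum, count = totals.get(row[group_field], (0.0, 0))
        totals[row[group_field]] = (running_sum + row[target_field], count + 1)
    return {group: s / n for group, (s, n) in totals.items()}


def add_scores(rows, scores, group_field, default=0.0):
    """Add the first-stage score to each row as an extra predictor field."""
    return [
        dict(row, stage1_score=scores.get(row[group_field], default))
        for row in rows
    ]


def fit_stump(rows, fields, target_field):
    """Second-stage model: the single 'field >= threshold' rule with the
    fewest training errors across all candidate fields."""
    best = None
    for field in fields:
        for threshold in sorted({row[field] for row in rows}):
            errors = sum(
                (row[field] >= threshold) != bool(row[target_field])
                for row in rows
            )
            if best is None or errors < best[0]:
                best = (errors, field, threshold)
    return best[1], best[2]


# Hypothetical training data, for illustration only.
training = [
    {"region": "north", "tenure": 5, "churned": 1},
    {"region": "north", "tenure": 20, "churned": 1},
    {"region": "south", "tenure": 7, "churned": 0},
    {"region": "south", "tenure": 25, "churned": 0},
]
stage1 = fit_group_means(training, "region", "churned")
scored = add_scores(training, stage1, "region")
final_rule = fit_stump(scored, ["tenure", "stage1_score"], "churned")
```

Here the second-stage model chooses to split on the first-stage score rather than on tenure, which is the sense in which it exploits the scores from the initial model.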

6 secrets of building better models part six: split models

Split models or split population modelling is another technique that allows the user to build multiple models which can then be combined to create a single prediction. The idea with split modelling is that if the data represent different populations or contain separate groups that behave in very different ways, assuming that a single model can explain all the inherent variability across these distinct populations might be unreasonable.
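
A minimal sketch of the idea, assuming a hypothetical dataset with a segment field and a spend target: one very simple model (here just the segment mean) is fitted per group, and each new case is scored by the model for its own group.

```python
from statistics import mean


def fit_split_models(rows, split_field, target_field):
    """Fit one simple model (the mean of the target) per segment."""
    groups = {}
    for row in rows:
        groups.setdefault(row[split_field], []).append(row[target_field])
    return {segment: mean(values) for segment, values in groups.items()}


def predict(models, row, split_field, fallback):
    """Score a row with its segment's model; use the fallback for unseen segments."""
    return models.get(row[split_field], fallback)


# Hypothetical data: two populations that behave very differently.
rows = [
    {"segment": "retail", "spend": 100},
    {"segment": "retail", "spend": 120},
    {"segment": "business", "spend": 900},
    {"segment": "business", "spend": 1100},
]
pooled = mean(row["spend"] for row in rows)  # a single model for everyone
models = fit_split_models(rows, "segment", "spend")
retail_prediction = predict(models, {"segment": "retail"}, "segment", pooled)
```

The pooled model predicts 555 for every case, a value that describes neither group well, whereas the split models predict 110 for retail and 1000 for business customers.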

Regular Expressions for IBM SPSS Modeler: performance comparison

The Regular Expressions for IBM SPSS Modeler node pack provides four nodes that integrate the power and flexibility of regular expression pattern matching into SPSS Modeler. However, some of these capabilities can be supported using the extension nodes built into SPSS Modeler, and that raises the question – why buy the Regular Expression nodes? One …

Thinking of using spreadsheets for advanced analytics? Think again.

When we’re talking to potential clients about advanced analytics we often ask them what tools they’re currently using. More often than not they say they’re using spreadsheets. Spreadsheets are one of the most widely used tools for statistical analysis and of course, most businesses couldn’t run without them. However, when it comes to advanced analytics …

Expert insight – Paul Jackson, Head of Advanced Analytics, Bonamy Finch

Can you tell us a bit about yourself, your background and how you came to be where you are career-wise? I started my career in 2001 at Research International’s Marketing Science Centre, following a degree in Sociology and Social Policy. Back then I was working mainly on analysis of market research surveys. Over time I …

How to merge files in SPSS Statistics

In this video Jarlath Quinn demonstrates how to merge data files within SPSS Statistics using each of the two main methods, either adding cases (combining files with the same fields but additional rows) or adding variables (combining files by joining variables to a target file using something like an ID field as a ‘keyed variable’).
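
The difference between the two kinds of merge can be illustrated outside SPSS Statistics with a short Python sketch. This is not SPSS syntax; the function names and the data are hypothetical and only mirror the logic described above.

```python
def add_cases(file_a, file_b):
    """'Add cases': same fields, extra rows, so the files are stacked."""
    return file_a + file_b


def add_variables(target, lookup, key):
    """'Add variables': join extra fields onto the target file by a keyed variable.
    Target rows with no match in the lookup file are kept unchanged."""
    lookup_by_key = {row[key]: row for row in lookup}
    return [{**row, **lookup_by_key.get(row[key], {})} for row in target]


# Hypothetical files, for illustration only.
wave1 = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
wave2 = [{"id": 3, "age": 28}]
regions = [{"id": 1, "region": "north"}, {"id": 3, "region": "south"}]

combined = add_cases(wave1, wave2)
merged = add_variables(combined, regions, "id")
```

After both steps the merged file has three rows; the rows for id 1 and id 3 gain a region field, while id 2 has no match in the lookup file and is left as it was.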

How to select cases in SPSS Statistics

In this video Jarlath Quinn demonstrates how to use SPSS Statistics to define data filters in order to select particular cases for analysis. This can be done either to create a temporary selection or to create a permanent new file with only a subsection of cases included within it.

How to reverse a scale in SPSS Statistics

In this video Jarlath Quinn demonstrates how to reverse the values of a rating scale (such as an agreement scale or a satisfaction scale) in SPSS Statistics, so that the highest value becomes the lowest value and vice versa.
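
The arithmetic behind reversing a scale is simple: each new value is the scale minimum plus the scale maximum, minus the old value. The sketch below shows this in Python rather than SPSS syntax, and assumes a 1 to 5 scale by default; the function name is hypothetical.

```python
def reverse_scale(value, low=1, high=5):
    """Reverse a rating so that the highest value becomes the lowest and vice versa."""
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the scale {low}-{high}")
    return low + high - value


ratings = [1, 2, 3, 4, 5]
reversed_ratings = [reverse_scale(r) for r in ratings]
```

On a five-point scale this maps 1 to 5, 2 to 4, and leaves the midpoint 3 unchanged.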
