How will the GDPR affect big data analytics?

With less than a year to go until it comes into effect, organisations are really starting to get to grips with what GDPR will mean in practice. We’ve talked to lots of customers who are concerned about the implications that GDPR might have for the way in which they collect and analyse customer data. Much of what constitutes ‘big data’ is personal data and the use of this kind of data definitely does have implications for data protection, privacy and individuals’ associated rights. And these rights are going to be strengthened by GDPR. So does this spell trouble for big […]

Thinking of hiring a data analyst? What skills should they have?

Many of our clients regularly hire new analysts and we’re often involved in discussions about what the core skills are that they should be looking for. Similarly, I often talk to people looking to build a career in analytics who want to know what skills they need to develop. The most skilled analysts are in high demand because they blend together a range of skills that are rarely found in a single person. Here are the things that I think are really key. Domain knowledge about your industry: It’s not enough just to have the technical skills. As we have […]

Do I need SPSS Statistics or Modeler? How to choose the right product for your needs

We often talk to people who are unsure whether they need SPSS Statistics or whether SPSS Modeler might be more suited to their needs. In fact, it’s not always a clear-cut choice as to which tool is more appropriate, as it depends on the context in which the technology might be used. With that in mind, I thought it might be helpful to develop a little infographic to lay out the sorts of things that you should be thinking about when choosing between SPSS Modeler and SPSS Statistics. We can think of the choice as a sort of continuum, […]

Fear and loathing in machine learning

Over the past two years I’ve noticed a steady stream of articles in the mainstream press and business journals centred on the themes of a) the dangers of machine learning 1 2 or b) the limitations of machine learning 3 4. Many of these articles refer to incidents where machine learning initiatives have echoed and exacerbated our own biases, prejudices and (frankly racist) behaviours 5. Others have focused on their limitations in providing the sorts of ‘informed, idiosyncratic’ recommendations that humans find effortless. However, for those of us who work in the field of predictive analytics where many of the […]

Take your asset management to the next level with predictive asset management

Why is effective asset management so important? The rapidly changing and volatile global economy is putting enormous pressure on organisations to control or preferably reduce their operating costs. This is resulting in a period of uncertainty with a range of new and complex factors having to be considered. For example: ageing infrastructure and assets, more demanding operating conditions and higher throughputs; stricter regulatory requirements and much higher penalties for not meeting them; the public’s increasing awareness of and care for the environment; increasing global demand for water and other natural resources; shifting economic and political power balances; investors’ attitudes to […]

What do we mean when we talk about data modelling? An overview of different types of models

The real world, whether it be the physical world, for example machines, or the natural world, for example human and animal behaviour, is very complex with many factors, some unknown, determining their behaviour and responses to interventions. Even if every contributory factor to a phenomenon is known, it is unrealistic to expect that the unique contribution of each factor to the phenomenon can be isolated and quantified. Thus, mathematical models are simplified representations of reality, but to be useful they must give realistic results and reveal meaningful insights. In his 1976 paper ‘Science and Statistics’ in the Journal of the […]

Which data science tools should you learn?

I’ve blogged several times now about different aspects of data science. A conversation I’ve been having more and more frequently now is about what tools people should learn if they’re hoping to develop a career in data science. Obviously there are many different factors to be taken into account here. You’ll want to think about whether there’s a tool that’s the standard in your particular industry. You’ll also want to consider whether you want to specialize in a particular area of data science and build a reputation as an expert in a range of related tools, or whether you’d prefer […]

How alternative interfaces can help you get more out of R

Contemporary analytical platforms like SPSS and SAS represent some of the earliest and yet longest-lived examples of proprietary software in the industry. When we think of the tectonic shifts the technology landscape has witnessed in the last four decades, through the mainframe era, the rise of the PC, browser wars, the dotcom bubble, the smartphone revolution to the age of the cloud and big data, not to mention the number of once seemingly ubiquitous software tools that no longer dominate the marketplace, it’s incredible to think that the first versions of SPSS and SAS were developed as far back as […]

Why R can be hard to learn

Many of the analysts we speak to are being pushed over to R, primarily because it’s open source and therefore a free alternative to commercial data analytics packages for which the costs can sometimes run into tens of thousands of pounds (or more). However, even experienced analysts often find that getting to grips with R can be a difficult business. Many people view R as being notoriously difficult to learn. There are a number of reasons why this is the case. Lack of consistency: In some ways the open source nature of R is its biggest weakness as well as […]

Six questions to ask before you opt for open source software

It’s not uncommon for people to say to us that they don’t understand why they should pay for industry-standard analytics products like SPSS or SAS when there are strong open source alternatives, such as R, freely available. Indeed, the development of R has really transformed the analytics marketplace in many ways. It’s tempting to make a comparison between R and commercial alternatives such as SPSS and SAS on price grounds alone. When you look at it that way it might seem as though there’s no contest. SPSS and SAS can both involve a significant investment whereas R is free. […]

What’s the difference between business intelligence and predictive analytics?

It’s not uncommon to talk to potential clients who consider themselves to already be very much data-driven in the way that they operate. However, it’s very rare to find a potential client that truly is exploiting the full potential of the data that they hold. That’s because companies often confuse business intelligence with predictive analytics, or think that once they’re using their data for business intelligence they’re doing all they can to get value from it. Neither of these things is true. Predictive analytics is not the same thing as business intelligence, and if you’re just using your data […]

How repeatable application templates will maximise the effectiveness of your first predictive analytics project

As we help our clients get up and running with the predictive analytics tools and skills they need, we see some trends emerging in terms of the kind of applications for which clients tend to use predictive analytics most commonly. These are what we call ‘repeatable application templates’. In my previous post I outlined the four reasons why we believe predictive analytics is a low risk, high return way for many companies to achieve competitive advantage. I have recapped these below for reference: Implementing predictive analytics is less expensive, quicker and lower risk than almost any other kind of technology-enabled project. You already have […]

Data science is everywhere, so why no data scientists to be seen?

Data science is everywhere at the moment. Nearly as everywhere as big data, but not quite. Books out there are making the concepts behind statistics and predictive analytics more and more accessible not only to those in business making decisions every day but also to the average man or woman on the street. Try Super Crunchers by Ian Ayres, Moneyball (the book or the film, which has the advantage of featuring Brad Pitt and therefore making the business of statistics much sexier than it has been), Freakonomics or the newer Superfreakonomics, or pretty much anything by Malcolm Gladwell. All of these books have […]

The A-Z of analytics with IBM SPSS Modeler

A is for Automation: Why bother trying out loads of modelling techniques to see which one works best when Modeler can do that for you? Modeler can test many permutations of the same algorithm and multiple instances of different methods before selecting the best performers according to pre-specified criteria. Oh, and it will also automatically prepare your data so you can get the best results from your analysis. B is for Boosting and Bagging: Boosting is a key technique in Modeler that can generate more accurate models. It works by building the same model multiple times but each time […]