Expert insight: Clifford Budge, Head of Ecommerce Data Science, major toy manufacturer

Lorna: Please could you start just by talking a little bit about your own background and how you’ve come to be working in data?

Cliff: I went to university to do a mapping science degree, during which time I had a year out in industry. I worked as a surveyor for a company that was laying the cables for the first optical fibre telecommunications network in the UK. Within about six weeks of starting the job, I got access to the marketing data and thought I could use my mapping skills to create a visual representation of the sales and prospect data to measure geographical penetration rates. This had never been seen before in the business. Suddenly, the marketing team realised that this was a lot more valuable to them than having me going out doing surveying so they turned me into their full-time marketing analyst, and that’s really where it all started from.

I’ve been working in this field ever since and thoroughly enjoy working in Data Sciences. I particularly like the challenge of playing with large data sets and, over time, the data sets have got bigger. I remember a long time ago going for an interview in a financial services company and being asked “How do you feel about working with large databases?” “Well, what do you class as a large database?” “Quarter of a million records.” I smiled to myself because at the time I was already working with something in the region of 35 million records.

Lorna: How long have you been in your current role at a major toy manufacturer? Can you tell me a bit about what your role is?

Cliff: Yes, certainly. I’ve been working for the company for seven and a half years and my current title is Head of E-Commerce Data Science. The main area that I’m working on at the moment is everything to do with return on investment – econometric modelling. Where does the department invest its digital marketing budget? How much money do we actually make back? Which channels are working for us, which ones aren’t?

That sounds quite obvious and straightforward but it’s actually quite complex, because you have a lot of different data sets from multiple sources and systems. All of this needs to be collated and turned into meaningful business intelligence that can inform decision-making.
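As a rough illustration of the question "how much money do we actually make back?", the sketch below computes a simple return on investment per marketing channel and ranks the channels. It is only a minimal example under stated assumptions: the `ChannelResult` type, the function names and the ratio `(revenue - spend) / spend` are illustrative choices, not the econometric models described in the interview, which would also have to attribute revenue across channels.

```python
from dataclasses import dataclass


@dataclass
class ChannelResult:
    """Spend and attributed revenue for one marketing channel (hypothetical structure)."""
    channel: str
    spend: float
    revenue: float


def roi(result: ChannelResult) -> float:
    """Simple ROI: net return divided by spend. Assumes revenue is already attributed."""
    if result.spend <= 0:
        raise ValueError("spend must be positive to compute ROI")
    return (result.revenue - result.spend) / result.spend


def rank_channels(results: list[ChannelResult]) -> list[ChannelResult]:
    """Return channels ordered from highest to lowest ROI."""
    return sorted(results, key=roi, reverse=True)
```

In practice the hard part is the step this sketch takes for granted: collating spend and revenue from multiple systems so that each channel's figures are comparable.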

Lorna: So your work is primarily consumer facing?

Cliff: Consumer has a specific meaning within the company: it is the person who uses the product, which could be either a child or an adult. This differs from the shopper, who is the person buying the products, and my role is direct to shopper. This differs again from the customer, who is the wholesale buyer of the toys.

Lorna: I suppose that might throw up a challenge, just as you were saying earlier. With lots of different data sets, making sure that you’ve got a consistent set of terminology so that everybody understands what a customer means and everybody understands what a consumer is and those kinds of things.

Cliff: Yes, it could be a challenge but this understanding has evolved and improved during my time at the company. An example of another term that can be interpreted in many different ways is lifetime value. It is used widely but not always applied correctly.

To calculate it correctly, the business needs reliable data sources and skilled data scientists to audit that data and create logical business measurements. Then, the results need to be presented in the right business context to ensure that the messages are understood by the business units that are then responsible for putting plans into action to support the business strategy.

There’s so much data available in the world today, so one of our biggest challenges is to decide what data we need to capture in order to optimise the quality of the analysis. So it’s really important to make sure that the people who understand both the business and the data are the ones who make the decisions on what needs to be captured.

To illustrate, websites are a rich source of data but the value gained from using analytics tools to summarise the content can be compromised if the initial data is not relevant.

Lorna: Do you see a tension between the IT team, the database people and the end users and the business people in terms of who controls what and who defines how things should be?

Cliff: I think that’s an interesting question because there are many different ways to define the roles and responsibilities around data science. Speaking for myself, I’ve probably had seven or eight different jobs over the years and worked in several different companies. Sometimes I’ve sat in IT, sometimes I’ve sat in marketing. In some companies I’ve actually sat across both areas.

One of the key challenges is setting up an organisational structure in a way that maximises the contribution of everyone involved, according to their different roles within the company. In my experience, this works best when the analytics function has a deep understanding of the business, rather than purely technical skills, and is well-positioned to maintain regular two-way communications about what is required and what is producing good business performance. Furthermore, placing responsibility for data ownership with the business units is best practice with the new customer data platform (CDP) technologies.

Lorna: What changes have you seen in the analytics space over the years?

Cliff: The technology is always evolving. The servers get bigger and move from physical to virtual and the software packages and programming languages are always evolving too. Many universities are also moving away from the traditional analytical programming packages, focusing instead on R. One of the reasons this has happened is that R is a free, open source platform, whereas the traditional packages can be very expensive in comparison.

There has also been a change in data frequency from weekly database refreshes being the norm to real time data. In the past, you would have to go into a database, pull out the data cube required, then use the analytical software to investigate the data, then apply these learnings into your CRM system, which could take time.

Real time data systems become very powerful if your technology platforms can use the data seamlessly and apply learnings back to the shopper quickly and effectively. This will have the highest return on investment when linked to e-commerce, because websites can be automated to make decisions. When you can introduce business rules too, it will provide an even better experience on your website for your shoppers, and that’s really where the technology investment starts to give great returns.

With all real time decision systems, there is an element of system automation, which is fine, but it is still important for companies to have skilled and technically trained experts that thoroughly understand your business data. They will need to track what the automatic systems are doing. Some automatic systems have done all sorts of weird and wonderful things that aren’t really relevant and could damage your business.

Take IBM Modeler, for example. It is a fantastic platform. I’ve used it for many years to design complex models from scratch, including all the data cleaning. Its interface is so easy to use, but easy to use can also be dangerous, because you can do the wrong thing very easily without realising that you’ve made a mistake. This is why training and a certain level of data understanding are paramount.

Lorna: If you set up models and just let them run, you obviously need to check that the data that you used is correct. If you’ve got something like a simple error in a formula at the very start of the process that people don’t notice, that can have a knock-on effect that could be pretty serious.

Cliff: Yes, it goes back to my second job after university. I was working in a dotcom company, one of the biggest at the time. I did some analysis for the figures that were used to help with the share price for the business. I handed over my values and the client came back to tell me that they could not be correct, as they were different from figures supplied previously. As a new analyst, I asked a colleague to take a look. He said, “No, you have done it right.” On further inspection, it turned out that the legacy process had been calculating the figures incorrectly, and the error had never been spotted.

You can get the right tools and the right data but if you get people pressing the wrong buttons, you will get the wrong answer. It doesn’t necessarily get spotted straight away but it can have a big impact on your business because you could invest heavily based on incorrect insights, impacting market share and return on investment.

Lorna: The insight that comes is flawed but people don’t realise until it’s too late.

Cliff: Exactly. It’s also about asking the right questions in the first place. We talked earlier about lifetime value. If a brand new business invested in calculating lifetime value, it would be a waste of resources to a certain extent, because realistically you need a reasonable amount of historical data for the model to work. The standard lifetime value calculation is based on a five-year period. This would mean you need six years’ worth of good quality data capture to really help you understand lifetime value within the target market place. For a new start-up, churn would be a more critical figure to understand in the short term.
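To make the contrast between churn and lifetime value concrete, here is a minimal Python sketch. The formulas are common textbook simplifications chosen for illustration, not the calculation used in the interview: churn is taken as customers lost divided by customers at the start of a period, and lifetime value as the margin per customer summed over a fixed horizon (five years by default), weighted by an assumed constant annual retention rate. All function and parameter names are hypothetical.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Share of customers lost during a period. Needs only one period of data."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def simple_lifetime_value(annual_margin: float,
                          annual_retention: float,
                          years: int = 5) -> float:
    """Expected margin per customer over a fixed horizon.

    Assumes a constant annual margin and retention rate, which is a
    simplification; estimating those inputs reliably is what requires
    several years of good quality historical data.
    """
    if not 0.0 <= annual_retention <= 1.0:
        raise ValueError("annual_retention must be between 0 and 1")
    # Year 0 counts in full; each later year is weighted by the
    # probability that the customer is still active.
    return sum(annual_margin * annual_retention ** year for year in range(years))
```

The churn figure can be computed after a single period, whereas the retention rate fed into the lifetime value function is only trustworthy once enough history has accumulated, which is the point being made about new start-ups.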

Lorna: You have touched on some of the changes in the analytics space, but I’m wondering if there are any other changes that you’ve seen over the course of your career that are relevant in analytics?

Cliff: What I have noticed in my time working in the industry is that when a particularly good new product comes out and starts getting noticed, you tend to find that one of the big players will buy out the technology and either rebrand it or incorporate it into their own platforms. This has happened many times, with Clementine, Business Objects and Intrinsic to name a few.

My personal view is that there are a lot of really awesome lower to middle cost platforms out there that are very cost effective. One of my core areas of expertise is single customer view systems, or customer data platforms (CDPs). That’s where all the customer data is brought together into one location and used to enhance the shopper experience.

There are many companies building databases that are very quick analysis platforms. You’re able to put large amounts of data into them. They allow you to analyse your data and get quick answers, and these CDP technologies are evolving at an amazing rate.

Older systems could take over 24 hours to refresh the customer database. In a modern marketing world, this is too slow: for multi-phase campaigns over a critical period like Black Friday to Cyber Monday, 24 hours is too long to wait. Many of these refresh issues can be solved with intelligent database schema designs that identify what data needs to be real time because it is critically important for the customer journey, and what data does not.

Lorna: I’m interested what advice you might give to someone thinking of working in analytics and what skills you look for when you’re recruiting.

Cliff: I think when people want to start to work in the analytics space it is important to have an idea of what technical area they would like to work in as analytics has many different disciplines. Once you know what really interests you then go in and focus on that.

It’s also important that people have a realistic understanding of what the work involves, in practical terms. Segmentation models or next best action models are cool. The reality is that you’ll generally spend 70% of your time on any project doing data preparation. Only 30% is the fun bit, doing the statistical models.

When recruiting I’m looking for people who know what they want to do and who are prepared to be hands-on as well. I managed a team for 10 years. People need to be team players with logical thinking, be prepared to work hard and have confidence in their skills to identify the insights that drive the business forward.