In my last blog post I talked about the confusion that often emerges around how much data is enough to deploy predictive analytics effectively. I argued that sample selection is much more important than sample size when it comes to ensuring accurate results. As an example I discussed two political polls from the 1936 US presidential election. The Literary Digest used a large (2.4 million) but heavily biased sample and got the prediction badly wrong. George Gallup, by comparison, correctly called the result using a much smaller (only 50,000) but far more representative sample. Sometimes, when it comes to predictive power, less is more.
Having considered the value of data samples (or ‘data extracts’, as we often call samples when talking with clients) and their impact on prediction, I wanted to expand on the next and crucial step in the process. What is to be done with the results? How should they be deployed?
There is always a point at which a carefully developed and useful predictive model, and its outputs, run up against the practical realities of how they are to be used in day-to-day business operations. We think of this as deployment, a theme we have visited before in this blog. Planning for deployment is as important as planning your data sample or extract.
Organisations that use predictive analytics need a scalable technology platform to support, manage and automate the predictive models that inform their decision making. Deployment itself can take many forms, and different organisations will choose different options depending on their situation. A few common forms of deployment are as follows:
- A list of individuals ranked by their likelihood of responding to a campaign (a propensity score), which is then given to the marketing team to inform campaign execution.
- A risk score which indicates how likely each customer is to default on a loan or similar financial commitment, derived from data captured during the loan application process.
- A suggested next best action, or a recommendation of what information to present to a customer next, based on their recent online activity and transactions and written to a field in the data warehouse.
- A prompt or recommendation for a customer service agent (CSA), delivered to their screen, based on the specific details of the call that the CSA is currently handling.
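To make the first of these forms concrete, here is a minimal sketch of how a ranked propensity list might be produced. It is an illustration only: the `Customer` fields, the coefficient values and the function names are all hypothetical, and the hand-set weights stand in for whatever model your own project has actually fitted.

```python
import math
from dataclasses import dataclass

# Hypothetical coefficients standing in for a fitted logistic regression.
# In a real project these would come from the model built on your data extract.
INTERCEPT = -2.0
WEIGHTS = {"recent_purchases": 0.6, "emails_opened": 0.3}


@dataclass
class Customer:
    customer_id: str
    recent_purchases: int
    emails_opened: int


def propensity(customer: Customer) -> float:
    """Return an illustrative probability (0 to 1) of responding to a campaign."""
    z = (
        INTERCEPT
        + WEIGHTS["recent_purchases"] * customer.recent_purchases
        + WEIGHTS["emails_opened"] * customer.emails_opened
    )
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


def ranked_list(customers, top_n=None):
    """Score every customer and sort from most to least likely to respond."""
    scored = [(c.customer_id, propensity(c)) for c in customers]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_n is None else scored[:top_n]


# Made-up customers, purely for illustration.
customers = [
    Customer("C001", recent_purchases=0, emails_opened=1),
    Customer("C002", recent_purchases=4, emails_opened=5),
    Customer("C003", recent_purchases=2, emails_opened=0),
]

campaign_list = ranked_list(customers)
for customer_id, score in campaign_list:
    print(f"{customer_id}: {score:.2f}")
```

The same scoring function could in principle sit behind any of the other forms listed above: written to a warehouse field in batch, or called on a single customer while a transaction is in progress. Only the delivery mechanism changes.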
In short, the specific function of an analytical output and the way it will be used can vary significantly depending on the operational context. But whatever the operational backdrop, this derived value, and the insight it embodies, will need to be available at the individual level and across an entire customer base, and will often need to be delivered mid-transaction with a customer or prospect. It is often this step that requires integration with ‘big data’ infrastructure and other complex operational systems.
Consider also that an organisation, certainly a larger organisation, is likely to have a very wide array of models running, supporting many different decision activities. In this context the use and management of the models must be treated with the same respect and governance afforded to any other organisational asset. The nuggets of predictive insight, developed from a carefully considered data extract, need to be effectively deployed and actively monitored and managed in order that the benefits can be fully realised.
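One way to picture what "actively monitored" can mean in practice is a simple drift check on the scores a deployed model produces. The sketch below uses the population stability index (PSI), a widely used measure that compares the distribution of scores at development time with the distribution seen today. The function names, the choice of five bands and the sample scores are assumptions made for illustration, and the code assumes non-empty lists of scores between 0 and 1.

```python
import math


def band_shares(scores, n_bands=5):
    """Share of scores falling in each equal-width band between 0 and 1."""
    counts = [0] * n_bands
    for s in scores:
        index = min(int(s * n_bands), n_bands - 1)  # a score of 1.0 goes in the top band
        counts[index] += 1
    total = len(scores)
    # A small floor keeps empty bands from causing log(0) or division by zero.
    return [max(count / total, 1e-6) for count in counts]


def psi(expected_scores, actual_scores, n_bands=5):
    """Population stability index: sum of (actual - expected) * ln(actual / expected)."""
    expected = band_shares(expected_scores, n_bands)
    actual = band_shares(actual_scores, n_bands)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


# Made-up scores: an even spread at development time, skewed high in production.
development_scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
current_scores = [0.55, 0.65, 0.65, 0.75, 0.75, 0.85, 0.85, 0.95, 0.95, 0.95]

drift = psi(development_scores, current_scores)
print(f"PSI: {drift:.3f}")
```

A PSI of zero means the two distributions match band for band, and larger values indicate a bigger shift. What level should trigger a review or a rebuild is a governance decision for each organisation rather than something the calculation can settle.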
A truly valuable predictive model is almost always based on an intelligently structured sample taken from the data population, suitable for the specific question being addressed. When the output of that model is deployed it needs to be done in a way that enables your organisation to use it efficiently, effectively and potentially intensively whilst keeping a close eye on its ongoing performance.