Data Science, Machine Learning & Churn Rate
Monetising data, within clear privacy and compliance regulations, is essential for every company that wants to become, or remain, relevant in its market. Analysing data generated by everyday company activities might be one of the least expensive ways to probe for inefficiencies and check for performance issues. In fact, leveraging data and acting upon it should be a familiar procedure for companies that want to keep ahead of the competition. All of this is, most likely, a given to anyone working with data. It might not be as clear to someone outside the field, but a quick count of how many articles Forbes or Harvard Business Review have published on the topic of monetising data should be proof that the concept is an established trend within the corporate landscape.
In this blog post, we will take a look at a specific use case of how data, combined with Data Science (DS) techniques and Machine Learning (ML) algorithms, can help a company's management team better understand their business and their customers’ behaviour, by equipping them with more and better information for their decision-making process. As the reader may already have worked out from the title, the scope of this blog post is customer churn and how data produced in-house can be used to evaluate and detect it.
Let’s first define what customer churn is, how it is measured, why it may occur and its impact on a company.
According to Investopedia, customer churn rate is the rate at which customers stop doing business with an entity. It is most commonly expressed as the percentage of service subscribers who discontinue their subscriptions within a given time period. Customer churn is most obviously associated with a subscription-based model (such as SaaS), where the customer stops paying the recurring subscription. However, the concept can also be applied to one-off payment-based models, e.g. when a regular customer stops purchasing from a particular shop.
One can point to a number of obvious reasons why clients churn: a drastic change in a client’s financial situation, better competing products, poor customer experience (CX) or unfulfilled client expectations. Taking the current pandemic into consideration, another easily identifiable cause may be the lack of online services during the lockdown which most countries experienced this year.
That said, a lower customer churn rate will probably benefit every party involved in a transaction: the company will see its profits grow and customer satisfaction will be higher. It is also well known how expensive it can be to acquire new clients, which is another strong reason to keep churn as low as possible.
Let’s consider a simple example. The company Xyz, which has a monthly subscription-based business model, has 5000 recurrent customers. Xyz considers a client to be recurrent if they make at least one transaction in consecutive months. Over the past month, the company registered 125 cancelled subscriptions, i.e. a churn rate of 125 / 5000 = 2.5%.
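The arithmetic behind this example can be sketched in a few lines of Python. The function name `churn_rate` and its parameters are illustrative choices, not part of any particular library; the figures are those of the Xyz example.

```python
def churn_rate(customers_at_start, customers_lost):
    """Return the churn rate for one period as a fraction between 0 and 1."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


# Xyz example: 5000 recurrent customers, 125 cancelled subscriptions.
rate = churn_rate(5000, 125)
print(f"Monthly churn rate: {rate:.1%}")  # Monthly churn rate: 2.5%
```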
Looking into high churn rates, there are a few questions that management teams may want to answer. Some of the most obvious would be: What was the trigger event? and What is the typical churning client profile? The use of advanced analytics in this field has produced interesting insights. As Bain & Company state, churn results from a series of episodes over time, not just one or two specific triggers. This conclusion makes us want to add one more obvious question: What are the specific root causes? or, put differently, What is the archetypal series of events that results in customer churn?
Data collected over time, such as revenue information, transactions, contract state and even client demographics, most likely holds clues to the reasons behind a churn event. By analysing these data in combination, ML algorithms can provide a data-driven interpretation and may be able to distinguish behavioural patterns related to such events that are undetectable to product owners and even to people with strong business acumen.
Machine Learning is a field of computer science that uses advanced mathematical models trained to identify patterns and predict events. These models learn to do such tasks according to the data they have seen.
Specific algorithmic approaches are able to translate these patterns into interpretable scores or insights, in order to answer relevant questions such as those mentioned above. For instance, it would be possible to compute the churn probability for every client based on data about their respective customer journey.
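As a rough illustration of a per-client churn probability, the sketch below fits a small logistic regression with plain gradient descent. Logistic regression is only one possible choice, and everything here is an assumption made for the example: the two features (scaled months since the last purchase and scaled number of support tickets), the toy data and the function names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this client.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def churn_probability(w, b, x):
    """Estimated probability that a client with features x churns."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical, already scaled features per client:
# [months since last purchase, number of support tickets]
X = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.0], [0.2, 0.2],
     [0.8, 0.7], [0.9, 0.6], [0.7, 0.9], [1.0, 0.8]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = churned, 0 = still active

w, b = fit_logistic(X, y)
p_inactive_looking = churn_probability(w, b, [0.95, 0.8])
p_active_looking = churn_probability(w, b, [0.15, 0.05])
print(f"{p_inactive_looking:.2f} vs {p_active_looking:.2f}")
```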
This type of metric ends up being a proxy for the similarity between an active client X and the typical profile of churned/inactive clients. Another approach, more focused on answering the last two questions, would be to build an algorithm able to show the most common series of events in a customer journey that results in churn. It is important to state that, most of the time, methods like these are not developed or used in isolation. It might be useful to have more than one churn-related metric, like the two mentioned above, or others specific to each company's market, available data and organisation type.
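One very simple way to approach the "most common series of events" idea is to count the events that immediately precede churn in each customer journey. The sketch below does only that; the event names, the `"churn"` marker and the function name are hypothetical, and a real solution would likely need a more robust sequence-mining method.

```python
from collections import Counter


def common_pre_churn_sequences(journeys, window=2, top=3):
    """Count the last `window` events before churn across all journeys."""
    counts = Counter()
    for events in journeys:
        # Only journeys that end in churn are of interest here.
        if events and events[-1] == "churn":
            preceding = tuple(events[:-1][-window:])
            if preceding:
                counts[preceding] += 1
    return counts.most_common(top)


# Hypothetical customer journeys, ordered in time.
journeys = [
    ["signup", "price_increase", "support_ticket", "churn"],
    ["signup", "purchase", "price_increase", "support_ticket", "churn"],
    ["signup", "purchase", "purchase"],  # still active
    ["signup", "failed_payment", "churn"],
]

for sequence, count in common_pre_churn_sequences(journeys):
    print(" -> ".join(sequence), count)
```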
The end goal of every ML methodology used within this context is to generate actionable knowledge: knowledge that is specific, focused on well-defined target populations, financially efficient and as effective as possible. How this is put into action is outside the scope of this blog post, but a couple of simple initiatives are worth mentioning because they should not require any major structural changes to the company, such as targeted marketing campaigns or simple website/UI tests and updates. Their effectiveness should be noticeable in the overall customer churn rate in the short to medium term. It is worth mentioning that the success of these kinds of experiments should also be measured through A/B testing.
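As an illustration of how an A/B test on churn could be evaluated, the sketch below applies a standard two-proportion z-test to the churn counts of a control group and a group that received, say, a targeted campaign. The group sizes and churn counts are made up for the example, and the function name is our own.

```python
import math


def two_proportion_z_test(churned_a, n_a, churned_b, n_b):
    """Return (z, two-sided p-value) for the difference in churn rates."""
    p_a = churned_a / n_a
    p_b = churned_b / n_b
    # Pooled churn rate under the hypothesis that both groups behave alike.
    pooled = (churned_a + churned_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: control group A vs. campaign group B.
z, p_value = two_proportion_z_test(churned_a=125, n_a=5000,
                                   churned_b=90, n_b=5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests that the difference in churn between the two groups is unlikely to be due to chance alone; the threshold to use is a business and statistical decision.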
Direct influence on customers through campaigns will be reflected in their behaviour and the overall CX. This, combined with macro- and microeconomic trends, competitor strategies/activities and other extrinsic factors, will make the detected churn patterns non-static over time. In practical terms, this means that a pattern detected at time t may differ from one detected at a later time, t + Δt. Consequently, an ML algorithm will need to be updated and maintained continuously in order to remain relevant business-wise. This phenomenon is called concept drift and it is a well-known nemesis of the predictive analytics field.
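One simple way to watch for concept drift is to compare the model's accuracy on recent, already observed outcomes with the accuracy it had when it was deployed, and to trigger retraining when it drops too far. The sketch below shows this idea only; the tolerance, the data and the function name are assumptions, and more sophisticated drift detectors exist.

```python
def needs_retraining(baseline_accuracy, recent_predictions, recent_outcomes,
                     tolerance=0.05):
    """Flag drift when recent accuracy falls below baseline minus tolerance."""
    if not recent_predictions or len(recent_predictions) != len(recent_outcomes):
        raise ValueError("predictions and outcomes must be non-empty and aligned")
    correct = sum(p == o for p, o in zip(recent_predictions, recent_outcomes))
    recent_accuracy = correct / len(recent_outcomes)
    return recent_accuracy < baseline_accuracy - tolerance


# Hypothetical last period: 1 = churned, 0 = stayed.
predicted = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
observed = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1]

if needs_retraining(0.90, predicted, observed):
    print("Accuracy dropped: schedule model retraining")
```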
Data Science at Xpand IT
The Data Science unit at Xpand IT has developed a process based on industry references and standards. This process helps us to minimise the natural uncertainty of Data Science projects by following a structured approach based on agile methodologies.
A simple yet interesting exercise is to map the DS process onto some of the key points of the use case discussed in this blog post:
- Viability Analysis:
  - Define a business objective and a question to be answered, for instance: What is the archetypal series of events that results in customer churn?
  - Determine which churn-related metrics are appropriate according to the market, available data and company organisation;
  - Study the impact of concept drift on the problem and how to mitigate it in the solution;
  - Where required, design A/B testing to evaluate the performance of the actions deployed.
- Build algorithms capable of detecting churn patterns.
- Automate model updates and retraining.
The use case discussed here is probably familiar to companies whose activities are based on retail consumption or client subscriptions. We hope that we have given the reader a glimpse of how ML and DS can add serious value to the problem at hand.
Our DS unit is ready to help you with this use case, with similar ones such as Lifetime Value Prediction, or with something completely different, such as Predictive Maintenance, Fraud Detection and many other scenarios.
Our goal is to deliver value throughout the project life cycle, while focusing on understanding your business and helping you to create and deploy the required technology.