Churn Rate - Why you should give Machine Learning a chance

Data Science, Machine Learning & Churn Rate

Monetising data, within clear privacy and compliance regulations, is essential for every company that wants to become, or remain, relevant in its market. Analysing the data generated by everyday company activities may be one of the least expensive ways to probe for inefficiencies and check for performance issues. In fact, leveraging data and acting upon it should be a familiar procedure for companies that want to stay ahead of the competition. All of this is most likely a given for anyone working with data. It might not be as clear to someone outside the field, but a quick count of the articles that Forbes or Harvard Business Review have published on monetising data shows how established the concept is within the corporate landscape.

In this blog post, we will look at a specific use case of how data, combined with Data Science (DS) techniques and Machine Learning (ML) algorithms, can help a company's management team better understand its business and its customers’ behaviour, by equipping the team with more and better information for its decision-making process. As the reader may already have worked out from the title, the scope of this blog post is customer churn and how data produced in-house can be used to evaluate and detect it.

Let’s first define what customer churn is, how it is measured, why it may occur and its impact on a company.

According to Investopedia, customer churn rate is the rate at which customers stop doing business with an entity. It is most commonly expressed as the percentage of service subscribers who discontinue their subscriptions within a given time period. Customer churn is most obviously associated with a subscription-based model (such as SaaS), where the customer stops paying the recurring subscription. However, the concept can also be applied to one-off payment-based models, e.g. when a regular customer stops purchasing from a particular shop.

One can list a number of obvious reasons why clients churn: a drastic change in a client’s financial situation, better competing products, poor customer experience (CX) or unfulfilled client expectations. Taking the current pandemic into consideration, another easily identifiable cause may be the lack of online services during the lockdown which most countries experienced this year.

That said, a lower customer churn rate tends to benefit every party involved in a transaction: the company sees its profits grow, and low churn usually goes hand in hand with higher customer satisfaction. It is also well known how expensive it can be to acquire new clients, which is another strong reason to keep churn as low as possible.

Let’s consider a simple example. Company Xyz, which has a monthly subscription-based business model, has 5000 recurrent customers. Xyz considers a client to be recurrent if they make a transaction in consecutive months. Over the past month, the company registered 125 cancelled subscriptions, i.e. a churn rate of 125/5000 = 2.5%.
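The arithmetic in the Xyz example can be written as a small helper. This is only a minimal sketch: the function name `churn_rate` and its parameters are ours, and it assumes churn is measured against the customer base at the start of the period.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return churn over a period as a fraction of the starting customer base."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


# Figures from the Xyz example: 5000 recurrent customers, 125 cancellations.
rate = churn_rate(5000, 125)
print(f"{rate:.1%}")  # prints 2.5%
```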

When looking into high churn rates, there are a few questions that management teams may want to answer. Some of the most obvious would be: "What was the trigger event?" and "What is the typical churning client profile?" The use of advanced analytics in this field has produced interesting insights. As Bain & Company state, churn results from a series of episodes over time, not just one or two specific triggers. This conclusion prompts one more obvious question: "What are the specific root causes?" or, put differently, "What is the archetypal series of events that results in customer churn?"

Data collected over time, such as revenue information, transactions, contract state and even demographic information about the client, most likely holds clues to the reasons behind a churn event. By analysing these data together, ML algorithms can provide a data-driven interpretation and may be able to distinguish behavioural patterns related to such events that go undetected by product owners and even by people with sharp business acumen.

Machine Learning is a field of computer science that uses advanced mathematical models trained to identify patterns and predict events. These models learn to do such tasks according to the data they have seen.

Specific algorithmic approaches are able to translate these patterns into interpretable scores or insights, in order to answer relevant questions such as those mentioned above. For instance, it would be possible to compute the churn probability for every client based on data about their respective customer journey.
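To make the idea of a per-client churn probability concrete, here is a minimal sketch using logistic regression fitted by plain gradient descent. Everything in it is an assumption made for illustration: the two features (months since last purchase, support tickets opened), the toy data and the function names are hypothetical, and a real project would more likely rely on an established ML library and far richer customer-journey data.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def churn_probability(w, b, x) -> float:
    """Score one client: estimated probability of churn given features x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy data: [months since last purchase, support tickets opened].
X = [[0, 0], [1, 0], [1, 1], [2, 1], [4, 2], [5, 3], [6, 2], [7, 4]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = churned, 0 = still active

w, b = fit_logistic(X, y)
for client in ([1, 0], [6, 2]):
    print(client, round(churn_probability(w, b, client), 2))
```

The recently active client should receive a low score and the long-inactive one a high score; the exact values depend on the toy data and the training settings.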

This type of metric ends up being a proxy for the similarity between an active client X and the typical profile of churned/inactive clients. Another approach, more focused on answering the last two questions, would be to build an algorithm able to show the most common series of events in a customer journey that results in churn. It is important to state that, most of the time, methods like these are not developed or used in isolation. It might be useful to have more than one churn-related metric, like the two mentioned above, or others specific to each company's market, available data and organisation type.
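As a rough illustration of the second approach, the sketch below counts which short sequences of events most often precede a churn event. The event names, the journeys and the function `common_pre_churn_sequences` are all hypothetical, and simple frequency counting is only a stand-in for the more capable sequence-mining methods a real project might use.

```python
from collections import Counter


def common_pre_churn_sequences(journeys, length=3, top=3):
    """Count the last `length` events that precede the first 'churn' event."""
    counts = Counter()
    for events in journeys:
        if "churn" in events:
            end = events.index("churn")
            tail = tuple(events[max(0, end - length):end])
            if tail:
                counts[tail] += 1
    return counts.most_common(top)


# Hypothetical customer journeys, one list of events per client.
journeys = [
    ["signup", "purchase", "price_increase", "support_ticket", "late_payment", "churn"],
    ["signup", "price_increase", "support_ticket", "late_payment", "churn"],
    ["signup", "purchase", "purchase"],  # still active, ignored
    ["signup", "support_ticket", "downgrade", "churn"],
]

for sequence, count in common_pre_churn_sequences(journeys):
    print(count, " -> ".join(sequence))
```

In this toy data the sequence price_increase, support_ticket, late_payment appears before two of the three churn events, which is the kind of pattern a management team could then investigate.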

The end goal of every ML methodology used within this context is to generate actionable knowledge. Ideally, this knowledge is specific, focused on well-defined target populations, financially efficient and as effective as possible. How it is turned into action is outside the scope of this blog post, but there are a couple of simple initiatives worth mentioning that should not require any major structural company changes, such as targeted marketing campaigns or simple website/UI tests and updates. Their effectiveness should be noticeable in the overall customer churn rate in the short to medium term. It is worth mentioning that the success of these kinds of experiments should also be measured through A/B testing.
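One simple way to evaluate such an A/B test is a two-proportion z-test on the churn rates of the two groups. The sketch below is an illustration under our own assumptions: the group sizes and churn counts are invented, the function name is ours, and the test assumes large, independent samples.

```python
import math


def two_proportion_z_test(churned_a, n_a, churned_b, n_b):
    """Two-sided z-test for a difference in churn rate between groups A and B."""
    p_a = churned_a / n_a
    p_b = churned_b / n_b
    pooled = (churned_a + churned_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: group A is the control, group B received a campaign.
z, p_value = two_proportion_z_test(churned_a=75, n_a=2500, churned_b=50, n_b=2500)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value suggests that the difference in churn between the two groups is unlikely to be due to chance alone; the threshold for acting on it is a business decision.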

Direct influence on customers through campaigns will be reflected in their behaviour and in the overall CX. This, combined with macro- and microeconomic trends, competitors' strategies and activities, and other extrinsic factors, will make the detected churn patterns non-static over time. In practical terms, this means that a pattern detected at a time t may differ from one detected at a later time t + Δt. Consequently, an ML algorithm needs to be updated and maintained continuously in order to remain relevant business-wise. This phenomenon is called concept drift, and it is a well-known nemesis of the predictive analytics field.
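A very basic way to watch for concept drift is to compare a model's error rate on recent clients with the error rate it had when it was first validated, and to flag the model for retraining when the gap grows too large. The sketch below assumes binary churn labels; the function names, the sample data and the 0.05 tolerance are hypothetical choices made for illustration, not a recommended standard.

```python
def error_rate(predictions, outcomes) -> float:
    """Share of clients whose predicted churn label (0 or 1) was wrong."""
    if not outcomes or len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must be non-empty and equal in length")
    return sum(p != o for p, o in zip(predictions, outcomes)) / len(outcomes)


def drift_suspected(baseline_error, recent_error, tolerance=0.05) -> bool:
    """Flag drift when the recent error exceeds the baseline by more than `tolerance`."""
    return recent_error - baseline_error > tolerance


# Hypothetical snapshots: predicted vs. observed churn for ten clients each.
baseline = error_rate([0, 0, 1, 1, 0, 1, 0, 0, 1, 0],
                      [0, 0, 1, 1, 0, 1, 0, 0, 1, 1])  # 1 of 10 wrong
recent = error_rate([0, 0, 1, 1, 0, 1, 0, 0, 1, 0],
                    [1, 0, 0, 1, 1, 1, 0, 0, 1, 1])    # 4 of 10 wrong

print(baseline, recent, drift_suspected(baseline, recent))
```

In practice the comparison would run on a schedule over much larger samples, and a flagged model would be retrained on more recent data.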

Data Science at Xpand IT

The Data Science unit at Xpand IT has developed a process based on industry references and standards. This process helps us to minimise the natural uncertainty of Data Science projects by following a structured approach based on agile methodologies.

A simple yet interesting exercise is to map the DS process onto some of the key points of the use case discussed in this blog post:

Viability Analysis:
- Define a business objective and a question to be answered, for instance: What is the archetypal series of events that results in customer churn?
- Determine which churn-related metrics are appropriate according to the market, available data and company organisation;
- Study the impact of concept drift on the problem and how to account for it in the solution;
- Where required, design A/B tests to evaluate the performance of the actions deployed.
Modelling:
- Build algorithms capable of detecting churn patterns.
Deployment:
- Automate model updates and retraining.

Small Conclusion

The use case discussed here is probably familiar to companies whose activities are based on retail consumption or client subscriptions. We hope that we have given the reader a small glimpse of how ML and DS can add serious value to the problem at hand.

Our DS unit is ready to help you with this use case, with similar cases such as Lifetime Value Prediction, or with something completely different, such as Predictive Maintenance, Fraud Detection and many other scenarios.

Our goal is to deliver value throughout the project life cycle, while focusing on understanding your business and helping you to create and deploy the required technology.

Gonçalo Costa
