A loyalty program is being restructured: customers need to be assigned to specific tiers, and each tier will receive targeted communication. Clients for the top tier could be selected by looking at the biggest spenders in the store, but some of them may have made a single large purchase and churned long ago. They could also be selected by total number of invoices, but those could be small purchases that do not generate much revenue. How can we weigh all these factors and cluster these clients for an efficient activation campaign?
We can analyze and measure customer behavior using an RFM analysis. RFM stands for Recency, Frequency and Monetary value, three measures of each customer's purchasing that we can use to create clusters of clients based on their spending habits. For an online store, we can define each of them as:
Recency: How many days ago was this client's last purchase?
Frequency: How many times did this client buy from us?
Monetary value: What is this client's lifetime value (the total amount they have spent)?
Usually, each of these three components receives a score from 1 to 4, where higher is better. For example, an F4 client buys very frequently, whereas an F1 client has made only a single purchase. Combining the three scores yields 64 distinct clusters (4 x 4 x 4).
How we define the thresholds for each score varies from project to project. A good approach is to first look at the distribution of Recency, Frequency and Monetary value in our snapshot and then create rules based on the business needs.
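As a rough illustration of rule-based scoring, the sketch below uses `pd.cut` to map each component to a score from 1 to 4. The sample clients, the column names and every threshold are made up for this example; real cut-offs would come from inspecting your own distributions and business rules.

```python
import pandas as pd

# Hypothetical per-client R, F and M values (not real data).
rfm = pd.DataFrame(
    {
        "recency": [6, 11, 45, 200, 503],        # days since last purchase
        "frequency": [2, 3, 12, 7, 1],           # number of transactions
        "monetary": [75.0, 400.0, 2600.0, 1500.0, 950.0],  # total spent
    },
    index=pd.Index(["C", "A", "D", "E", "B"], name="client_id"),
)

inf = float("inf")

# Assumed thresholds, chosen only for illustration.
# Recency is "lower is better", so its labels run from 4 down to 1.
rfm["R"] = pd.cut(rfm["recency"], bins=[-1, 30, 90, 365, inf],
                  labels=[4, 3, 2, 1]).astype(int)
# A single purchase maps to F1, matching the definition above.
rfm["F"] = pd.cut(rfm["frequency"], bins=[0, 1, 4, 9, inf],
                  labels=[1, 2, 3, 4]).astype(int)
rfm["M"] = pd.cut(rfm["monetary"], bins=[0, 100, 500, 2000, inf],
                  labels=[1, 2, 3, 4]).astype(int)

# Combine the three scores into a cluster label such as "3-4-4".
rfm["segment"] = (
    rfm["R"].astype(str) + "-" + rfm["F"].astype(str) + "-" + rfm["M"].astype(str)
)

print(rfm[["R", "F", "M", "segment"]])
```

If no business rules exist yet, quantile-based bins (for example with `pd.qcut`) are a common starting point, though they split clients into equal-sized groups rather than meaningful ones.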
With our RFM scores calculated, we can now analyze our clusters. A 4-4-4 client is a top client for us: they have purchased very recently, have made multiple transactions, and spend the most out of all clusters.
We can also identify clients that could be very valuable but that we have somehow lost. For example, a 1-4-4 client is very similar to a 4-4-4: they spend a lot of money and purchased frequently, but for some reason haven't bought anything in a long time. This is a valuable cluster that the Marketing department can target with specific communication to try to bring these clients back.
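Once every client has R, F and M scores, pulling out a cluster such as 4-4-4 or 1-4-4 is a simple filter. The snippet below is a minimal sketch: the `scores` table and the `select_segment` helper are hypothetical names introduced only for this example.

```python
import pandas as pd

# Hypothetical scored clients (scores from 1 to 4 per component).
scores = pd.DataFrame(
    {
        "client_id": ["A", "B", "C", "D", "E"],
        "R": [4, 1, 4, 2, 1],
        "F": [4, 4, 1, 3, 4],
        "M": [4, 4, 2, 3, 4],
    }
)


def select_segment(df: pd.DataFrame, r: int, f: int, m: int) -> pd.DataFrame:
    """Return the clients whose scores match the given R-F-M combination."""
    mask = (df["R"] == r) & (df["F"] == f) & (df["M"] == m)
    return df[mask]


# Top clients: recent, frequent, high spend.
top_clients = select_segment(scores, 4, 4, 4)

# Valuable clients who have not bought in a long time (win-back candidates).
lost_valuable = select_segment(scores, 1, 4, 4)

print(top_clients["client_id"].tolist())
print(lost_valuable["client_id"].tolist())
```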
We will need the transactional data grouped by client, which we can do either with a SQL query or directly in Python. With this data in a Pandas DataFrame, it's simple to create our R, F and M values: R is the number of days between the current date and the client's last purchase date, F is the total number of transactions, and M is the sum of the values of those transactions.
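The calculation described above could look roughly like this in pandas. The `transactions` table, its column names (`client_id`, `invoice_date`, `amount`) and the fixed `snapshot_date` are assumptions made for the example; in practice the snapshot date would usually be the current date and the data would come from your own store.

```python
import pandas as pd

# Hypothetical transactional data: one row per invoice.
transactions = pd.DataFrame(
    {
        "client_id": ["A", "A", "A", "B", "C", "C"],
        "invoice_date": pd.to_datetime(
            ["2024-01-05", "2024-03-10", "2024-06-20",
             "2023-02-14", "2024-05-01", "2024-06-25"]
        ),
        "amount": [120.0, 80.0, 200.0, 950.0, 30.0, 45.0],
    }
)

# Fixed date so the example is reproducible; normally this would be "today".
snapshot_date = pd.Timestamp("2024-07-01")

# Group by client and aggregate the three building blocks.
rfm = transactions.groupby("client_id").agg(
    last_purchase=("invoice_date", "max"),   # most recent purchase date
    frequency=("invoice_date", "count"),     # F: number of transactions
    monetary=("amount", "sum"),              # M: total amount spent
)

# R: days between the snapshot date and the last purchase.
rfm["recency"] = (snapshot_date - rfm["last_purchase"]).dt.days

rfm = rfm[["recency", "frequency", "monetary"]]
print(rfm)
```

Client B here is the case from the introduction: a single large purchase (high M) made long ago (high recency in days, low F).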
To visualize our newly created clusters, we can use a data visualization library like Plotly to create a bar chart or a treemap of clients per cluster and see their distribution.
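Here is one possible way to do that with Plotly Express. The `scored` table and the `build_figures` helper are hypothetical, and the segment labels are sample values; the Plotly import sits inside the function only so that the counting step can run without Plotly installed.

```python
import pandas as pd

# Hypothetical scored clients, one row per client with its cluster label.
scored = pd.DataFrame(
    {
        "client_id": ["A", "B", "C", "D", "E", "F"],
        "segment": ["4-4-4", "1-4-4", "4-4-4", "2-1-1", "1-4-4", "4-4-4"],
    }
)

# Number of distinct clients per cluster, largest cluster first.
counts = (
    scored.groupby("segment")["client_id"]
    .nunique()
    .rename("clients")
    .reset_index()
    .sort_values("clients", ascending=False)
)
print(counts)


def build_figures(counts: pd.DataFrame):
    """Build a bar chart and a treemap of clients per cluster."""
    import plotly.express as px  # requires the plotly package

    bar = px.bar(counts, x="segment", y="clients",
                 title="Clients per RFM cluster")
    treemap = px.treemap(counts, path=["segment"], values="clients",
                         title="Clients per RFM cluster")
    return bar, treemap
```

Calling `bar, treemap = build_figures(counts)` and then `bar.show()` or `treemap.show()` renders the charts in a notebook or browser.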