During a promotional event, a shopping mall offers customers an opportunity to win various prizes by submitting their purchase invoices. Typically, shopping malls lack access to transactional data since purchases occur directly at individual stores. However, with this promotional event, we have a unique opportunity to tap into this valuable transactional data. The challenge lies in determining how we can utilize this data to gain a comprehensive understanding of our customers' shopping journey throughout the campaign. Can we uncover patterns and correlations that will enable us to tailor future campaigns to customer behavior?
We can take inspiration from a similar problem, Market Basket Analysis. The classic example is the unexpected correlation between sales of diapers and beer in a store, most likely due to young parents doing some late-night shopping. The store used this discovery to place the two items together, and sales skyrocketed. (At least that's how the legend goes.)
Instead of looking at the correlation between items bought together in a store, we can take a step back and look for pairs of stores where the same client made purchases. Our approach was to uncover pairs of stores that customers frequently visit during a single shopping trip. By leveraging the Apriori algorithm and Association Rules, we determined the frequent pairs, adjusting our definition of 'frequent' to extract meaningful correlations.
The algorithm gives us a comprehensive table, providing a detailed account of the store pairs, including their individual and joint frequencies, but this dense data can be difficult to process for most non-tech stakeholders. To further enhance data accessibility, we visualized the results using a Chord diagram. This illustrative technique not only portrayed the frequent store pairs, but also offered additional insights into total store visits and average spend when hovering over each chord.
These data-driven insights enabled us to reassess the store mix within the mall to meet commercial needs more effectively. They also illuminated successful strategies for stimulating customer spending by identifying store pairs that encouraged upselling. The refined understanding of customer behavior equips us with the knowledge to customize future promotions and stimulate business growth.
Our first step was to transform the dataset into a sparse binary matrix, with each row representing a client and each column a store, so that a cell marks whether that client bought at that store. This is the input format the Apriori algorithm expects. We built it by grouping the data by client, identifying every unique store each client visited, and using the pandas 'get_dummies' function to spread the stores into separate columns.
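The transformation described above can be sketched as follows. This is a minimal illustration, not the original pipeline: the 'invoices' table, its column names ('client_id', 'store') and the store names are all hypothetical.

```python
import pandas as pd

# Hypothetical invoice records: one row per submitted invoice.
invoices = pd.DataFrame({
    "client_id": [1, 1, 2, 2, 2, 3, 4, 4],
    "store": ["Books", "Coffee", "Books", "Coffee", "Shoes",
              "Shoes", "Coffee", "Coffee"],
})

# Keep each (client, store) pair once, then one-hot encode the stores.
visits = invoices.drop_duplicates(["client_id", "store"])
dummies = pd.get_dummies(visits["store"])

# Collapse to one row per client: True if the client bought at that store.
basket = dummies.groupby(visits["client_id"]).max().astype(bool)
```

Repeat purchases at the same store are dropped on purpose here, since the analysis only asks whether a client visited a store, not how often.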
We used the mlxtend library to implement the Apriori and Association Rules algorithms. Apriori takes a minimum support, a threshold on the share of clients in which a pattern must appear before it counts as 'frequent', and we tuned this percentage until the resulting pairs were meaningful. The result was a comprehensive table of frequent patterns, including metrics such as 'support' (how often a pattern occurs, as a share of all clients) and 'confidence' (the probability that a customer who shopped at one store also shopped at the other), among others.
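To make these metrics concrete, here is a small sketch that computes support, confidence and lift for store pairs by hand. It is a brute-force enumeration of pairs, not Apriori itself and not the mlxtend calls used in the project; the 'basket' table, the 'pair_rules' helper and the 0.5 threshold are assumptions for illustration only.

```python
from itertools import combinations

import pandas as pd

# Hypothetical one-hot table: rows are clients, columns are stores.
basket = pd.DataFrame(
    {
        "Books": [True, True, False, False],
        "Coffee": [True, True, False, True],
        "Shoes": [False, True, True, False],
    },
    index=pd.Index([1, 2, 3, 4], name="client_id"),
)


def pair_rules(basket, min_support):
    """Return support, confidence and lift for every frequent store pair."""
    support = basket.mean()  # share of clients who visited each store
    rows = []
    for a, b in combinations(basket.columns, 2):
        joint = (basket[a] & basket[b]).mean()  # share who visited both
        if joint < min_support:
            continue  # the pair is not 'frequent' under this threshold
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            rows.append({
                "antecedent": x,
                "consequent": y,
                "support": joint,
                "confidence": joint / support[x],
                "lift": joint / (support[x] * support[y]),
            })
    return pd.DataFrame(rows)


rules = pair_rules(basket, min_support=0.5)
```

Raising or lowering 'min_support' is the knob that changes what counts as 'frequent'. Note that confidence is directional: in this toy data every Books client also visited Coffee, but not every Coffee client visited Books.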
To aid in understanding our results, we developed a chord diagram using the Chord library. This diagram depicted the frequency of paired store visits, with each chord representing a pair of stores. Further detail, such as the average purchase value when visiting both or just one of the stores, could be viewed by hovering over each chord. This approach facilitated the effective communication of our findings, enriching the insights drawn from the data.
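Chord diagrams are typically drawn from a square matrix of pairwise counts. The sketch below shows one way such a matrix could be derived from the one-hot client table. It is an assumption about the input preparation, not the project's actual code, and it stops short of calling any plotting library; the 'basket' data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical one-hot table: rows are clients, columns are stores.
basket = pd.DataFrame(
    {
        "Books": [True, True, False, False],
        "Coffee": [True, True, False, True],
        "Shoes": [False, True, True, False],
    },
    index=pd.Index([1, 2, 3, 4], name="client_id"),
)

counts = basket.to_numpy().astype(int)

# Entry (i, j) counts the clients who visited both store i and store j.
matrix = counts.T @ counts

# The diagonal holds each store's own client count; keep it separately
# and zero it so that chords only connect distinct stores.
store_totals = pd.Series(np.diag(matrix).copy(), index=basket.columns)
np.fill_diagonal(matrix, 0)

co_visits = pd.DataFrame(matrix, index=basket.columns, columns=basket.columns)
```

The resulting matrix is symmetric, which matches the undirected reading of a chord: one chord per store pair, with its width given by the number of shared clients.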
You can also read more about this analysis in Earnings Release 4Q20 (page 14) at https://ri.multiplan.com.br/en/analysis-tools/results-center/
If you'd like to learn more about my projects or work together, feel free to reach out! You can also connect with me on LinkedIn.