A super app is a mobile application that offers users multiple services in a single place, such as online shopping, social networking, sending invoices to a loyalty program, or even paying a parking fee. However, even though the app offers many functionalities, not all users take advantage of them, and some use only a single one. Can we identify which apps a client most often uses in sequence? Are there underperforming combinations of apps that should have a larger shared user base, where we could improve the user's flow from one app to another?
For this analysis, we can represent all this information as a directed graph, where each app is a node and each edge is weighted by the percentage of users who used app 1 and then app 2 within a specified time window (for example, using the second app up to N minutes, hours, or even days after the first).
The generated graph has two levels of information: node level and edge level.
The node size corresponds to the total number of unique users of that app in the observed period, and hovering over a node shows the exact user count. The edge thickness corresponds to the percentage of users who used the first app and then the second.
This allows us to quickly see which apps are the biggest drivers of traffic to other apps, and where we can adjust our design and communication to strengthen underperforming connections.
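To make the node and edge encoding concrete, here is a minimal sketch of how unique-user counts and edge percentages could be turned into display attributes and handed to Pyvis. The function names, the file name, and the sample numbers are hypothetical and purely illustrative. The sketch also assumes each percentage is expressed as a share of the first app's users, and it adds edges to Pyvis directly rather than going through NetworkX, so it may differ from the actual project code.

```python
def build_graph_attributes(user_counts, shares, min_share=0.0):
    """Map counts and shares to node and edge display attributes.

    user_counts: {app: number of unique users in the period}
    shares: {(first_app, second_app): fraction between 0 and 1}
    """
    nodes = {
        app: {
            "label": app,
            "value": count,  # Pyvis/vis.js scales node size by `value`
            "title": f"{app}: {count} unique users",  # hover tooltip
        }
        for app, count in user_counts.items()
    }
    edges = [
        {
            "source": first,
            "to": second,
            "value": share,  # edge thickness is scaled by `value`
            "title": f"{first} -> {second}: {share:.0%} of {first} users",
        }
        for (first, second), share in shares.items()
        if share > min_share  # skip pairs with no sequential usage
    ]
    return nodes, edges


def render(nodes, edges, path="super_app_graph.html"):
    """Write an interactive HTML graph (requires the third-party pyvis package)."""
    from pyvis.network import Network  # imported lazily: optional dependency

    net = Network(directed=True)
    for app, attrs in nodes.items():
        net.add_node(app, **attrs)
    for edge in edges:
        net.add_edge(edge["source"], edge["to"],
                     value=edge["value"], title=edge["title"])
    net.save_graph(path)


# Made-up numbers, for illustration only
user_counts = {"shopping": 1200, "parking": 300, "social": 800}
shares = {
    ("shopping", "parking"): 0.25,
    ("parking", "shopping"): 0.0,
    ("social", "shopping"): 0.4,
}
nodes, edges = build_graph_attributes(user_counts, shares)
# render(nodes, edges)  # uncomment once pyvis is installed
```

Dropping zero-share pairs keeps the picture readable, and raising `min_share` is a simple way to hide weak connections when there are many apps.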
Data containing an anonymized user ID, app name, and timestamp was extracted from a data lake using SQL. In Python, we generate all ordered pairs (permutations) of apps and count how many users used app 1 before app 2 within the specified time window. The result is then represented as a graph and rendered as an interactive visualization using NetworkX and Pyvis.
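The pair-counting step could look roughly like the sketch below. It uses only the standard library and a hypothetical `sequential_usage` function over `(user_id, app, timestamp)` tuples; the real project may well use pandas or a different data layout. It also assumes that the percentage is taken over the users of the first app, which is one possible reading of the metric.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import permutations


def sequential_usage(events, window):
    """Return {(first_app, second_app): share of first_app users who used
    second_app within `window` after using first_app}.

    events: iterable of (user_id, app, timestamp) tuples.
    Assumption: the denominator is the number of unique users of first_app.
    """
    # times[app][user_id] -> list of timestamps for that user in that app
    times = defaultdict(lambda: defaultdict(list))
    for user_id, app, ts in events:
        times[app][user_id].append(ts)

    shares = {}
    # permutations(..., 2) yields every ordered pair, so A->B and B->A both appear
    for first, second in permutations(sorted(times), 2):
        first_users = times[first]
        converted = 0
        for user_id, first_times in first_users.items():
            second_times = times[second].get(user_id, [])
            # The user counts once if any use of `second` falls inside the window
            if any(
                timedelta(0) < t2 - t1 <= window
                for t1 in first_times
                for t2 in second_times
            ):
                converted += 1
        shares[(first, second)] = converted / len(first_users)
    return shares


# Tiny made-up sample, for illustration only
t0 = datetime(2024, 1, 1, 12, 0)
events = [
    ("u1", "shopping", t0),
    ("u1", "parking", t0 + timedelta(minutes=10)),
    ("u2", "shopping", t0),
    ("u2", "parking", t0 + timedelta(hours=3)),
    ("u3", "parking", t0),
]
shares = sequential_usage(events, window=timedelta(minutes=30))
print(shares[("shopping", "parking")])  # 0.5: only u1 opened parking within 30 minutes
```

The nested comparison is quadratic in the number of events per user, which is fine for a sketch; on data-lake volumes, sorting timestamps and using a merge-style join would scale better.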
If you'd like to learn more about my projects or work together, feel free to reach out! You can also connect with me on LinkedIn.