Personalised product recommendations based on customer lifetime value and e-commerce product ratings

Valerie Lim
Published in Analytics Vidhya
7 min read · Mar 8, 2020


Background

The explosive growth of online content and services has given users a myriad of choices. On e-commerce websites and apps, for instance, users are presented with far more products than they can digest. Imagine searching multiple pages on website A for an item of interest, to no avail. You go to website B, and the same item is shown to you as soon as you arrive. Did platform A lose just your order? No ─ it likely lost a dozen times more: possibly your entire customer lifetime value, plus referrals. Platform B succeeded at acquiring it, possibly because it has a recommendation engine in place. Personalized recommendations are therefore necessary to improve user experience and ensure retention. The corollary of a pleasant and convenient user experience is augmented profits for the business.

Before curating a set of products for different users, it is imperative to understand them better. Every customer is unique: if you treat them all the same way, with the same content, the same channel and the same importance, they will find another option that understands them better. Some of your users might make small purchases every week; others might make big purchases once a year, with all sorts of combinations in between. How can you anticipate what they might purchase and recommend products accordingly?

Customer Segmentation

Customer Lifetime Value (CLV) takes some of the mystery out of knowing how your current and future customers will behave. CLV tells you how often certain types of customers will make purchases and when those same customers will stop purchasing for good. One way to quantify CLV is the Recency-Frequency-Monetary Value (RFM) model.

  • Recency: A customer who has made a purchase recently is more likely to make a repeat purchase than a customer who hasn’t made a purchase in a long time.
  • Frequency: A customer who makes purchases often is more likely to continue to come back than a customer who rarely makes purchases.
  • Monetary Value: A customer who makes larger purchases is more likely to return than a customer who spends less.

I applied k-means clustering to assign a recency score, using the Elbow Method to determine the optimal number of clusters, and repeated the same procedure to derive a frequency score and a monetary value score. An overall score is then computed as the sum of those three scores.
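The post doesn't include the clustering code, but here is a minimal sketch of the recency step, assuming a pandas DataFrame `df` with a `Recency` column (days since last purchase); the DataFrame, column names and k=4 are assumptions, not the original code:

```python
import pandas as pd
from sklearn.cluster import KMeans

# Elbow Method: fit k-means for k = 1..9 and inspect the inertia curve.
inertias = {}
for k in range(1, 10):
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    km.fit(df[["Recency"]])
    inertias[k] = km.inertia_  # within-cluster sum of squares; look for the "elbow"

# Suppose the elbow suggests 4 clusters: assign each customer a cluster label.
km = KMeans(n_clusters=4, random_state=42, n_init=10)
df["RecencyCluster"] = km.fit_predict(df[["Recency"]])

# Re-rank cluster labels so a higher score means a more recent (better) customer.
order = df.groupby("RecencyCluster")["Recency"].mean().sort_values(ascending=False)
df["RecencyScore"] = df["RecencyCluster"].map(
    {cluster: score for score, cluster in enumerate(order.index)}
)
```

The same pattern, applied to frequency and total spend, yields the frequency and monetary value scores.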

The mean of Recency, Frequency and Monetary Value score for various OverallScore

The scoring above shows that customers with a score of 9 are our best customers, whereas those with a score of 0 are the worst. To keep things simple, three buckets of customers were derived (see the snippet after this list): low, mid and high-value customers.

  • Low Value: Customers who are less active than others, buy infrequently and generate very low revenue (scores 0 to 3).
  • Mid Value: In the middle of everything; they use our platform fairly often (though not as much as our high-value customers), buy fairly frequently and generate moderate revenue (scores 4 and 5).
  • High Value: High revenue, high frequency and low inactivity (scores 6 and above).
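Continuing the sketch above, and assuming `FrequencyScore` and `MonetaryScore` were derived the same way as `RecencyScore`, the bucketing might look like this:

```python
# Overall score = sum of the three k-means-derived scores (0 to 9 in total).
df["OverallScore"] = df["RecencyScore"] + df["FrequencyScore"] + df["MonetaryScore"]

# Bucket customers into the three segments described above.
df["Segment"] = pd.cut(
    df["OverallScore"],
    bins=[-1, 3, 5, 9],  # scores 0-3, 4-5, 6-9
    labels=["Low Value", "Mid Value", "High Value"],
)
```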

Recommender Systems

Collaborative filtering models were then developed for each customer segment. Collaborative Filtering (CF) is a technique that predicts a user's interests by collecting taste or preference information from many other users. User-based CF assumes that if users A and B both like Product 1, A is likely to share B's preference on another product. Item-based CF and Matrix Factorization are other CF methods. I used Matrix Factorization because

  1. it is the state-of-the-art solution for sparse data problem which is applicable in this dataset where most users rated a product category once only.
  2. it works by decomposing the user-item matrix into a product of user and item representations (see the formula after this list).
  3. it allows us to discover the latent (hidden) features underlying the interactions between users and items (in this case, I used product categories as there were no corresponding labels at product ID level). As such, less-known categories can have rich latent representations as much as popular categories have, which improves the recommender’s ability to recommend less-known categories.
  4. it circumvents the issue of having to know about relevant information pertaining to item content (e.g. item name), which is unavailable in this dataset.
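Concretely, matrix factorization approximates the sparse user-item rating matrix by a product of two low-rank matrices. A standard formulation (not quoted from the post) is:

```latex
R \approx U V^{\top}, \qquad
U \in \mathbb{R}^{m \times k},\;
V \in \mathbb{R}^{n \times k}, \qquad
\hat{r}_{ui} = \mathbf{u}_u^{\top} \mathbf{v}_i
```

Here each user and each item gets a k-dimensional latent vector, and a predicted rating is their dot product. ALS (next paragraph) fits U and V by minimizing the regularized squared error over the observed ratings, alternately fixing one matrix and solving for the other; each alternation is an ordinary least squares problem, which is what makes the algorithm easy to parallelize.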

I chose Alternating Least Squares (ALS) as implemented in Spark because it is a parallel algorithm designed for large-scale collaborative filtering problems: it handles the sparseness of user profiles well, it is simple, and it scales to very large datasets. Furthermore, it outperformed alternatives such as SlopeOne, CoClustering, Non-Negative Matrix Factorization (NMF) and Singular Value Decomposition (SVD), achieving a lower Root Mean Squared Error (RMSE). An RMSE of less than one means that the average error between the predicted and actual rating is less than one rating point.
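A minimal sketch of ALS in PySpark; the input file, column names and hyperparameter values are assumptions, and the original code may differ:

```python
from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator

spark = SparkSession.builder.appName("clv-recommender").getOrCreate()

# Hypothetical input: one row per (user, category) with a rating.
ratings = spark.read.csv("ratings.csv", header=True, inferSchema=True)

train, test = ratings.randomSplit([0.8, 0.2], seed=42)

als = ALS(
    userCol="user_id", itemCol="category_id", ratingCol="rating",
    rank=10, maxIter=10, regParam=0.1,
    coldStartStrategy="drop",  # drop NaN predictions for unseen users/items
)
model = als.fit(train)

rmse = RegressionEvaluator(
    metricName="rmse", labelCol="rating", predictionCol="prediction"
).evaluate(model.transform(test))

top10 = model.recommendForAllUsers(10)  # top-10 categories per user
```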

Test RMSE for various Matrix Factorization methods

I also used precision and recall at k to evaluate the ALS models. Precision at k is the proportion of recommended items in the top-k set that are relevant; recall at k is the proportion of relevant items that appear in the top-k set. Precision at 10 was consistently close to 100% for all three customer segments: if 10 products were recommended to a customer, almost all of them were bought. Recall at 10 was consistently close to 88% across the three segments: if a customer bought 10 products, our recommendations included close to 9 of them.
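For reference, a minimal sketch of these two metrics (the function and argument names are illustrative, not from the original code):

```python
def precision_recall_at_k(recommended, relevant, k=10):
    """recommended: item IDs ranked by predicted rating, best first.
    relevant: set of item IDs the user actually bought / rated highly."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```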

Collaborative filtering is appropriate for existing users, because we have prior knowledge of their CLV. For new users, however, we cannot yet quantify CLV, so a popularity-based model that recommends popular products was created instead. The popularity rating for a category is a function of its global average rating and the number of ratings received. The top five categories were furniture décor, telephone, health beauty, baby products and watches. As this popularity-based model is built from purchases made by existing customers, I can evaluate its performance by checking whether new users indeed bought the popular item categories.
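The post doesn't give the exact formula, but a common choice that combines the global average rating with the rating count is the Bayesian (IMDB-style) weighted rating; a sketch under that assumption, with a pandas DataFrame `ratings` of (category, rating) rows:

```python
# Bayesian weighted rating per category (an assumption; the post only says the
# score is a function of average rating and rating count).
C = ratings["rating"].mean()                        # global mean rating
stats = ratings.groupby("category").agg(
    v=("rating", "count"), R=("rating", "mean")
)
m = stats["v"].quantile(0.80)                       # minimum-votes threshold
stats["popularity"] = (stats["v"] / (stats["v"] + m)) * stats["R"] \
                      + (m / (stats["v"] + m)) * C
top5 = stats.sort_values("popularity", ascending=False).head(5)
```

The shrinkage toward the global mean C prevents a category with one 5-star rating from outranking a category with thousands of 4.5-star ratings.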

Building a Flask app

As a minimum viable product, I created a Flask app where existing users can enter their user ID, and the appropriate recommender system is invoked based on their CLV.

Home page
Existing customers can select the product categories they are interested in and their user ID
Recommended categories based on User 64’s interest in furniture_decor
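The app's source lives in the GitHub repository linked below; as an illustration only, the routing logic might look roughly like this (every name here is hypothetical):

```python
from flask import Flask, render_template, request

app = Flask(__name__)

@app.route("/recommend", methods=["POST"])
def recommend():
    user_id = int(request.form["user_id"])
    segment = clv_segments.get(user_id)      # hypothetical {user_id: segment} lookup
    if segment is None:
        # New user: fall back to the popularity-based model.
        recs = popular_categories[:10]        # hypothetical precomputed list
    else:
        # Existing user: use the ALS model trained for their CLV segment.
        recs = als_recommendations[segment][user_id][:10]  # hypothetical lookup
    return render_template("results.html", recommendations=recs)
```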

Future Work

  1. The precision and recall scores seem too good to be true. Instead of the current random 80-20 train-test split, I could order the observations chronologically, use a blocked time series split (see the sketch after this list), and evaluate the model again.
  2. Develop rules to transition a new user from the popularity-based model to the collaborative filtering system. After a few purchases, I can infer his or her CLV, determine whether he or she is a high, mid or low-value customer, and recommend with the appropriate model.
  3. Explore whether a collaborative recommendation system based on an Autoencoder or on reinforcement learning leads to better performance. I could start with a simple Autoencoder that takes the user-product matrix as input and see how closely the model can reproduce the matrix, then try a Deep Autoencoder whose additional hidden layers learn more complex underlying patterns in the data. This paper on reinforcement learning proposed a Deep Q-Learning based recommendation framework which can model future reward explicitly.
  4. To mitigate the user cold-start problem of the popularity-based recommender, which offers no personalization because it only recommends items currently in trend, I could explore how Multi-Armed Bandit algorithms explore and exploit optimal recommendations for new users.
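A minimal sketch of the blocked, chronological split mentioned in item 1 (illustrative only; the evaluation code isn't in the post):

```python
# Sort purchases chronologically, then cut the data into disjoint time blocks;
# within each block, the model trains on earlier rows and tests on later ones,
# so it is never evaluated on transactions that precede its training data.
df = df.sort_values("purchase_timestamp").reset_index(drop=True)

n_splits, train_frac = 5, 0.8
fold = len(df) // n_splits
for i in range(n_splits):
    block = df.iloc[i * fold : (i + 1) * fold]        # one contiguous time block
    cut = int(len(block) * train_frac)
    train, test = block.iloc[:cut], block.iloc[cut:]  # train strictly precedes test
    # ... fit the ALS model on `train`, evaluate RMSE / precision@k on `test` ...
```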

Thank you very much for reading :) If you enjoyed it, hit the applause button below; it would mean a lot to me and it would help others to see the story. Let me know what you think by reaching out on LinkedIn.

Feel free to check out:

Github Repository of this post

My Other Medium Posts
