Affinity Analysis and Product Recommendations

Introduction

In a recent collaboration between JBS and Wine Access, we faced the challenge of finding a simple and reliable way to provide recommendations. Wine Access contacted JBS looking for a way to enhance its ‘top-offs’ recommendation system. They needed to generate wine recommendations based on user behavior, specifically which wines customers had previously added to their baskets or orders.

Top-off Recommendations

Top-offs are wines that Wine Access recommends to users once they reach the free-shipping threshold, adding high-quality wines to their order without increasing the shipping costs. Wines are meticulously packed in cases specially designed for wine, keeping the wine at the perfect temperature and conditions during shipping, utilizing ice packs and sleeves as required. When customers qualify for free shipping and still have space in their order’s shipping containers, including the additional wines without additional shipping cost is a win-win for the customer and Wine Access.

Top-offs were a large source of revenue years ago, but the recommendation engine had been neglected for several months. The existing system required too much manual intervention and provided poor results, based on randomized recommendations limited only by price range. JBS developed a plan to generate an improved recommendation system, making top-off wines more relevant to the user and increasing sales and revenue for Wine Access.

Market Basket Analysis

After analyzing the existing data and considering available data analysis techniques (such as Regression, Neural Networks, etc.), we decided to implement a Market Basket Analysis - a particular application of Affinity Analysis for the purchase of products. Affinity Analysis is a “technique that discovers co-occurrence relationships among activities performed by (or recorded about) specific individuals or groups.” (Wikipedia)

In this case, the activity performed by the individual is the purchase of a particular wine. Market Basket Analysis applies a relatively simple mathematical/statistical model to cart and basket data. It has the great advantage of using clear, simple concepts that are easily communicated to different members of a team (development, marketing, etc.).

Overview

Varietal Categories

The world of wine involves an overwhelming number of unique grape varieties and growing regions. In order to make the set of varieties manageable for data analysis, Wine Access developed Wine Varietal Categories. These categories group wines by similar combinations of grape variety and region in order to reduce complexity. Popular Varietal Categories include Argentinian Malbec, Bordeaux Red Wine, French Red Wine, Napa Valley Chardonnay, and Sonoma County Cabernet Sauvignon, among others.

With an average of 115,000 orders per year in the last 3 years and over 10 years of order history, Wine Access can analyze hundreds of thousands of historical order records. We used Varietal Categories to assist in the process of data analysis, reducing the scope by providing a size-controlled set of data. For our analysis, we wanted to find associations between Varietal Categories. The analysis answers business questions like “if a customer is buying an Argentinian Malbec wine, is this person also likely to purchase a Napa Valley Cabernet Sauvignon?”

Affinity Analysis

Our implementation is oriented towards providing meaningful top-off recommendations by finding positive relationships between different Wine Varietal Categories.

In order to better understand the methodology, it is valuable to review the most important concepts of the data analysis.

Association Rule

  • A relationship between 2 items (wine Varietal Categories, in our case), composed of 2 elements: an antecedent and a consequent.
  • The association rule is denoted {X} → {Y}, where X is the antecedent and Y is the consequent. It reads as “customers who purchase X also purchase Y”.
  • The goal of the analysis is to find Association Rules that occur frequently enough to indicate significant relationships among items.

Support

  • For a given association rule, the support is the relative frequency of the rule: the fraction of all transactions in which the antecedent and the consequent appear together.
  • After analyzing the data, we decided to keep the support relatively small, to avoid skipping hidden (infrequent) but significant relationships.
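
As a minimal sketch of the definition, support can be computed by counting baskets. The baskets below are made-up toy data, not Wine Access records, and the `support` helper is a hypothetical name used only for illustration.

```python
# Toy baskets: each set holds the Varietal Categories found in one order.
# This is made-up data for illustration only.
baskets = [
    {"Argentinian Malbec", "Napa Valley Chardonnay"},
    {"Argentinian Malbec", "Bordeaux Red Wine"},
    {"Argentinian Malbec", "Napa Valley Chardonnay", "French Red Wine"},
    {"French Red Wine"},
    {"Bordeaux Red Wine", "French Red Wine"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


# Support of {Argentinian Malbec} -> {Napa Valley Chardonnay}:
# both categories appear together in 2 of the 5 baskets.
rule_support = support({"Argentinian Malbec", "Napa Valley Chardonnay"}, baskets)
print(rule_support)  # 0.4
```

A low minimum support keeps rules like this one even when the pair shows up in only a small share of orders.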

Lift

  • Lift is the ratio of the observed support to that expected if the two items were independent (see Wikipedia).
  • This is one of the most important values we want to measure in the analysis: a value very close to 1 means that, for an association rule {X} → {Y}, the 2 items are basically independent of each other.
  • A value higher than 1 indicates a positive relationship, that is, item Y is more likely to be purchased by someone who has purchased X.
  • On the other hand, a value less than 1 indicates a negative relationship and item Y is less likely to be purchased by someone who has purchased X.
  • In our implementation, we wanted to find positive relationships among wine Varietal Categories and limited the results to the ones with lift values higher than 1.
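
The following sketch shows lift computed directly from its definition, support(X and Y) divided by support(X) × support(Y). The baskets are made-up toy data and the helper names `support` and `lift` are hypothetical.

```python
# Toy baskets of Varietal Categories (made-up data for illustration only).
baskets = [
    {"Argentinian Malbec", "Napa Valley Chardonnay"},
    {"Argentinian Malbec", "Bordeaux Red Wine"},
    {"Argentinian Malbec", "Napa Valley Chardonnay", "French Red Wine"},
    {"French Red Wine"},
    {"Bordeaux Red Wine", "French Red Wine"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def lift(x, y, baskets):
    """Observed joint support divided by the support expected if x and y were independent."""
    expected = support({x}, baskets) * support({y}, baskets)
    return support({x, y}, baskets) / expected


# Malbec and Chardonnay co-occur more often than independence predicts (lift > 1).
positive = lift("Argentinian Malbec", "Napa Valley Chardonnay", baskets)

# Malbec and French Red co-occur less often than independence predicts (lift < 1).
negative = lift("Argentinian Malbec", "French Red Wine", baskets)

print(positive > 1, negative < 1)  # True True
```

Lift is symmetric: swapping antecedent and consequent gives the same value, which is why it is usually read together with confidence.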

Confidence

  • A measure of how likely the consequent is to be purchased when the antecedent is purchased.
  • A confidence value of 0.65 for the association rule {Napa Valley Cabernet Sauvignon} → {French Red Wine} would mean that 65% of the time, a customer who purchased a Napa Valley Cabernet Sauvignon wine also purchased a French Red Wine.
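
A small sketch of confidence on made-up toy baskets (the `confidence` helper is a hypothetical name). It also illustrates that confidence is directional: {X} → {Y} and {Y} → {X} can have different values.

```python
# Toy baskets of Varietal Categories (made-up data for illustration only).
baskets = [
    {"Argentinian Malbec", "Napa Valley Chardonnay"},
    {"Argentinian Malbec", "Bordeaux Red Wine"},
    {"Argentinian Malbec", "Napa Valley Chardonnay", "French Red Wine"},
    {"French Red Wine"},
    {"Bordeaux Red Wine", "French Red Wine"},
]


def confidence(x, y, baskets):
    """Share of the baskets containing x that also contain y."""
    with_x = [basket for basket in baskets if x in basket]
    return sum(y in basket for basket in with_x) / len(with_x)


# 2 of the 3 Malbec baskets also contain Chardonnay.
malbec_to_chardonnay = confidence("Argentinian Malbec", "Napa Valley Chardonnay", baskets)

# Both Chardonnay baskets also contain Malbec.
chardonnay_to_malbec = confidence("Napa Valley Chardonnay", "Argentinian Malbec", baskets)

print(malbec_to_chardonnay, chardonnay_to_malbec)
```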

Implementation

Our model was implemented in Python and incorporated into the Wine Access Django framework. We used the apyori library to implement the apriori algorithm. This generated the association rules used in the Affinity Analysis. Our data centered on the transactional history of purchases for Wine Access, transformed to focus on Varietal Categories in the transactions.
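
To illustrate the kind of output this step produces, here is a hand-rolled sketch that converts orders into Varietal Category baskets and derives pairwise rules with support, confidence, and lift. It is not the apyori-based production code: it only handles rules between two categories, and the wines, orders, function names, and thresholds are all made up for the example.

```python
from collections import Counter
from itertools import permutations

# Hypothetical lookup from wine to Varietal Category (made-up wines).
wine_to_category = {
    "wine_a": "Argentinian Malbec",
    "wine_b": "Argentinian Malbec",
    "wine_c": "Napa Valley Chardonnay",
    "wine_d": "French Red Wine",
    "wine_e": "Bordeaux Red Wine",
}

# Hypothetical order history: each order is a list of purchased wines.
orders = [
    ["wine_a", "wine_c"],
    ["wine_b", "wine_e"],
    ["wine_a", "wine_b", "wine_c", "wine_d"],
    ["wine_d"],
    ["wine_e", "wine_d"],
]


def to_baskets(orders, wine_to_category):
    """Replace each wine by its Varietal Category; duplicates collapse into a set."""
    return [{wine_to_category[wine] for wine in order} for order in orders]


def pairwise_rules(baskets, min_support=0.1, min_lift=1.0):
    """Return rules {X} -> {Y} meeting the support threshold with lift above min_lift."""
    n = len(baskets)
    item_count = Counter(item for basket in baskets for item in basket)
    # Ordered pairs, so both X -> Y and Y -> X are counted.
    pair_count = Counter(
        pair for basket in baskets for pair in permutations(sorted(basket), 2)
    )
    rules = []
    for (x, y), count in pair_count.items():
        support = count / n
        confidence = count / item_count[x]
        lift = confidence / (item_count[y] / n)
        if support >= min_support and lift > min_lift:
            rules.append({"antecedent": x, "consequent": y, "support": support,
                          "confidence": confidence, "lift": lift})
    return sorted(rules, key=lambda rule: rule["lift"], reverse=True)


rules = pairwise_rules(to_baskets(orders, wine_to_category))
for rule in rules:
    print(rule["antecedent"], "->", rule["consequent"], round(rule["lift"], 2))
```

On this toy data only the two Malbec/Chardonnay rules survive the lift filter; every other pair co-occurs less often than independence would predict.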

With Varietal Category associations available in the framework, we could then generate top-off recommendations in real time during checkout, after purchase, and in general wine views. We use the most meaningful associations in terms of lift, support, and confidence to optimize top-off relevance. The overall steps of the recommendation process are as follows.

  1. Given a set of wines in which we know the user is interested, find the set of Varietal Categories associated with these wines.
  2. Find the most significant Association Rules having as antecedent an element in the Varietal Categories of interest. Use these association rules to generate the set of consequents. For each Association Rule of the form {X} → {Y} that we are considering, X is a Varietal Category of interest and therefore we take the consequent Y.
  3. Use the set of consequent Varietal Categories to find available wines that belong to these categories and can be offered to the customer. This set of wines is the base for the recommendations.
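
The three steps above can be sketched as a single function. This is an illustrative outline, not the production Django code: the rule dictionaries, the `recommend` function, the `max_rules` cutoff, ranking by lift, and the choice to skip wines the user already has are all assumptions made for the example.

```python
def recommend(wines_of_interest, rules, wine_to_category, available_wines, max_rules=5):
    """Return available wines whose category is a consequent of the strongest matching rules."""
    # Step 1: Varietal Categories of the wines the user is interested in.
    categories = {wine_to_category[wine] for wine in wines_of_interest}

    # Step 2: rules whose antecedent is a category of interest, strongest lift first.
    matching = [rule for rule in rules if rule["antecedent"] in categories]
    matching.sort(key=lambda rule: rule["lift"], reverse=True)
    consequents = {rule["consequent"] for rule in matching[:max_rules]}

    # Step 3: available wines in the consequent categories
    # (assumption: do not re-recommend wines the user already has).
    return [
        wine for wine in available_wines
        if wine_to_category[wine] in consequents and wine not in wines_of_interest
    ]


# Made-up data for illustration only.
wine_to_category = {
    "wine_a": "Argentinian Malbec",
    "wine_b": "Argentinian Malbec",
    "wine_c": "Napa Valley Chardonnay",
    "wine_d": "French Red Wine",
    "wine_e": "Bordeaux Red Wine",
}
rules = [
    {"antecedent": "Argentinian Malbec", "consequent": "Napa Valley Chardonnay", "lift": 1.67},
    {"antecedent": "Argentinian Malbec", "consequent": "Bordeaux Red Wine", "lift": 1.2},
    {"antecedent": "French Red Wine", "consequent": "Bordeaux Red Wine", "lift": 1.1},
]

top_offs = recommend(["wine_a"], rules, wine_to_category,
                     available_wines=["wine_b", "wine_c", "wine_d", "wine_e"])
print(top_offs)  # ['wine_c', 'wine_e']
```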

In order to measure the impact of our analysis on the top-off recommendation system, we implemented an A/B testing strategy. After 2 months of data collection, we saw a noticeable difference between the new recommendations and the control group. As a result, we decided to update the recommendation system to be based on the Affinity Analysis solution.
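
One common way to check whether a difference in conversion rate between a control group and a variant is more than noise is a two-proportion z-test. The sketch below is a generic illustration of that technique, not a description of the analysis actually run for Wine Access, and the visitor and conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control converts 150 of 5000, variant converts 250 of 5000.
z, p_value = two_proportion_z_test(150, 5000, 250, 5000)
print(round(z, 2), p_value < 0.05)
```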

We also implemented a version of our analysis that additionally considered the price tiers of the wines but, based on our testing results, decided to keep the classification by Varietal Category alone.

Results and Impact

Evaluation and implementation of the model spanned about 4 months. After 2 months with the updated top-off recommendations in place, there was a significant improvement in the performance of the top-off recommendations program.

  • The conversion rate for Wine Access Top-Offs increased from 2.88% (early 2020) to 5.11% (May 2020) using the Affinity Analysis.
  • The average revenue increased from $17,306 at the beginning of 2020 to $42,520 in April 2020 and $46,636 in May 2020.
  • This translates to a forecasted yearly revenue increase of about $325,000.
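
The text does not say how the yearly forecast was derived. One reading that is consistent with the figures, assuming the revenue numbers are monthly averages (an assumption, not something stated above), is to annualize the uplift of the April and May average over the early-2020 baseline:

```python
# Figures quoted above; treating them as monthly values is an assumption.
baseline = 17306             # beginning of 2020
april, may = 42520, 46636    # after the new recommendations

monthly_uplift = (april + may) / 2 - baseline
yearly_uplift = monthly_uplift * 12
print(yearly_uplift)  # 327264.0
```

Under that assumption the result lands close to the quoted figure of about $325,000.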

Based on the performance of the Top-Off recommendations, we also added recommendations to Wine description pages. This provides suggestions in the form, "If you liked this, you'll love these ones...". Evaluation of this latest change is still in progress, but this effort makes the site more cohesive in how it leverages the market basket analysis results.

Overall, the market basket recommendation system has proven to be a viable, fast, and effective solution, derived from long-term transaction history and delivering increased revenue and sell-through.