Building a multi-stage recommendation system (part 1.2)
Implementation of the two-tower model and its application to H&M data
This blog post is the follow-up to part 1.1, where we explained the two-stage recommendation process with a special emphasis on the candidate generation step. I encourage you to read that article first if you haven't already. We described the two-tower model in depth there, and we are now going to implement it in TensorFlow 2 and apply it to a Kaggle dataset.
H&M Kaggle Competition
H&M issued the Personalized Fashion Recommendations challenge 3 months ago. The data consisted of 3 main sources:
- articles.csv, which contains the metadata of the articles available for purchase
- customers.csv, which contains the metadata of the customers
- transactions_train.csv, which contains the purchases made by customers during the last 2 years
Alongside this data, they also published images of the articles, which we are not going to use in this blog post.
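To make the three tabular sources concrete, here is a minimal sketch of how they could be loaded with pandas. It is not necessarily the loading code used later in this series. It assumes the files sit in a local directory under the names used in the Kaggle download (articles.csv, customers.csv, transactions_train.csv) and that they expose the columns article_id, customer_id and t_dat (the transaction date); the helper name load_hm_data is hypothetical.

```python
from pathlib import Path

import pandas as pd


def load_hm_data(data_dir):
    """Load the three tabular sources from a local copy of the competition data.

    Hypothetical helper: the file and column names below are assumptions
    about the Kaggle download, not something defined in this post.
    """
    data_dir = Path(data_dir)

    # Read identifiers as strings so that leading zeros in article_id survive.
    articles = pd.read_csv(data_dir / "articles.csv", dtype={"article_id": str})
    customers = pd.read_csv(data_dir / "customers.csv", dtype={"customer_id": str})

    # Parse the transaction date so we can later split the data by time.
    transactions = pd.read_csv(
        data_dir / "transactions_train.csv",
        dtype={"article_id": str, "customer_id": str},
        parse_dates=["t_dat"],
    )
    return articles, customers, transactions
```

Calling load_hm_data with the folder that holds the unzipped files returns three DataFrames. Keeping the identifiers as strings is a deliberate choice here: read as integers, article ids would lose their leading zero and no longer match the ids expected in a submission.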
The goal of the competition was to leverage the available data to implement a recommendation process and predict, as accurately as possible, the purchases of H&M online customers during the 7 days following the training data period. Competitors were evaluated on MAP@12 (Mean Average Precision at 12), but we are going to assess the value of our candidate generator using the…