Lightweight representation learning for efficient and scalable recommendation
Olivier Koch, Amine Benhalloum, Guillaume Genthial, Denis Kuzin, Dmitry Parfenchik

Abstract
Over the past decades, recommendation has become a critical component of many online services such as media streaming and e-commerce. Recent advances in algorithms, evaluation methods and datasets have led to continuous improvements of the state of the art. However, much work remains to make these methods scale to the size of the internet. Online advertising offers a unique testbed for recommendation at scale: every day, billions of users interact with millions of products in real time, and systems addressing this scenario must work reliably at that scale. We propose an efficient model (LED, for Lightweight Encoder-Decoder) that reaches a new trade-off between complexity, scale and performance. Specifically, we show that combining large-scale matrix factorization with lightweight embedding fine-tuning unlocks state-of-the-art performance at scale. We further provide a detailed description of the system architecture and demonstrate its operation over two months at internet scale. Our design serves billions of users across hundreds of millions of items in a few milliseconds using standard hardware.
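As a rough illustration of the encoder-decoder idea sketched in the abstract, the snippet below combines a truncated-SVD factorization of a toy interaction matrix (yielding fixed item embeddings) with a small encoder that is fine-tuned to map user histories into the same embedding space; scoring is a dot product against the item embeddings. The shapes, loss, and two-layer encoder are illustrative assumptions, not the architecture described in the paper.

```python
# Minimal sketch of the abstract's idea (illustrative assumptions, not the paper's design):
# (1) large-scale matrix factorization -> fixed item embeddings,
# (2) a lightweight encoder fine-tuned to map user histories into that space,
# (3) decoding/scoring is a dot product against the item embeddings.
import numpy as np
import torch
import torch.nn as nn
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

n_users, n_items, dim = 1000, 5000, 64

# Toy implicit-feedback matrix (users x items); a real system would use interaction logs.
interactions = sparse_random(n_users, n_items, density=0.01, format="csr", random_state=0)
interactions.data[:] = 1.0

# Step 1: truncated SVD of the interaction matrix gives fixed item embeddings.
_, _, vt = svds(interactions, k=dim)
item_emb = torch.tensor(np.ascontiguousarray(vt.T), dtype=torch.float32)  # (n_items, dim)

# Step 2: a lightweight encoder is fine-tuned on top of the frozen item embeddings.
encoder = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

hist = torch.tensor(interactions.toarray(), dtype=torch.float32)           # (n_users, n_items)
user_input = hist @ item_emb / hist.sum(1, keepdim=True).clamp(min=1.0)    # mean of history embeddings

for _ in range(5):  # a few fine-tuning steps for illustration
    user_emb = encoder(user_input)            # (n_users, dim)
    scores = user_emb @ item_emb.T            # decoder: dot product with item embeddings
    loss = nn.functional.binary_cross_entropy_with_logits(scores, hist)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Serving: scoring all items for one user is a single matrix-vector product,
# which is what allows millisecond-latency retrieval at scale.
top20 = torch.topk(encoder(user_input[:1]) @ item_emb.T, k=20).indices
```

Keeping the item embeddings fixed and fine-tuning only a small encoder is what keeps the trainable parameter count low; in a production setting the dot-product scoring would typically be served through an approximate nearest-neighbor index rather than a dense matrix product.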
Benchmarks
| Benchmark | Model | Recall@20 | Recall@50 |
|---|---|---|---|
| collaborative-filtering-on-movielens-20m | LED | 0.375 | 0.516 |