Personalized Restaurant Ranking with a Two-Tower Embedding Variation

Towards Data Science | 🔬 Research
#review #personalization #restaurants #ranking #embeddings #recommendation
Original source: Towards Data Science · Summarized and analyzed by Genesis Park

Summary

To overcome the limitations of conventional popularity-based ranking, this article introduces a personalized restaurant recommendation system built on a lightweight, efficient two-tower embedding model. By reflecting each user's individual preferences, the approach surfaces restaurants that are less popular overall but likely to interest that user, and the model's small footprint makes real-time recommendation feasible.

Body

At the time, this widget was significantly underperforming compared to other widgets on the discovery (main) screen. The final selection was ranked by general popularity, without taking any personalized signals into account. We discovered that users are reluctant to scroll: if they don't find something interesting within the first 10 to 12 positions, they usually do not convert. On the other hand, selections can be massive, in some cases up to 1,500 restaurants. On top of that, a single restaurant can appear in several selections. McDonald's, for example, can be selected for both Burgers and Ice Cream, but its popularity is only valid for the first selection, and a general popularity sort would put it on top in both.

The product setup makes the problem even less friendly to static solutions such as general popularity sorting. These collections are dynamic and change frequently due to seasonal campaigns, operational needs, or new business initiatives. Training a dedicated model for each individual selection is therefore not realistic: a useful recommender has to generalize to new tag-based collections from day one.

Before moving to a two-tower-style solution, we tried simpler approaches such as localized popularity ranking at the city-district level and multi-armed bandits. In our case, neither delivered a measurable uplift over a general popularity sort. As part of our research initiative, we tried to adapt Uber's two-tower embeddings (TTE) to our case.

Two-Tower Embeddings Recap

A two-tower model learns two encoders in parallel: one for the user side and one for the restaurant side. Each tower produces a vector in a shared latent space, and relevance is estimated from a similarity score, usually a dot product. The operational advantage is decoupling: restaurant embeddings can be precomputed offline, while the user embedding is generated online at request time.
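The decoupling described above can be sketched in a few lines. This is an illustrative toy, not the article's implementation: the embedding dimension, catalog size, and random vectors standing in for the tower outputs are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 32  # assumed latent dimension

# Toy stand-ins for the two tower outputs; in the real system these come
# from neural encoders over restaurant and user/session features.
restaurant_embs = rng.normal(size=(1500, EMB_DIM))  # precomputed offline, whole catalog
user_emb = rng.normal(size=(EMB_DIM,))              # computed online, per request

# Relevance = dot product in the shared latent space: a single
# matrix-vector product scores the entire 1,500-restaurant selection.
scores = restaurant_embs @ user_emb                 # shape (1500,)
top_k = np.argsort(-scores)[:12]                    # the first screen of results
```

Because the heavy restaurant side is fixed at request time, online work reduces to one user-tower pass plus a matrix-vector product, which is what keeps latency low.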
This makes the approach attractive for systems that need fast scoring and reusable representations. Uber's write-up focused mainly on retrieval, but it also noted that the same architecture can serve as a final ranking layer when candidate generation is already handled elsewhere and latency must remain low. That second formulation was much closer to our use case.

Our Approach

We kept the two-tower structure but simplified the most resource-heavy parts. On the restaurant side, we did not fine-tune a language model inside the recommender. Instead, we reused a TinyBERT model that had already been fine-tuned for search in the app and treated it as a frozen semantic encoder. Its text embedding was combined with explicit restaurant features such as price, ratings, and recent performance signals, plus a small trainable restaurant-ID embedding, and then projected into the final restaurant vector. This gave us semantic coverage without paying the full cost of end-to-end language-model training. For a POC or MVP, a small frozen sentence transformer would be a reasonable starting point as well.

We avoided learning a dedicated user-ID embedding and instead represented each user on the fly through their previous interactions. The user vector was built from averaged embeddings of restaurants the customer had ordered from (Uber's post mentions this source as well, but the authors do not specify how it was used), together with user and session features. We also used views without orders as a weak negative signal. That mattered when order history was sparse or irrelevant to the current selection: if the model could not clearly infer what the user liked, it still helped to know which restaurants had already been explored and rejected.

The most important modeling choice was filtering that history by the tag of the current selection. Averaging the whole order history created too much noise.
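The tag-filtered averaging can be sketched as follows. The history entries, tags, and fallback behavior here are assumptions for illustration; the article only states that past interactions were filtered to matching tags before averaging.

```python
import numpy as np

EMB_DIM = 8  # assumed embedding dimension

# Hypothetical order history: (restaurant embedding, tags of that restaurant).
history = [
    (np.full(EMB_DIM, 1.0), {"burgers"}),
    (np.full(EMB_DIM, 2.0), {"burgers"}),
    (np.full(EMB_DIM, -5.0), {"ice-cream"}),
]

def user_vector(history, selection_tag):
    """Average only past orders whose tags match the current selection,
    making the representation contextual; fall back to the full history
    (a global average) when nothing matches."""
    matching = [emb for emb, tags in history if selection_tag in tags]
    pool = matching if matching else [emb for emb, _ in history]
    return np.mean(pool, axis=0)

burger_vec = user_vector(history, "burgers")    # averages the two burger orders
ice_vec = user_vector(history, "ice-cream")     # uses only the ice-cream order
```

The same user thus gets a different vector in a Burgers selection than in an Ice Cream selection, which is exactly the shift from long-term taste to current intent described below.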
If a customer mostly ordered burgers and then opened an Ice Cream selection, a global average could pull the model toward burger places that happened to sell desserts rather than toward the strongest ice cream candidates. By filtering past interactions to matching tags before averaging, we made the user representation contextual instead of global. In practice, this was the difference between modeling long-term taste and modeling current intent.

Finally, we trained the model at the session level and used multi-task learning. The same restaurant could be positive in one session and negative in another, depending on the user's current intent. The ranking head predicted click, add-to-basket, and order jointly, with a simple funnel constraint: P(order) ≤ P(add-to-basket) ≤ P(click). This made the model less static and improved ranking quality compared with optimizing a single target in isolation. Offline validation was also stricter than a random split: evaluation used out-of-time data and users unseen during training, which made the setup closer to production behavior.

Outcomes

According to A/B tests the final syste
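The article does not spell out how the funnel constraint was enforced. One common way to guarantee P(order) ≤ P(add-to-basket) ≤ P(click) is to chain conditional probabilities, so the ordering holds by construction; the sketch below illustrates that idea with hypothetical function names and logits.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def funnel_probs(click_logit, atb_logit, order_logit):
    """Each later funnel stage is modeled as a conditional probability and
    multiplied down the chain. Since every sigmoid factor lies in (0, 1),
    P(order) <= P(add-to-basket) <= P(click) always holds."""
    p_click = sigmoid(click_logit)
    p_atb = p_click * sigmoid(atb_logit)      # P(atb)   = P(click) * P(atb | click)
    p_order = p_atb * sigmoid(order_logit)    # P(order) = P(atb)   * P(order | atb)
    return p_click, p_atb, p_order

p_click, p_atb, p_order = funnel_probs(1.2, 0.3, -0.5)
```

Each head can then be trained against its own label with a standard binary loss, while the multiplication keeps the three predictions mutually consistent.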

This analysis was produced by the Genesis Park editorial team with the help of AI. The original article is available via the source link.
