Distributed Machine Learning in Yahoo Sponsored Search

Sponsored search consists of retrieving the "best" advertisements that match a given search query. The term "best" in this context has to take into account several dimensions, such as relevance, revenue and post-click quality. Unfortunately, these dimensions are usually poorly correlated. Retrieving an ad in response to a query is therefore a much more difficult problem than retrieving a web search result, where relevance is the dominant dimension. In this talk we give an overview of how machine learning is used at Yahoo to improve our sponsored search. The term "distributed" in this talk has a double meaning. First, we show how distributed representations help to solve the problem of retrieving ads in response to a query by coupling relevance filtering and ad retrieval, taking into account the content and the context involved, with click prediction. Second, we give an overview of a distributed architecture that allows us to compute these distributed representations much faster. This talk includes the work of many people at Yahoo Labs and Yahoo Platforms.