In Unintended Consequences of Targeting – Part I, I discussed the unintended consequences of Targeting, in particular how it can decrease the information offered to users and the likelihood that they will find new products. I concluded that whenever users are recommended an item (book, movie, etc.) by some automated system or, alternatively, shown an ad in the search results page of a search engine, they would be best served if they were shown not only things which are an obvious extrapolation from their past habits but also things that are new or surprising. By providing more of the latter, the chances of discovery by serendipity increase. I turn now to a review of some of the approaches that have been advanced to increase these chances.
Increasing the Chances of Discovery by Serendipity
There is a growing literature on recommendation systems (and related topics) which has addressed the difficulty referred to above. The difficulty is presented as one of striking an adequate balance between the accuracy and the diversity of recommendations. Until recently, this literature suggests, research on recommendation systems has focused almost exclusively on accuracy, which led to systems that were likely to recommend only popular items, and hence suffered from a “popularity bias” (Celma and Herrera 2008). Nonetheless, despite the skepticism of some authors (e.g., Mooney and Roy 2000; Fleder and Hosanagar 2007), who argued that recommendation systems tend to reinforce the position of already popular products and thus reduce diversity, there have been (more or less successful) attempts to solve the problem of what might be called stickiness (or, more technically, path-dependence), for example, by Adomavicius and Kwon (2008; 2009), Celma and Herrera (2008), Zhang and Hurley (2008), Fleder and Hosanagar (2009), and Skoutas et al. (2010).
Adomavicius and Kwon (2008; 2009) argued that, to be useful, a recommendation system should provide the user with idiosyncratic items. The authors noted that “high accuracy may often be obtained by safely recommending to users the most popular items, which can lead to the reduction in diversity” (Adomavicius and Kwon 2008:1). To address this difficulty (i.e., the “reduction in diversity”) Adomavicius and Kwon (2008) suggested a neighborhood-based collaborative filtering approach, calculating the “variance of each predicted rating” from the “ratings of neighbors that were directly involved in the prediction of that rating.” By contrast, previous attempts at calculating the variance of ratings often took all known ratings of an item into account (e.g., Adomavicius et al. 2007).
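To make the idea concrete, here is a minimal sketch of a neighborhood-based prediction that also reports the variance of the ratings given by the neighbors used for that prediction. It is an illustration under stated assumptions, not the authors' implementation: the toy ratings, the cosine similarity, the neighborhood size k, and the function names are all hypothetical choices.

```python
from math import sqrt
from statistics import pvariance

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 3, "c": 5, "d": 2},
    "cat": {"a": 5, "b": 2, "d": 5},
    "dan": {"a": 1, "b": 5, "d": 4},
}


def cosine(u, v):
    """Cosine similarity computed over the items both users rated."""
    common = set(RATINGS[u]) & set(RATINGS[v])
    if not common:
        return 0.0
    dot = sum(RATINGS[u][i] * RATINGS[v][i] for i in common)
    norm_u = sqrt(sum(RATINGS[u][i] ** 2 for i in common))
    norm_v = sqrt(sum(RATINGS[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict_with_variance(user, item, k=2):
    """Return (predicted rating, variance of the k neighbors' ratings)."""
    # Only users who actually rated the item can act as neighbors.
    candidates = [v for v in RATINGS if v != user and item in RATINGS[v]]
    neighbors = sorted(candidates, key=lambda v: cosine(user, v), reverse=True)[:k]
    if not neighbors:
        return None, None
    weights = [cosine(user, v) for v in neighbors]
    ratings = [RATINGS[v][item] for v in neighbors]
    total = sum(weights)
    if total == 0:
        # No overlap with any neighbor: fall back to a plain average.
        prediction = sum(ratings) / len(ratings)
    else:
        prediction = sum(w * r for w, r in zip(weights, ratings)) / total
    # The variance uses only the neighbors involved in this prediction,
    # not every known rating of the item.
    return prediction, pvariance(ratings)
```

Calling predict_with_variance("ann", "d") uses only the two users most similar to ann, so the variance reflects disagreement within that neighborhood rather than across everyone who rated the item.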
In later (and promising) work, Adomavicius and Kwon (2009) reiterated their view that, despite the widespread tendency to focus exclusively on accuracy, other aspects of recommendation quality are important, in particular the diversity of recommendations. The authors then introduced a number of “item re-ranking methods that can generate substantially more diverse recommendations across all users while maintaining comparable levels of recommendation accuracy.” The authors suggested that recommending the least popular items to a user would increase the diversity considerably, but it would also decrease the accuracy to (possibly) unacceptable levels. To solve this problem, they introduced a parametrized ranking approach, which allowed the user to choose a certain level of recommendation accuracy. Given any ranking function, rankx(i), and a ranking threshold, TR ∈ [TH, Tmax], the parametrized version of the function is (writing R*(u, i) for the rating predicted for user u and item i):

$$
\mathrm{rank}_x(i, T_R) =
\begin{cases}
\mathrm{rank}_x(i), & \text{if } R^*(u, i) \in [T_R, T_{\max}] \\
\alpha_u + \mathrm{rank}_{\mathrm{standard}}(i), & \text{if } R^*(u, i) \in [T_H, T_R)
\end{cases}
$$

where α_u is the largest value of rankx(i) among the items predicted at or above TR, so that every item below the threshold is ranked after all items above it.
The items are ranked according to rankx(i) if they are predicted to be above the ranking threshold, and according to rankstandard(i) (a ranking function that ranks items in decreasing probability of relevance) if they are predicted to be below the threshold. Adomavicius and Kwon tested their parametrized approach using different ranking functions (e.g., ranking by how many users liked the item and by the percentage of the users who liked an item) and found that “consistently with the accuracy-diversity trade-off, all the proposed ranking approaches improved the diversity of recommendations by sacrificing the accuracy. However, with each ranking approach, as the ranking threshold TR increases, the accuracy loss is significantly minimized while still exhibiting substantial diversity improvement. Therefore, with different ranking thresholds, one can obtain different diversity gains for different levels of tolerable precision loss, as compared to the standard ranking approach” (2009:4).
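The thresholded ranking just described can be sketched in a few lines. This is a hedged illustration rather than the authors' code: the Candidate type, the sample data, and the default value of t_h are hypothetical; rankx is assumed to be "least popular first", rankstandard is approximated by "highest predicted rating first", and TH is interpreted as the minimum predicted rating for an item to be recommended at all.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    item: str
    predicted_rating: float  # predicted rating for one user
    popularity: int          # e.g., number of users who liked the item


def parametrized_rank(candidates, t_r, t_h=3.5):
    """Order items for top-N selection using a ranking threshold t_r."""
    # Items predicted below t_h are not recommended at all (assumption).
    eligible = [c for c in candidates if c.predicted_rating >= t_h]
    above = [c for c in eligible if c.predicted_rating >= t_r]
    below = [c for c in eligible if c.predicted_rating < t_r]
    # Assumed rank_x: least popular items first, to favor diversity.
    above.sort(key=lambda c: c.popularity)
    # Stand-in for rank_standard: highest predicted rating first.
    below.sort(key=lambda c: c.predicted_rating, reverse=True)
    # Concatenation places every below-threshold item after the others.
    return [c.item for c in above + below]


# Hypothetical sample data for one user.
CANDIDATES = [
    Candidate("A", 4.9, 900),
    Candidate("B", 4.6, 40),
    Candidate("C", 4.2, 5),
    Candidate("D", 3.8, 300),
    Candidate("E", 3.0, 1),
]

print(parametrized_rank(CANDIDATES, t_r=3.5))  # popularity-driven order
print(parametrized_rank(CANDIDATES, t_r=4.5))  # mixed order
print(parametrized_rank(CANDIDATES, t_r=5.0))  # standard order
```

Raising t_r moves the list from a purely popularity-driven order toward the standard order, which is the dial between diversity and accuracy that the quoted passage describes.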
Zhang and Hurley (2008:123) offered a number of objective functions that take into account the importance of introducing diversity into choice. The authors opened their article with the following insightful description of the difficulties related to recommendation systems: “The primary premise upon which top-N recommender systems operate is that similar users are likely to have similar tastes with regard to their product choices. For this reason, recommender algorithms depend deeply on similarity metrics to build the recommendation lists for end-users. However, it has been noted that the products offered on recommendation lists are often too similar to each other and attention has been paid toward the goal of improving diversity to avoid monotonous recommendations.” One approach to this goal was that of Celma and Herrera (2008:180), who suggested that “there is a need in designing evaluation metrics to deal with the effectiveness of novel recommendations, not only measuring prediction accuracy, but taking into account other aspects such as usefulness and quality.” The authors developed “novelty metrics” which looked at how well a recommendation system made a user aware of “previously unknown items” and to what extent the user accepted the new recommendations. Zhang and Hurley approached the goal of improving diversity from a different angle. Suggesting that the goals of recommending a set of items that are both diverse and have high matching value (i.e., match the requirements of the user) stand in opposition to each other, they offered “objective functions that capture the trade-offs between these goals” and showed that “the maximization of these objective functions can be represented as a binary quadratic programming problem.” They evaluated their method on a movies dataset and found that it increased the likelihood of a recommendation system recommending novel items.
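To illustrate what a binary quadratic formulation of this trade-off can look like, the sketch below scores a candidate list by a weighted sum of its matching value and its pairwise dissimilarity, and finds the best list by brute force. The objective, the weight theta, and the toy numbers are assumptions made for illustration; they are not the specific objective functions or the solution method proposed by Zhang and Hurley.

```python
from itertools import combinations

# Hypothetical matching values (how well each item fits the user).
MATCH = [0.9, 0.85, 0.5, 0.4]

# Hypothetical symmetric pairwise dissimilarities between items.
DISSIM = [
    [0.0, 0.1, 0.8, 0.95],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.3],
    [0.95, 0.8, 0.3, 0.0],
]


def objective(subset, match, dissim, theta):
    """Weighted sum of matching value and pairwise diversity.

    With a binary indicator y_i per item, the diversity term is a sum of
    d_ij * y_i * y_j, which is what makes the problem quadratic.
    """
    matching = sum(match[i] for i in subset)
    diversity = sum(dissim[i][j] for i, j in combinations(subset, 2))
    return theta * matching + (1 - theta) * diversity


def best_subset(match, dissim, n, theta):
    """Exhaustively search all n-item lists (feasible only for tiny inputs)."""
    items = range(len(match))
    return max(
        combinations(items, n),
        key=lambda s: objective(s, match, dissim, theta),
    )


print(best_subset(MATCH, DISSIM, n=2, theta=1.0))  # matching only
print(best_subset(MATCH, DISSIM, n=2, theta=0.5))  # balanced
```

With theta set to 1 the two best-matching but nearly identical items win; lowering theta swaps one of them for a dissimilar item. Exhaustive search grows exponentially with the catalogue size, which is why a practical system would need a relaxation or heuristic instead.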
Likewise, Fleder and Hosanagar (2009) attempted to reconcile two seemingly incompatible views, namely, that recommender systems force users into niches and that they help users discover new products. They explored – analytically and through simulation – the extent to which diversity decreases because of path-dependent processes. The authors found, somewhat pessimistically, that “some popular recommenders can lead to a reduction in diversity. Because common recommenders recommend products based on sales or ratings, they cannot recommend products with limited historical data, even if they would be viewed favorably. These recommenders create a rich-get-richer effect for popular products and vice-versa for unpopular ones. Several popular recommenders explicitly discount popular items, in an effort to promote exploration. Even so, we show this step may not be enough to increase diversity” (Fleder and Hosanagar 2009:698). They suggested that “a recommender’s bias toward popular items can prevent what would otherwise be better consumer-product matches” and that “recommender designs that explicitly promote diversity may be more desirable.” In other words, recommendation systems that appropriately discount popularity may increase total sales.
The substantial reduction in the chances of discovery by serendipity is of obvious relevance to web search, and some work has been done in this area. Skoutas et al. (2010:1) noted that the “enormous value of the Web as an information repository lies basically on two constituents, being the amount and the diversity of the information it provides.” However, if the results of a query “are ranked in decreasing order of relevance, with each document being judged independently of other documents,” then diversity is lost sight of. The reason is that the query often returns “documents with high similarity and overlap to each other” which, in turn, causes the user to “become saturated with redundant information, eventually abandoning the query.” And neither the search engine nor the advertiser would like that to happen! To solve the problem, the authors suggested that the top results should provide good coverage of the whole set of relevant results. They still use a similarity measure between documents, but “instead of focusing on the similarity among only those documents in the selected subset of results, [they] are interested in the similarity of the non-selected documents to the selected ones” (Skoutas et al. 2010:2).
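The coverage idea can be illustrated with a simple greedy selection: choose the results so that each document left out is as similar as possible to some document that was kept. The sketch below is one possible reading under stated assumptions; the similarity matrix, the coverage score, and the greedy heuristic are hypothetical and are not claimed to be the algorithm of Skoutas et al.

```python
# Hypothetical pairwise similarities for five relevant documents, listed in
# decreasing order of relevance. Documents 0-2 are near duplicates of one
# another; documents 3-4 cover a different aspect of the query.
SIM = [
    [1.0, 0.9, 0.8, 0.1, 0.1],
    [0.9, 1.0, 0.85, 0.1, 0.1],
    [0.8, 0.85, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.1, 0.9, 1.0],
]


def coverage(selected, sim):
    """Sum, over non-selected documents, of similarity to the closest selected one."""
    others = [d for d in range(len(sim)) if d not in selected]
    return sum(max(sim[d][s] for s in selected) for d in others)


def greedy_cover(sim, k):
    """Pick k documents, each time adding the one that most improves coverage."""
    selected = []
    while len(selected) < k:
        remaining = [d for d in range(len(sim)) if d not in selected]
        best = max(remaining, key=lambda d: coverage(selected + [d], sim))
        selected.append(best)
    return selected


top2 = greedy_cover(SIM, k=2)
print(top2, coverage(top2, SIM))      # one document from each group
print([0, 1], coverage([0, 1], SIM))  # relevance-only top 2, for comparison
```

In this toy example the relevance-only top two are near duplicates and leave the second group of documents unrepresented, whereas the coverage-driven pair includes one document from each group.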
Conclusion
An increasing number of approaches have been offered to increase the diversity of the items recommended to users by recommendation systems. Some of these approaches focused on increasing the variance of the items using different measures (e.g., neighborhood-based), others on using a variety of procedures for ranking the results (e.g., a parametrized ranking function). These approaches have been more or less successful, but alas the user experience remains somewhat limited. One possible explanation is that, despite their ingenuity, these solutions fail to take into account the user experience, or rather, the user's perception of what is new and important and what users expect from these systems. Recommendation systems might provide better ways to let users contribute feedback. Indeed, Amazon tries to get this kind of feedback in its “recommendations” page (it provides a “fix this recommendation” feature). Yet it still fails to recommend truly novel items; one possible explanation is that Amazon still relies too much on past online behavior. The pleasure of finding out something previously unknown, by chance or serendipity, is still denied to users, and more work is needed.