There is not much time left to cast your vote for SxSW 2017. I’d like to ask for your help again to vote for my sessions. Both of them are related to data science.
If you find these topics interesting, please vote for my sessions NOW, because the Panel Picker voting will close Sept 2nd, 2016.
With that, let’s get back to the topic of personalization. As Charles Dickens wrote in A Tale of Two Cities, “It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness.” This reminds me of the beginnings of personalization and the challenges of the early approaches.
Today, we live in a world where access to information is at our fingertips. In this world, relevance, rather than raw impressions, is what drives the consumption of content, because an irrelevant impression is often ignored even when it’s right in front of us. Because of our selective attention, personalization has become very important to modern-day consumers. It aims to create hyper-relevance (i.e. relevance to a single specific person) to minimize the chance of being ignored.
Personalization is all about delivering unique experiences to consumers based on their individual preferences. It is a very broad concept that includes everything from the personalized email you get from LinkedIn about career moves in your network to something more advanced, like the personalized shopping experience in Minority Report. In the digital world, it really comes down to a recommender system, because a recommender system can tailor digital content to a specific user’s interests.
From the discussion in my previous post, recommender systems can be viewed as a form of prescriptive analytics, because the underlying computation they perform is an optimization of some objective (e.g. similarity or relevance to a user’s interests). A recommender system first scores all digital content for its similarity to a particular user’s interests. The top items that maximize the objective are those most similar to the user’s interests, so they are recommended to the user to give them a unique experience.
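To make this score-and-rank idea concrete, here is a minimal sketch of that optimization step. All the numbers and interest dimensions are made up for illustration; real systems represent users and content with far richer features, but the core computation is the same: score every item by its similarity to the user’s interest profile and return the highest-scoring ones.

```python
import numpy as np

def recommend(user_profile, content_vectors, top_n=3):
    """Score every item by cosine similarity to the user's interest
    profile, then return the indices of the top-N items."""
    # Normalize so a plain dot product equals cosine similarity.
    u = user_profile / np.linalg.norm(user_profile)
    c = content_vectors / np.linalg.norm(content_vectors, axis=1, keepdims=True)
    scores = c @ u                      # similarity of each item to the user
    return np.argsort(scores)[::-1][:top_n]

# Toy example: 3 hypothetical interest dimensions (sports, tech, cooking).
user = np.array([0.9, 0.1, 0.0])        # mostly interested in sports
items = np.array([
    [1.0, 0.0, 0.0],                    # item 0: pure sports
    [0.0, 1.0, 0.0],                    # item 1: pure tech
    [0.5, 0.5, 0.0],                    # item 2: mixed sports/tech
])
print(recommend(user, items))           # the sports item ranks first
```

The “objective” here is just cosine similarity; a production system would optimize a richer objective (relevance, engagement, diversity), but the ranking structure is identical.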
There are many different approaches to building a recommender system. Each has its strengths and weaknesses. Today, I’m going to outline some of the early approaches and their challenges. Next time, I will dive deeper to discuss some modern approaches that address these challenges, and we will also examine the novel approach we used to power our personalized community experience.
3 Early Approaches to Personalization
There are 3 common approaches to personalization in the industry. First is the brute-force approach: to get users’ interests and preferences, we simply ask them. There have been many strategies and designs to simplify how a user may self-identify their interests and preferences, some even using incentives and rewards. These attempts usually achieve limited success initially with a small group of extremely motivated individuals, but fail to engage the bulk of the population. One of the biggest challenges with this approach is that users must constantly update their interests as they change over time. Very few do, even among those extremely motivated individuals.
In order to overcome the challenge of people’s changing interests, it was believed that we must take a more adaptive approach and leverage the power of machine learning. So the second common approach to personalization is learning from the user’s own past behaviors.
This is akin to having an imaginary friend, like Cortana (the humanoid AI in Halo), who always goes out shopping with you. By watching what you browse and what you buy enough times, Cortana is able to learn your interests and preferences.
This approach is theoretically sound; we should be able to learn a person’s interests and recommend content similar to what they have consumed in the past. In practice, however, things are always messier. Many learning algorithms require a lot of training data, which means it takes time to collect enough data for the algorithm to make accurate inferences about a user’s preferences. Consequently, this approach typically does not work well initially (i.e. the cold start problem), but improves as we collect more data from the user. The problem is that if it doesn’t work well at the beginning, many users may abandon the platform and stop using it altogether. So this approach only works for frequent users of the platform (typically a minority), for whom we can collect sufficient behavioral data fast enough. It will not work for infrequent users, because the learning would be too slow.
Many techniques have been developed to speed up learning, so even the infrequent users can get a personalized experience. These techniques all involve some form of generalization or extrapolation to overcome the lack of data for a particular individual. In essence these techniques extrapolate from other people’s data and learn from other users’ collective behaviors.
This is the scenario where Cortana has never gone shopping with you, or has not gone with you enough times to learn your interests yet. But if Cortana has gone shopping with many other people similar to you, then she can still infer your interests from those other shoppers.
The highly popular collaborative filter (CF) leverages what it knows about your preferences (e.g. your book purchases, movie ratings, etc.) to find other users who are potentially similar to you, and then uses their behavior (e.g. purchases, ratings, etc.) on new items to estimate your likely preferences. Since these are items you have not purchased or rated yet, there is no past preference data from you. But CF uses other people’s preference data on these new items to estimate your likely preference for them, weighted by those users’ similarity to you.
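The mechanics of user-based CF can be sketched in a few lines. This is a toy illustration with an invented rating matrix, not a production implementation: it finds users similar to you (via cosine similarity over co-rated items) and predicts your missing rating as a similarity-weighted average of their ratings.

```python
import numpy as np

# Toy user-item rating matrix (rows = users, cols = items, 0 = unrated).
R = np.array([
    [5.0, 4.0, 0.0, 1.0],   # you: item 2 is unrated
    [4.0, 5.0, 4.0, 1.0],   # a user with very similar tastes
    [1.0, 1.0, 0.0, 5.0],   # a user with opposite tastes
])

def predict(R, user, item):
    """Predict a missing rating as a similarity-weighted average of
    other users' ratings on that item (user-based collaborative filtering)."""
    def cos(a, b):
        mask = (a > 0) & (b > 0)        # compare only co-rated items
        if not mask.any():
            return 0.0
        return float(a[mask] @ b[mask] /
                     (np.linalg.norm(a[mask]) * np.linalg.norm(b[mask])))
    num = den = 0.0
    for other in range(R.shape[0]):
        if other == user or R[other, item] == 0:
            continue                    # skip yourself and non-raters
        w = cos(R[user], R[other])
        num += w * R[other, item]
        den += abs(w)
    return num / den if den else 0.0

# User 0 never rated item 2; the similar user 1 rated it highly,
# so the prediction for user 0 is pulled toward that rating.
print(predict(R, 0, 2))
```

Note how the prediction for the unrated item comes entirely from other people’s data, which is exactly why CF can say something about items you have never touched, and also why it still needs other users to have rated those items first.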
Challenges of Traditional Collaborative Filters
Traditional CF does alleviate the cold start problem, but it doesn’t solve it. It still requires a sufficient amount of preference data (e.g. purchases, ratings, etc.) on new items from other users in order to predict your preferences on those items accurately. Moreover, CF is built on the assumption that people with similar preferences in the past will continue to exhibit that similarity in the future. The validity of this assumption clearly depends on what data we use and how precisely we compute the similarities between individuals.
Learning from other users’ behaviors works well only if we can truly find others who are exactly like you. But we know that people’s interests and tastes are generally quite unique once we have enough data on them. The apparent similarity we see between users is an artifact of not having enough data on them. For example, if we only look at the gender dimension, then it would appear that about half of the world’s population is exactly like me. If we consider other demographic dimensions (e.g. age, education, income, religion, ethnicity, city of residence, occupation, political orientation, etc.), then fewer and fewer people match me across all those dimensions. And if we further consider behavioral data, social data, and/or other rich sources of big data, then each of us eventually becomes unique (i.e. a segment of one).
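This collapse toward a “segment of one” is easy to demonstrate with a small simulation. The attributes and their cardinalities below are invented purely for illustration: as we require a match on more and more independent dimensions, the number of people who are “exactly like you” shrinks multiplicatively.

```python
import random

random.seed(0)

# Simulate a population where each person has several coarse attributes,
# e.g. gender (2 values), age band (5), education (4), and so on.
N = 100_000
dims = [2, 5, 4, 6, 8, 10, 12]   # hypothetical cardinalities per dimension
population = [tuple(random.randrange(k) for k in dims) for _ in range(N)]
me = population[0]

# Count how many people match me on the first d dimensions.
for d in range(1, len(dims) + 1):
    matches = sum(p[:d] == me[:d] for p in population)
    print(f"matching on {d} dimension(s): {matches} of {N}")
```

With independent attributes, each added dimension divides the matching pool by roughly its cardinality, so even a population of 100,000 collapses to a handful of matches, and then just me, after a few dimensions.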
People only look similar because we don’t have enough data to tell them apart. Today, we have enough big data to uniquely distinguish each individual, so we should be able to give each person a personalized experience that’s different from everyone else’s. The challenge is that learning from other users’ behavior will not work very well if everyone is unique, because no two of us are truly alike.
There are 3 common approaches to personalization. Each has its own problems.
Keep in mind that people only look similar because brands often do not have enough data to tell them apart. In reality, no two people in the world are exactly alike, but today we have enough big data to see customers in multiple dimensions and distinguish them individually. That is why personalization is so important for brands now.
Next time I will outline how the industry is addressing these challenges. We will also describe our novel approach to this problem.
Image Credit: mat's eye.
Michael Wu, Ph.D. is Lithium's Chief Scientist. His research includes: deriving insights from big data, understanding the behavioral economics of gamification, engaging + finding true social media influencers, developing predictive + actionable social analytics algorithms, social CRM, and using cyber anthropology + social network analysis to unravel the collective dynamics of communities + social networks.
Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics + its application to Social CRM. He's a blogger on Lithosphere, and you can follow him @mich8elwu or Google+.