What Makes It Hot?
Michael Wu, Ph.D. is Lithium's Principal Scientist of Analytics, digging into the complex dynamics of social interaction and online communities.
He's a regular blogger on the Lithosphere and previously wrote in the Analytic Science blog.
You can follow him on Twitter at mich8elwu.
The Lithium Network Conference (LiNC2010) starts this week (in fact, tomorrow), so I am taking a little detour from our journey through social network analysis (SNA). I will use this opportunity to address a question that I often receive about the current beta release of the Engagement Center (EC).
The Engagement Center (EC) provides a number of dashboards that surface important analytics about your community, as well as the conversations that happen beyond the community. One of the new pieces of functionality in the EC is the hot topics (or hot threads) widget. This widget tells you which topics of conversation are hot in the community; moreover, it assigns a hotness score to each topic, allowing you to rank them.
The question that I often get is “What makes a topic hot and what is the hotness score?” Perhaps you would say a hot topic is something that is new, popular and heavily in demand. These are all correct! In fact, the hotness score is based on two components, which I will refer to as popularity and recentness. Being recent alone or popular alone is not enough to make a topic hot. A topic must score high in both popularity and recentness in order to be hot.
The popularity component of the hotness score consists of four factors.
- Rate of Participation: How fast posts are accumulating in the thread
Note: The absolute number of posts is not important; it is the speed (or velocity) at which the post count is increasing that matters.
- Rate of Consumption: How frequently the thread is being viewed
Note: Again, it is the speed (or velocity) at which page views increase, rather than the absolute number of page views, that matters.
- Total Number of Unique Participants
- Total Number of Kudos
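To make the four factors concrete, here is a minimal sketch in Python. The post does not say how the factors are combined, so the linear weighted sum, the `ThreadStats` fields, and the default weights below are all assumptions made for illustration, not the EC's actual formula.

```python
from dataclasses import dataclass


@dataclass
class ThreadStats:
    """Hypothetical container for the raw signals of one thread."""
    posts_in_window: int      # posts added during the observation window
    views_in_window: int      # page views added during the observation window
    window_hours: float       # length of the observation window, in hours
    unique_participants: int  # total distinct authors in the thread
    kudos: int                # total kudos on the thread


def popularity(stats: ThreadStats, weights=(1.0, 1.0, 1.0, 1.0)) -> float:
    """Combine the four popularity factors (assumed: a weighted sum)."""
    if stats.window_hours <= 0:
        raise ValueError("window_hours must be positive")
    # Rates, not absolute counts: how fast posts and views are accumulating.
    post_rate = stats.posts_in_window / stats.window_hours
    view_rate = stats.views_in_window / stats.window_hours
    w_post, w_view, w_people, w_kudos = weights
    return (w_post * post_rate
            + w_view * view_rate
            + w_people * stats.unique_participants
            + w_kudos * stats.kudos)
```

With this sketch, two threads with identical counts get different scores if one collected its posts and views in 2 hours and the other in 20, which is the point of using rates rather than totals.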
The recentness component of the hotness score is a down-weighting factor that attenuates the popularity depending on how recent the thread is. An older thread will be attenuated more significantly than a younger thread. So the recentness component is implemented as an exponential-decay function (don’t worry if you don’t know what this is). In essence, it depends non-linearly on four factors.
- Time of Last Participation: This is the post time of the last message in the thread. This gives us a rough estimate of when people stop participating and this is when the attenuation begins. We don’t want to attenuate the popularity of a thread when people are still participating.
- Thread Lifespan: This is the length of time between the first post and the last post in the thread. A thread with a longer lifespan is probably more useful, so its attenuation rate should be slower (i.e. its popularity is suppressed at a slower rate). This allows the thread to remain relatively hot for a longer period of time.
- Rate of Participation: This is the same factor we’ve seen in the popularity component. But here it is used to keep threads that have a long lifespan but a low participation rate in check (e.g. a thread that was posted a year ago and had no further posting activity until now). If we simply looked at this thread’s lifespan (1 year), we would come to the wrong conclusion that this thread must be extremely useful. So we must include the participation rate to keep the lifespan factor honest.
- Solved Thread: Finally, threads that are solved are generally more useful, so their popularity will be attenuated at a slower rate.
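The recentness component can be sketched as an exponential decay whose rate depends on the factors above. The post only says the dependence is non-linear, so the specific functional form, the parameter names, and every default value here are assumptions chosen for illustration.

```python
import math


def decay_rate(lifespan_hours: float, posts_per_hour: float, solved: bool,
               base_rate: float = 0.05, reference_rate: float = 1.0,
               lifespan_scale: float = 24.0, solved_factor: float = 0.5) -> float:
    """Attenuation rate per hour (all constants are hypothetical)."""
    # Discount the lifespan when participation is sparse, so that a year-old
    # thread with two posts does not look "long-lived and useful".
    credit = min(1.0, posts_per_hour / reference_rate)
    effective_lifespan = lifespan_hours * credit
    # A longer effective lifespan means slower decay.
    rate = base_rate / (1.0 + effective_lifespan / lifespan_scale)
    # Solved threads decay more slowly.
    if solved:
        rate *= solved_factor
    return rate


def recentness(hours_since_last_post: float, rate: float) -> float:
    """Exponential decay that starts at the time of last participation."""
    return math.exp(-rate * max(0.0, hours_since_last_post))


def hotness(popularity: float, hours_since_last_post: float, rate: float) -> float:
    """Popularity attenuated by recentness."""
    return popularity * recentness(hours_since_last_post, rate)
```

While people are still posting, `hours_since_last_post` stays near zero and the popularity is passed through almost unattenuated; once posting stops, the score falls off at a rate set by lifespan, participation rate, and solved status.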
There are two more important ingredients that go into computing the hotness score and determining which topics are hot.
C. Hotness Depends on the Intended Interaction
What’s hot may be different depending on what the intended interaction is. For example, forums are intended for discussion, so a hot thread in a forum should have a lot of messages. Although page views and kudos do help, posts are the most important measure of a hot thread. In contrast, blogs are meant to be read, so a hot blog article is one that has lots of page views, even if there are not many comments or kudos. Similarly, an idea exchange is meant for voting, so the kudos count is the most important measure of a hot idea. A heavily voted (kudoed) idea should be considered hot even if there aren’t very many comments.
So how we compute the hotness score is different for forums, blogs, and idea exchanges. With this logic in place the algorithm is now ready to compute the hotness score for all topics within a community.
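One simple way to picture this is a separate weight table per interaction type. The table below is a hypothetical sketch: the signal names, the weight values, and the assumption that the signals are already normalized to comparable scales are not from the EC itself.

```python
# Hypothetical weights: each interaction type emphasizes a different signal.
WEIGHTS = {
    "forum": {"post_rate": 3.0, "view_rate": 1.0, "kudos": 1.0},  # discussion
    "blog":  {"post_rate": 1.0, "view_rate": 3.0, "kudos": 1.0},  # reading
    "idea":  {"post_rate": 1.0, "view_rate": 1.0, "kudos": 3.0},  # voting
}


def weighted_popularity(kind: str, signals: dict) -> float:
    """Weighted sum of normalized signals for the given interaction type."""
    weights = WEIGHTS[kind]
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)


# A heavily kudoed item with few comments scores higher as an idea
# than it would as a forum thread.
signals = {"post_rate": 1.0, "view_rate": 1.0, "kudos": 10.0}
idea_score = weighted_popularity("idea", signals)
forum_score = weighted_popularity("forum", signals)
```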
D. What’s Hot Depends on Everyone Else’s Scores
After each topic is assigned a hotness score, we are still left with the problem of setting a proper threshold for selecting topics that are statistically hot. Intuitively, a hot topic should be one that stands out from the crowd. This is precisely what the algorithm does. It analyzes the hotness scores of all topics (based on the theory of the Lorenz curve), then selects those that are significantly hotter than the norm. Therefore, in a hypothetical community where every topic receives a very high hotness score, there is really no hot topic, because no single topic stands out.
This is a very stringent criterion for scoring and identifying hot topics. If this algorithm flags a topic as hot, then it really is statistically significantly hot, and you should pay attention to it. Don’t let a topic go viral beyond your community without noticing it.
Alright, I hope this blog entry gave you a better understanding of what makes a topic hot and how we compute the hotness score. I'm sure all of you must have heard of the hot topic today. If you haven't, check it out here. You wouldn't want to miss any hot topic, would you? Next week I will resume our SNA journey if there aren’t any more questions from LiNC2010. But if you do have any questions, please don’t be shy. Let me know at LiNC2010 or leave me a comment here.
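The post does not spell out how the Lorenz curve is used, so the following is only one plausible sketch. It builds the Lorenz curve of the scores, measures how unequal they are with the Gini coefficient (one minus twice the area under the curve), and flags topics only when the scores are unequal enough and a topic sits well above the mean. The thresholds `min_gini` and `mean_multiple` are assumptions, and scores are assumed to be non-negative.

```python
def lorenz_curve(scores):
    """Cumulative share of the total score held by the lowest-scoring topics."""
    ordered = sorted(scores)
    total = sum(ordered)
    if not ordered or total <= 0:
        raise ValueError("need at least one positive score")
    curve, running = [0.0], 0.0
    for s in ordered:
        running += s
        curve.append(running / total)
    return curve


def gini(scores):
    """Gini coefficient: 0 when all scores are equal, near 1 when one dominates."""
    curve = lorenz_curve(scores)
    n = len(curve) - 1
    # Trapezoid rule is exact here because the Lorenz curve is piecewise linear.
    area = sum((curve[i] + curve[i + 1]) / 2.0 for i in range(n)) / n
    return 1.0 - 2.0 * area


def select_hot(scores, min_gini=0.3, mean_multiple=2.0):
    """Return topics that stand out, hottest first (thresholds are hypothetical).

    scores: dict mapping topic name to a non-negative hotness score.
    """
    values = list(scores.values())
    if gini(values) < min_gini:
        return []  # scores are too uniform: nothing stands out
    cutoff = mean_multiple * sum(values) / len(values)
    hot = [topic for topic, s in scores.items() if s >= cutoff]
    return sorted(hot, key=scores.get, reverse=True)
```

In this sketch a community where every topic scores 90 yields no hot topics, since the Gini coefficient is zero, while a community with four topics at 1 and one at 16 flags only the outlier.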