Michael Wu, Ph.D. is Lithium's Principal Scientist of Analytics, digging into the complex dynamics of social interaction and group behavior in online communities and social networks.
Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics and its application to Social CRM. He's a regular blogger on the Lithosphere's Building Community blog and previously wrote for the Analytic Science blog. You can follow him on Twitter or Google+.
OK, let’s pick up where we left off. In my last post, we examined the first step in any big data processing engine – searching and filtering. In essence, the goal is to separate the relevant data (signal) from the irrelevant data (noise). If you’ve analyzed any form of big data, you have probably noticed that the signal-to-noise ratio is pretty low. Most of the data are noise, and only a tiny fraction is the signal. The question then is, w