Expertise Analysis & Ranking with the SPEAR Algorithm

Michael G. Noll, a CS Ph.D. candidate at the Hasso Plattner Institute, Germany, and Ching-man Au Yeung, a Ph.D. candidate in CC at the University of Southampton, UK, have developed the SPEAR algorithm.
 

SPEAR (Spamming-resistant Expertise Analysis and Ranking) is a new technique for measuring the expertise of users by analyzing their public activities on platforms like Delicious.

From their post on the Delicious blog:

A major problem of the Internet today is that finding high quality information is neither easy nor fast. The steady increase of spam and junk content on the Web further complicates this challenge. Another related issue is that finding knowledgeable and trustworthy users on social platforms like Delicious is much more difficult than it should be. Wouldn’t it be nice if Delicious recommended “good” users with similar interests?


To tackle this problem, we created the SPEAR algorithm that analyzes the timeline of the bookmarking and tagging activities of users. The focus of SPEAR is on the ability of users to find new, high quality information on the Internet. A great benefit of SPEAR is that it returns two very useful sets of results: first, a list of users ranked by their expertise; and second, a list of websites ranked by their quality.


Technically, SPEAR is based on the well-known information retrieval algorithm HITS, a technique presented in 1999 that is used by search engines to rank Web pages. We came up with SPEAR by modifying HITS so that it fits the characteristics of open and shared systems like Delicious, and by extending it with a new component that integrates the timeline of user activities into its analysis. This resulted in further performance improvements of the algorithm.
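For readers unfamiliar with HITS, here is a minimal sketch of a HITS-style iteration on a user/document bookmark graph. It is not taken from the original post: the function names, the sample bookmarks, and the choice to treat users as hubs and documents as authorities are assumptions made for illustration, and it deliberately ignores timing.

```python
import math


def normalize(scores):
    """Scale a score dict to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {key: value / norm for key, value in scores.items()}


def hits_users_documents(bookmarks, iterations=50):
    """HITS-style scores for (user, document) pairs; timestamps are not used."""
    bookmarks = sorted(set(bookmarks))  # drop duplicate pairs
    expertise = {user: 1.0 for user, _ in bookmarks}
    quality = {doc: 1.0 for _, doc in bookmarks}
    for _ in range(iterations):
        # A document's score is the sum of the scores of users who saved it.
        new_quality = dict.fromkeys(quality, 0.0)
        for user, doc in bookmarks:
            new_quality[doc] += expertise[user]
        quality = normalize(new_quality)
        # A user's score is the sum of the scores of the documents they saved.
        new_expertise = dict.fromkeys(expertise, 0.0)
        for user, doc in bookmarks:
            new_expertise[user] += quality[doc]
        expertise = normalize(new_expertise)
    return expertise, quality


# Hypothetical sample data: alice and bob share d1, carol is isolated on d3.
bookmarks = [("alice", "d1"), ("alice", "d2"), ("bob", "d1"), ("carol", "d3")]
expertise, quality = hits_users_documents(bookmarks)
for user in sorted(expertise, key=expertise.get, reverse=True):
    print(user, round(expertise[user], 3))
```

In this toy data, two users who saved exactly the same documents would receive identical scores, because the iteration has no notion of who found a document first.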


[Figure: Spear_figure_1]


The two main elements of the new SPEAR algorithm are:


1. Mutual reinforcement of user expertise and document quality: A user’s expertise in a particular topic depends on the quality of the documents she or he has found, and the quality of documents in turn depends on the expertise of the users who have found them.


2. Discoverers vs. followers: Expert users should be discoverers – they tend to be faster than others to identify new and high quality documents. In other words, “the early bird catches the worm”. SPEAR gives more credit to users the earlier they find high quality documents.


The combination of both these elements has the effect that SPEAR favors quality over quantity of user actions, and that the algorithm is quite resistant to today’s spamming attacks.
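The two elements above can be sketched in a few lines of Python. This is an illustrative approximation, not the authors' implementation: the post does not give the exact credit formula, so the square-root-of-reverse-rank weighting, the function names, and the sample events below are all assumptions.

```python
import math
from collections import defaultdict


def credit_weights(events):
    """Map (user, doc) to a credit that is larger for earlier bookmarkers."""
    first_seen = defaultdict(dict)  # doc -> {user: earliest timestamp}
    for user, doc, timestamp in events:
        previous = first_seen[doc].get(user)
        if previous is None or timestamp < previous:
            first_seen[doc][user] = timestamp
    weights = {}
    for doc, seen in first_seen.items():
        # Earliest first; equal timestamps are ordered by user name (arbitrary).
        ordered = sorted(seen, key=lambda u: (seen[u], u))
        for position, user in enumerate(ordered):
            # Assumed credit function: square root of the reverse rank.
            weights[(user, doc)] = math.sqrt(len(ordered) - position)
    return weights


def normalize(scores):
    """Scale a score dict to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {key: value / norm for key, value in scores.items()}


def time_weighted_rank(events, iterations=50):
    """Mutual reinforcement of user expertise and document quality with credits."""
    weights = credit_weights(events)
    expertise = {user: 1.0 for user, _ in weights}
    quality = {doc: 1.0 for _, doc in weights}
    for _ in range(iterations):
        new_quality = dict.fromkeys(quality, 0.0)
        for (user, doc), credit in weights.items():
            new_quality[doc] += credit * expertise[user]
        quality = normalize(new_quality)
        new_expertise = dict.fromkeys(expertise, 0.0)
        for (user, doc), credit in weights.items():
            new_expertise[user] += credit * quality[doc]
        expertise = normalize(new_expertise)
    return expertise, quality


# Hypothetical events (user, document, timestamp): dana discovers, eve follows.
events = [
    ("dana", "d1", 1), ("eve", "d1", 2), ("frank", "d1", 3),
    ("dana", "d2", 1), ("eve", "d2", 5),
]
expertise, quality = time_weighted_rank(events)
for user in sorted(expertise, key=expertise.get, reverse=True):
    print(user, round(expertise[user], 3))
```

Here dana and eve saved the same two documents, but dana saved both first and therefore ends up with the higher score. A concave credit function such as the square root is one way to keep the very first discoverer from dominating on heavily bookmarked documents; other choices are possible.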


We believe SPEAR is very useful in the context of open systems, particularly social networks. That said, we are already researching the next version of the algorithm – the popularity of online services like Delicious is rising, and so is the spam threat. Whether we want to improve the user experience on Delicious or win the arms race against spammers, there’s still a lot of work left to do!

Comments

  1. sounds interesting, but how can we try it out?

  2. I know that they propose that the level of expertise of a user with respect to a particular topic is mainly determined by two factors. Firstly, an expert should possess a high quality collection of resources, while the quality of a Web resource depends on the expertise of the users who have assigned tags to it. Secondly, an expert should be one who tends to identify interesting or useful resources before other users do.
