data mining

Text mining could save Digg

Many sites have already reported that the popular news site Digg is overwhelmingly controlled by a very small group of users. Some users go further and predict that unless Digg can again become a truly interactive community, the site is done for: it will become repetitive and untrustworthy, offering neither the word of the masses nor properly edited, authentic journalism.

Many of these complaints have one thing in common: they point to the large number of front-page (highly “dugg”) articles that are, in fact, duplicates of previously dugg articles from less well-connected users.

And so, I propose that what Digg could do to “save” itself (as though the wildly popular site truly needs a saviour) is to reduce duplication through text mining or, less technically, to help users find already-dugg articles related to the one they’re digging or reading.
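To make the idea concrete, here is a minimal sketch of how related or duplicate stories might be surfaced. It is not how Digg works. It assumes each story is available as a plain string (a title or summary), compares stories with bag-of-words cosine similarity, and uses an arbitrary cutoff of 0.5. The function names (`tokenize`, `cosine_similarity`, `find_related`) and the sample headlines are made up for illustration.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def find_related(new_story, dugg_stories, threshold=0.5):
    """Return (score, story) pairs at or above the threshold, best match first.

    The threshold is an assumed value; a real system would need to tune it.
    """
    scored = [(cosine_similarity(new_story, s), s) for s in dugg_stories]
    return sorted((p for p in scored if p[0] >= threshold), reverse=True)


if __name__ == "__main__":
    # Hypothetical headlines, for illustration only.
    dugg = [
        "Apple unveils new iPhone at Macworld",
        "Scientists discover water on Mars",
    ]
    for score, story in find_related("New iPhone unveiled by Apple at Macworld", dugg):
        print(f"{score:.2f}  {story}")
```

A site could run a check like this when a story is submitted or viewed and show the matches as "related stories", leaving it to users to decide whether the new submission is a duplicate. Plain word counts are crude; weighting rarer words more heavily (as TF-IDF does) would likely give better matches on real data.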
