I’ve added two DebTags projects to the Debian Summer of Code page so far. We may add some more; I’m waiting for feedback.

The projects so far are:

DebTags database rewrite
the current central debtags database (the first "packagebrowser") is showing its age and desperately needs a rewrite and some new features. It's still hosted on a woody system that should have been shut down years ago.
DebTags AI tagger
this will likely be a very popular project topic; two people have already expressed interest in doing it. The goal is to apply AI mechanisms (e.g. Bayesian filters) to collected package information (description, README, DOAP data, freshmeat info, sf/alioth/savannah/... info, maybe documentation files?) to guess which tags are appropriate.
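
To give an idea of what "Bayesian filters on package descriptions" could mean in practice, here is a minimal sketch of a naive Bayes tagger. It is not the project's code: the class name `NaiveBayesTagger`, its methods, and the four training descriptions are made up for illustration, and the tags are only examples written in the usual facet::tag style. A real tagger would train on the existing tag database and on far more than the short description.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTagger:
    """Hypothetical multi-label naive Bayes tagger (illustration only)."""

    def __init__(self):
        self.tag_docs = Counter()              # tag -> number of training packages
        self.tag_words = defaultdict(Counter)  # tag -> word -> occurrence count
        self.vocab = set()                     # every word seen during training
        self.total_docs = 0

    def train(self, description, tags):
        """Record one already-tagged package description."""
        words = tokenize(description)
        self.vocab.update(words)
        self.total_docs += 1
        for tag in tags:
            self.tag_docs[tag] += 1
            self.tag_words[tag].update(words)

    def score(self, description):
        """Return a log-score per known tag for an untagged description."""
        words = tokenize(description)
        scores = {}
        for tag, n_docs in self.tag_docs.items():
            counts = self.tag_words[tag]
            total = sum(counts.values())
            # Fraction of training packages carrying this tag.
            logp = math.log(n_docs / self.total_docs)
            for word in words:
                # Add-one (Laplace) smoothing so unseen words do not zero the score.
                logp += math.log((counts[word] + 1) / (total + len(self.vocab)))
            scores[tag] = logp
        return scores

    def suggest(self, description, k=2):
        """Return the k best-scoring tags as suggestions for a human reviewer."""
        scores = self.score(description)
        return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy training data, invented for this example.
tagger = NaiveBayesTagger()
tagger.train("lightweight text editor with syntax highlighting",
             ["use::editing", "works-with::text"])
tagger.train("console text editor for programmers",
             ["use::editing", "works-with::text"])
tagger.train("image viewer for photos and pictures",
             ["use::viewing", "works-with::image"])
tagger.train("fast viewer for image collections",
             ["use::viewing", "works-with::image"])

suggested = tagger.suggest("simple editor for text files", k=2)
print(suggested)
```

The output is deliberately a ranked list of suggestions rather than a final decision, since guessed tags would still have to be checked by a human before they end up in the database.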

Maybe the second project will be split into two parts: one doing the main AI tagger work, the other working on a UI for editors to verify the AI tags. The verified tags would then be brought back to the central database, which will also need review functionality.

The AI tagger project is very interesting; I wish I had the time for it myself. Benjamin already worked on it a bit last year. But there is so much you could do with that codebase…

Grab the top tags from DMoz or Del.icio.us and use them to train your AI. Then start classifying webpages with that.

I guess with some crawling work, you could extend the ~5 million links on DMoz to a reasonable directory with 20 million entries. Now imagine a user interface like Enrico's Debtags enhanced search on top of that.

I really need to complete my studies and found a web2.0-company to sell to some search engine company for tons of money. ;-)