From: Josh Berkus on
Dhiraj,

> For instance, if many users(above a threshold set by us) insert some
> search string for which no wanted search result is retrieved, we
> could track what he finally selects and then accordingly append/modify
> our set of phonetic rules based on the phonetic mismatch amongst the
> query inserted and result wanted according to our set of rules. Using
> this, the * rule sets it could evolve itself when we collect usage
> statistics from users based on their experience. * This feature would
> add a new dimension to the search functionality and would surely stand
> out.

You're mixing two completely different kinds of features here. One is a
backend function and the other is an application for building soundex
rules. While both of these are interesting projects, it is unlikely you
can complete both in one summer.

What I'd suggest focussing on for SoC is creating a new soundex function
(suggested name: soundex_ml) which includes a facility for loadable
algorithms and callability on a per-language basis. That would be
plenty of work by itself. From there, you could then continue your
undergraduate work on the tool to build the algorithms in the first place.
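
To make the shape of that concrete, here is a minimal sketch in Python of
a per-language dispatcher with loadable algorithms. It is only an
illustration under assumptions: the real function would presumably live in
C inside PostgreSQL, and apart from the suggested name soundex_ml every
name here (register_algorithm, soundex_en, the _ALGORITHMS registry) is
hypothetical. The English entry is the classic American Soundex.

```python
# Hypothetical sketch: a registry of phonetic algorithms keyed by language.
# Only the name soundex_ml comes from the message; the rest is assumed.

# Classic American Soundex consonant codes.
_CODES = {}
for _digit, _letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                         "4": "L", "5": "MN", "6": "R"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex_en(word):
    """Return the 4-character American Soundex code for an ASCII word."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of identical codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
        if len(result) == 4:
            break
    return result.ljust(4, "0")


# Registry of loaded algorithms, one callable per language tag.
_ALGORITHMS = {}


def register_algorithm(lang, func):
    """Load (or replace) the phonetic algorithm used for a language."""
    _ALGORITHMS[lang] = func


def soundex_ml(text, lang="en"):
    """Encode text with whichever algorithm is loaded for lang."""
    try:
        algorithm = _ALGORITHMS[lang]
    except KeyError:
        raise ValueError(f"no phonetic algorithm loaded for {lang!r}") from None
    return algorithm(text)


register_algorithm("en", soundex_en)

print(soundex_ml("Robert"), soundex_ml("Rupert"))
```

Both names encode to R163 here, which shows how coarse the matching is.
A rule set for another language would be added with another
register_algorithm call, without touching soundex_ml itself.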

I'm also curious why you chose to focus on the extremely imprecise
soundex instead of the more discriminating metaphone.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers