High-dimensional index structures for arbitrary distance/similarity functions (text mining) [General Programming]


From: csenges on 2 Jun 2010 15:38

Hi,
I run similarity queries (kNN) in text mining, using feature vectors as
the text representation. I'm new to this topic and wonder whether there
are any standard index structures that people use to speed up
similarity/distance queries.
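
For reference, the baseline that any such index has to beat is a plain linear scan over all documents. A minimal sketch, assuming sparse {term: weight} dicts as feature vectors and cosine similarity as the measure (both are illustrative assumptions, as are the names `cosine`, `knn` and the toy `docs`):

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    # Iterate over the shorter vector; only shared terms contribute to the dot product.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def knn(query, docs, k):
    """Return the k (similarity, doc_id) pairs with the highest similarity.

    This is an exact linear scan: one similarity computation per document.
    """
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in docs.items())
    return heapq.nlargest(k, scored)


# Toy data, purely hypothetical.
docs = {
    "d1": {"index": 1.0, "tree": 2.0},
    "d2": {"hash": 1.0, "index": 1.0},
    "d3": {"text": 3.0},
}

if __name__ == "__main__":
    print(knn({"index": 1.0, "hash": 1.0}, docs, 2))
```

The scan costs one similarity computation per stored document for every query, which is the cost an index structure is meant to reduce.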

I know kNN in high dimensions is a research topic and that there are
sophisticated algorithms and data structures, but I just want to know
what people use in practice today.

kd-trees and R-trees do not work well with this number of dimensions. Maybe
locality-sensitive hashing (LSH) with a suitable hash function is an
option?
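
To make the LSH idea concrete, here is a rough sketch of one common variant, random-hyperplane hashing, which targets cosine similarity. It assumes dense vectors given as Python lists; the class name `HyperplaneLSH` and the parameters `n_bits`, `n_tables` and `seed` are made up for illustration and do not come from any library.

```python
import random
from collections import defaultdict


class HyperplaneLSH:
    """Random-hyperplane LSH: vectors with a small angle tend to share buckets."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        # Each table gets n_bits random hyperplanes with Gaussian normals.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    @staticmethod
    def _key(planes, vec):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(
            sum(p * x for p, x in zip(plane, vec)) >= 0.0 for plane in planes
        )

    def add(self, item_id, vec):
        """Store item_id in the matching bucket of every table."""
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vec)].append(item_id)

    def candidates(self, vec):
        """Return ids sharing a bucket with vec in at least one table."""
        found = set()
        for planes, table in zip(self.planes, self.tables):
            found.update(table.get(self._key(planes, vec), ()))
        return found


if __name__ == "__main__":
    lsh = HyperplaneLSH(dim=3, n_bits=4, n_tables=2, seed=1)
    lsh.add("a", [1.0, 2.0, -0.5])
    print(lsh.candidates([1.0, 2.0, -0.5]))
```

The result is only a candidate set: near neighbours can be missed and unrelated items can collide, so the candidates would still need to be re-ranked with the exact similarity function. More bits per table make buckets smaller, and more tables raise the chance that a true neighbour shares at least one bucket with the query.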

Thanks for any hints,
chris
