Prev: A little help needed
Next: ANN: Sequel 3.10.0 Released
From: Robert Klemme on 4 Apr 2010 15:17 On 04.04.2010 20:21, Caleb Clausen wrote: > On 4/4/10, Robert Klemme <shortcutter(a)googlemail.com> wrote: >> On 04/03/2010 03:07 AM, Caleb Clausen wrote: >>> Oh, I wish for a real integer set class! >> Well, there's Fixnum / Bignum: > > Well, yeah, and that's fine for storing sets of small integers. Or I > could put my Integers into a Set, and that's fine for small sets of > integers. I have used both techniques in the past. What I have in mind > is more scalable: a way to store large sets of large integers. > Efficiently. So, what I would like to see is a data structure with > O(1) (or at most O(log(n))) insertion, deletion, and lookup times, as > well as (as close as possible to) O(1) memory cost per stored item. > Contiguous ranges of integers should be storable with not much more > memory cost than that of a single integer. > > Bitmaps come close, but also cost O(1) bits per integer (< the largest > integer stored) which _isn't_ in the set. A hash-based set costs > something like O(2*wordsize) bits per stored integer. I believe, it's more since you explicitly stated you want large numbers (=> Bignum). > Perhaps this can > be reduced to O(2+wordsize) if using google's sparse array/sparse > hash. (Which Roger has been kind enough to add a ruby interfaces > for.... but not an integer set.) > > Of course there's a contradiction here; if: > 1) the integer set is fairly dense, and > 2) members are randomly distributed throughout it, > then Kolmogorov complexity theory says you can't get any better > storage efficiency than a bitmap. So, this miraculous data structure > I'm imagining has to be allowed to discard one or the other of those > assumptions. Discarding 1) leads to something like google's sparse > array (sparse bitmap really; I'm not sure if google's sparse array > code could be used as is). Discarding 2) leads you in the direction of > a tree or trie. That's what I'm more interested in. Maybe something > based on Judy would suit my purposes.... > > I've wanted this several times, but never really needed it badly > enough to go and implement what I want to see. (As it almost surely > involves writing a fair bit of c code.) Take that as a hint. :-) Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ |