From: RG on
In article <hvhuou$7eh$1(a)news.eternal-september.org>,
Tim Bradshaw <tfb(a)tfeb.org> wrote:

> On 2010-06-19 01:47:54 +0100, RG said:
>
> > No. While it is true that the spec is not explicit about this, it does
> > place certain constraints on hash tables. For example, CL hash tables
> > cannot under any circumstances assume that the keys have a total order,
> > so they cannot be implemented by any underlying data structure that
> > requires that the keys be ordered.
>
> Of course they can. The implementation would just have to change
> dynamically if it noticed that there was no longer a total order it
> could use.

If you're willing to seriously posit a system smart enough to do that
then there is no need for anything BUT "hash tables" (in scare quotes
now because they may or may not be implemented as hash tables). You
don't need, say, cons cells any more because the system would be smart
enough to figure out that a "hash table" with two keys "is" a cons cell
(particularly if those two keys happen to be the symbols CAR and CDR --
or is that FIRST and REST?) You don't need vectors because the system
could be smart enough to figure out that if the keys are a dense subset
of the integers (or, even better, that there exists a mapping from the
keys onto a dense subset of the integers) that the resulting "hash
table" "is" a vector. And so on.

It's not impossible, of course, but no existing CL implementation does
this, and I think it's a pretty safe bet that none ever will.

On the other hand, if you provide abstract associative maps as a
user-level abstraction then you have the flexibility to change the API
so that the *user* can specify (and more to the point, modify) the
underlying implementation without having to change client code. This
seems to me to be the more practical approach.

rg
From: Scott L. Burson on
On Jun 19, 10:13 am, RG <rNOSPA...(a)flownet.com> wrote:
> In article <hvhuou$7e...(a)news.eternal-september.org>,
>  Tim Bradshaw <t...(a)tfeb.org> wrote:
>
> > On 2010-06-19 01:47:54 +0100, RG said:
>
> > > No.  While it is true that the spec is not explicit about this, it does
> > > place certain constraints on hash tables.  For example, CL hash tables
> > > cannot under any circumstances assume that the keys have a total order,
> > > so they cannot be implemented by any underlying data structure that
> > > requires that the keys be ordered.
>
> > Of course they can.  The implementation would just have to change
> > dynamically if it noticed that there was no longer a total order it
> > could use.
>
> If you're willing to seriously posit a system smart enough to do that
> then there is no need for anything BUT "hash tables" (in scare quotes
> now because they may or may not be implemented as hash tables).  You
> don't need, say, cons cells any more because the system would be smart
> enough to figure out that a "hash table" with two keys "is" a cons cell
> [...]

Not necessarily. I think what FSet does fits Tim's description,
though it may not be exactly what he had in mind. FSet uses binary
trees and requires an ordering of elements, but it has a notion of
"ordering collision"; when two or more items in the same set collide
in the ordering, FSet falls back to putting those items in a list
rather than a tree. As long as collisions are relatively rare, the
performance impact is small.

Thus it is perfectly viable to use comparison of integer hash values
to generate the ordering. Performance is then dependent on the
quality of the hash function, just as it is for a hash table.

> On the other hand, if you provide abstract associative maps as a
> user-level abstraction then you have the flexibility to change the API
> so that the *user* can specify (and more to the point, modify) the
> underlying implementation without having to change client code.  This
> seems to me to be the more practical approach.

This part I agree with, of course. (Except that I think the word
"associative" is redundant; just call the thing a map.)

-- Scott
From: Tim Bradshaw on
On 2010-06-20 02:22:56 +0100, Scott L. Burson said:

> Not necessarily. I think what FSet does fits Tim's description,
> though it may not be exactly what he had in mind. FSet uses binary
> trees and requires an ordering of elements, but it has a notion of
> "ordering collision"; when two or more items in the same set collide
> in the ordering, FSet falls back to putting those items in a list
> rather than a tree. As long as collisions are relatively rare, the
> performance impact is small.

That's the kind of thing I was thinking of. It also seems to me that
systems can probably exploit ordering constraints that they know but
you can't: for instance an implementation might arrange life so that
the ordering of addresses of objects does not change, or changes only
very rarely, thus allowing it to impose a total ordering on things.
I'd certainly expect systems to implement "hashtables" with
increasingly sophisticated algorithms as things like locality become
more important.

From: Tim Bradshaw on
On 2010-06-19 15:42:50 +0100, Tamas K Papp said:

> OTOH, I don't deny the usefulness of the things you mention, I just
> don't think that they are that useful when one is first introduced to
> computers.

I think this is a bit like teaching mathematics. If you started off
teaching people about fields[*] you're going to have a catastrophe
(tragically, there is quite good data about this sort of thing,
following the "new mathematics" teaching movement in the 60s and 70s: I
don't think they started with fields, but they did do a lot of set
theory and group theory).

Of course, these concepts are terribly important - you can't get very
far in theoretical physics without a good grounding in group theory and
differential geometry - but that doesn't mean you should turn things
inside out, because you also can't get very far in theoretical physics
without the kind of aggressive numeracy which you only get by just
spending a lot of time playing with sums. And most people who aren't
theoretical physicists can get by just fine without group theory and
differential geometry, but would be helped enormously by being a lot
more numerate.

There's an obvious analogy to teaching programming here (though the
situation is worse there than it is with maths for various reasons).

--tim

[*] Don't pick me up on this: my memory is that a field is the right
model for reals under the normal operations, but I may be wrong - pick
the right model if so.

From: Rob Warnock on
Tim Bradshaw <tfb(a)tfeb.org> wrote:
+---------------
| Of course, these concepts are terribly important - you can't get very
| far in theoretical physics without a good grounding in group theory and
| differential geometry ...
+---------------

Heh, heh, you need a *lot* more these days than just those two,
as this excellent multi-part[1] cartoon points out: ;-}

http://abstrusegoose.com/272
Prerequisites


-Rob

[1] This is a 7-part cartoon -- you have to click the image
(*not* the "Next" button!) to advance to each part.

-----
Rob Warnock <rpw3(a)rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607