3 questions about arrays [Lisp]

Prev: Checking for declarations in macros
Next: scripting in CL -- please comment

From: Spiros Bousbouras on 12 Dec 2009 14:47

On 07 Dec 2009 10:22:07 -0800
tar(a)sevak.isi.edu (Thomas A. Russ) wrote:
> Spiros Bousbouras <spibou(a)gmail.com> writes:
>
> > On 04 Dec 2009 10:01:06 -0800
> > tar(a)sevak.isi.edu (Thomas A. Russ) wrote:
> > > Spiros Bousbouras <spibou(a)gmail.com> writes:
> > >
> > > > > > 2. Why the requirement that the dimensions of an array are fixnums ?
> > > > >
> > > > > Why not? This is not a big restriction, if an implementation wants
> > > > > bigger arrays, it only has to provide bigger fixnums.
> > > >
> > > > fixnums are supposed to be efficient , that's the criterion for
> > > > choosing the size of fixnums not the maximum size of arrays.
> > >
> > > Yes. And you want to make sure array access is efficient. Thus the
> > > limitation on only allowing fixnums as indicies
> >
> > No , you want to make sure that the programmer has the freedom to
> > choose the trade-off which is most appropriate for his application. If
> > he is willing to sacrifice some speed in order to gain larger arrays he
> > should have that choice.
>
> He does have that choice.
>
> He can always implement his own array storage system. See the sketch of
> such an implementation in another reply.
>
> But I would expect that an implementation that didn't use fixnum limits
> would have generally slow array accesses, and so it would tend to be
> avoided by must users. And thus such an implementation is unlikely to
> exist for long in the lisp ecosystem.

I on the other hand would expect that most users wouldn't even notice
the "slow" array accesses. After all , under default optimization
settings , aref will probably do more than simply fetch the value from
a certain address. That's "slow" but still perfectly adequate for many
applications.

[...]

> As Duane Rettig also points out, the standard array which holds any lisp
> object has storage sizes for machine-word size objects, since that is
> typically what is used for FIXNUM and POINTERS. Since, on a 32-bit
> architecture, there are 4 bytes per word, and addresses are typically
> measured in bytes, you can cleverly array to address all of virtual
> memory using fixnum indices for undifferentiated arrays.
>
> The only exception would seem to be bit vectors, where -- if the
> implementations supports it -- you can pack 8 elements into a single
> byte. [Note that the standard doesn't require this.]

It doesn't require it but I would expect a decent Lisp implementation
to work this way perhaps with declaration (optimize space). And it's
not just with bit vectors but with any object with size smaller than 1
byte.

[...]

> > > Besides, it would be ludicrous to expect that the fixnum size of a lisp
> > > implementation would depend on the actual amount of physical RAM
> > > installed in a computer, rather than the processor (and software)
> > > architecture.
> >
> > I agree. On the other hand it makes sense to expect that the maximum
> > array size would depend on the actual amount of physical RAM installed
> > in a computer so the standard is at fault for creating a connection
> > between fixnum size and array size.
>
> Why?
> Why physical RAM and not virtual address space?

Ok , virtual address space then.

> Typcially the fixnum size is tied to virtual address space instead, so
> you get what you want.
>
> And tying the array size to particular machine configurations makes it
> really hard to distribute your program, since you can't count on the
> maximum array size being anything in particular.

Since ARRAY-TOTAL-SIZE-LIMIT and related constants can be as large as
they want one cannot count on the maximum array size being anything in
particular anyway.

> And you don't have the option of running more slowly with less RAM. You
> wouldn't be able to run at all.

If an application requires a certain amount of RAM in order to do its
job then yes , it will not run on less RAM. But what does this have to
do with the discussion ?

> > > How do you address a multiple dimension array in RAM without
> > > multiplying?
> >
> > If all the dimensions are powers of 2 then you can do it by ORing and
> > shifting. Perhaps there are some nice tricks available in other cases
> > too.
>
> Well, yes, but they get harder when you have to do them with BIGNUMs.
> They are no longer single machine instructions.

But still simpler than multiplication.

[...]

> > > > 5 , so ? In fact if the point for the limitation in the size of arrays
> > > > is to never have a bignum as an array subscript then each dimension
> > > > could be up to (1+ most-positive-fixnum) so the definition of
> > > > ARRAY-DIMENSION-LIMIT should be different in order to allow for such a
> > > > possibility.
> > >
> > > Um, not really. Because the ARRAY-DIMENSION-LIMIT has to apply to all
> > > arrays, including single-dimensional arrays that you want to displace to
> > > your multi-dimensional array.
> > >
> > > I also suspect that ROW-MAJOR-AREF also plays into this limit.
> >
> > As I've said to my reply to Tamas "the standard could allow
> > ARRAY-DIMENSION-LIMIT and ARRAY-TOTAL-SIZE-LIMIT to be at most (1+
> > most-positive-fixnum) (which means that both should be allowed to be
> > bignums). This would still ensure that ARRAY-ROW-MAJOR-INDEX can work
> > with only fixnums while giving the programmer 1 extra subscript to work
> > with". Issue "ARRAY-DIMENSION-LIMIT-IMPLICATIONS" acknowledges this. It
> > says
>
> I think this is getting to the ridiculous hair-splitting point.

Why is it "ridiculous hair-splitting" ? When you say "Um, not
really..." above and then make a point which isn't valid I think it's
perfectly reasonable to point out that your point isn't valid.

> Are
> your requirements really so constrained that getting ONE or TWO
> additional array elements out of 500 million makes a difference?

Not at all. As I said in [1]

Not that 1 subscript matters much of course , my main beef is
with the decision to only allow fixnums. Having made that
decision I can kind of see why they opted for simplicity and
specified ARRAY-DIMENSION-LIMIT and ARRAY-TOTAL-SIZE-LIMIT to
also be fixnums at the cost of taking 1 extra subscript away
from the programmer.

[1] <RylSm.7782$6p6.5066(a)newsfe05.ams2>
< http://groups.google.co.uk/group/comp.lang.lisp/msg/0268f9abf3bd8eca >

--
I want to hug you in a non-sexual way and tell you that you make my
heart burst of joy and cuddle up like a cute little cookie monster and
ask for more milk.
http://static.thepiratebay.org/legal/indianagregg_resp2.txt

From: Spiros Bousbouras on 12 Dec 2009 15:12

On Mon, 7 Dec 2009 21:00:37 +0000 (UTC)
Kaz Kylheku <kkylheku(a)gmail.com> wrote:
> On 2009-12-07, Tim Bradshaw <tfb(a)cley.com> wrote:
> > On 2009-12-07 18:22:07 +0000, tar(a)sevak.isi.edu (Thomas A. Russ) said:
> >
> >> Well, that's one option.
> >> There are others.
> >
> > That actually turns out to be the best option if you care about
> > performance for large bit-based structures as well even if you have an
> > implementation where the fixnum limit is not a problem, unless you have
> > a heroic compiler.
>
> I think that, barring some fantastically exceptional circumstances, the use of
> a 500 megabyte bit array indicates a computer science flunkie between the
> keyboard and chair.

[...]

> Anyway, on the theme about how to overcome the bit array + fixnum limitation:
> without switching to any bit packing tricks, what you can also do is this:
> Instead of using some least significant bits of the index as a bit offset into
> a word which contains packed bits, what you can do is use bits on the /opposite/
> end of the index to select which huge array to access. Then the least
> significant 30 bits (or whatever is the fixnum mantissa size) are used to index
> into that array. This way, you keep the bit arrays and don't have to write any
> code to pack bits, or worry about choosing the best cell size, etc.

I don't follow. In the solution you propose will each huge array use 1
byte to store 1 bit ? If no then I don't see how your solution avoids
bit packing. Or is your solution intended for situations where you
don't mind wasting memory ?

From: Spiros Bousbouras on 12 Dec 2009 16:23

On Tue, 8 Dec 2009 23:43:09 +0000 (UTC)
Kaz Kylheku <kkylheku(a)gmail.com> wrote:
> On 2009-12-08, Spiros Bousbouras <spibou(a)gmail.com> wrote:
> > On Wed, 09 Dec 2009 00:15:15 +0100
> > pjb(a)informatimago.com (Pascal J. Bourguignon) wrote:
> >>
> >> Yes, that's because the CLHS is written avoiding as much as possible
> >> redundancy. If it is written at one place that there is a limit on
> >> the size of arrays, then it is not repeated at another place that you
> >> shouldn't try to allocate bigger arrays.
> >
> > But where does it say whose responsibility it is to enforce the size
> > restriction ?
>
> Nowhere. So that means that the behavior is undefined.

Ok , I kind of see it. Still , I think it would be clearer if the
specification of ARRAY-TOTAL-SIZE-LIMIT for example said "This shall be
the upper exclusive bound on the array total size of an array".

> Similarly, whose responsibility is to enforce that literal lists can't be
> modified?

If the programmer wants to avoid undefined behavior then he must ensure
that his programme does not attempt to modify any literal objects. This
is clear to me because the page on quote special operator says "The
consequences are undefined if literal objects (including quoted
objects) are destructively modified". Although I have to confess I
can't see the significance of "destructively". Is it possible to modify
an object non-destructively ?

> Requesting a larger array than array-dimension-limit or array-total-size-limit
> is simply outside of the scope of the ANSI Common Lisp standard.
>
> Implementations can diagnose such an attempt with an error signal,
> honor the attempt, if possible, or just behave unpredictably:
> allow the program to become corrupted and produce incorrect
> results, or the entire Lisp image to crash, with or without
> some error diagnostic, etc.
>
> Ensuring that some of these really bad things don't happen is a quality of
> implementation issue, as is the provision of huge arrays that can be indexed by
> bignums.

--
Who's your mama ?

From: Rob Warnock on 12 Dec 2009 21:15

Spiros Bousbouras <spibou(a)gmail.com> wrote:
+---------------
| Kaz Kylheku <kkylheku(a)gmail.com> wrote:
| > Spiros Bousbouras <spibou(a)gmail.com> wrote:
| > > But where does it say whose responsibility it is to enforce the size
| > > restriction ?
| >
| > Nowhere. So that means that the behavior is undefined.
|
| Ok , I kind of see it. Still , I think it would be clearer if the
| specification of ARRAY-TOTAL-SIZE-LIMIT for example said "This shall be
| the upper exclusive bound on the array total size of an array".
+---------------

Uh... The CLHS already *DOES* say exactly that, quite explicitly, in fact!

http://www.lispworks.com/documentation/HyperSpec/Body/v_ar_tot.htm
...
Constant Variable ARRAY-TOTAL-SIZE-LIMIT
...
Description:
The upper exclusive bound on the "array total size" of an array.

I don't know how one could be much more clear than that.

Furthermore, ARRAY-TOTAL-SIZE-LIMIT is the *lowest* of all of the
possible reasons for limiting array size in a given implementation:

The actual limit on the array total size imposed by the implementation
might vary according the element type of the array; in this case,
the value of array-total-size-limit will be the smallest of these
possible limits.

-Rob

-----
Rob Warnock <rpw3(a)rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607

From: Spiros Bousbouras on 13 Dec 2009 07:20

On Sat, 12 Dec 2009 20:15:52 -0600
rpw3(a)rpw3.org (Rob Warnock) wrote:
> Spiros Bousbouras <spibou(a)gmail.com> wrote:
> +---------------
> | Kaz Kylheku <kkylheku(a)gmail.com> wrote:
> | > Spiros Bousbouras <spibou(a)gmail.com> wrote:
> | > > But where does it say whose responsibility it is to enforce the size
> | > > restriction ?
> | >
> | > Nowhere. So that means that the behavior is undefined.
> |
> | Ok , I kind of see it. Still , I think it would be clearer if the
> | specification of ARRAY-TOTAL-SIZE-LIMIT for example said "This shall be
> | the upper exclusive bound on the array total size of an array".
> +---------------
>
> Uh... The CLHS already *DOES* say exactly that, quite explicitly, in fact!
>
> http://www.lispworks.com/documentation/HyperSpec/Body/v_ar_tot.htm
> ...
> Constant Variable ARRAY-TOTAL-SIZE-LIMIT
> ...
> Description:
> The upper exclusive bound on the "array total size" of an array.

So the phrase
"This shall be the upper exclusive bound on the array total size of an array"
is "exactly" the same as
``The upper exclusive bound on the "array total size" of an array''
is that right ?

One might argue that the meaning is the same but the first phrase makes
it clear to me that the bound is a constraint the violation of which
causes undefined behavior.

First | Prev | Next | Last
Pages: 8 9 10 11 12 13 14 15 16 17 18 19
Prev: Checking for declarations in macros
Next: scripting in CL -- please comment