From: Tamas K Papp on 4 Dec 2009 10:54

On Fri, 04 Dec 2009 07:19:28 -0800, Pillsy wrote:

> On Dec 4, 3:57 am, Tamas K Papp <tkp...(a)gmail.com> wrote:
> [...]
>> I am curious: why are you worried about this?  Have you run into a
>> difficulty where, for example, you find fixnums on 64-bit SBCL
>> limiting?
>
> The fixnums on some popular implementations (like CLISP on my 32-bit
> XP machine) have a dramatically smaller range. 17 million element
> arrays really are too small for many applications.

Yes, but I am under the impression that CLISP is not the implementation
one would pick for computationally intensive applications, so I see no
problem with it making this particular choice.

I like the CL standard because it leaves a lot of choice to the
implementor regarding these efficiency hacks, while at the same time
the programmer can rely on certain things being true.  E.g. if you are
writing a method which handles array indexes, you can always rely on
them being fixnums, regardless of the implementation.  And then the
implementation is allowed to say what it considers a fixnum, and if you
don't like that, just pick another implementation.

I was wondering about the following: could there be an implementation
where all possible integers are fixnums?  Then I encountered the "Issue
FIXNUM-NON-PORTABLE Writeup", which says:

  "It is possible that an implementation have a single representation
  for all integers, and no way to identify any efficient range of
  integers.  Those implementations might need to set
  MOST-POSITIVE-FIXNUM and MOST-NEGATIVE-FIXNUM to arbitrary values,
  consistent with the requirement that (SIGNED-BYTE 16) is a subtype
  of FIXNUM."

So apparently there is no way to say that most-positive-fixnum is not
binding (i.e. that it is something like infinity), but an implementation
could still do everything with arbitrarily large integers and just pick
an exorbitant value for most-positive-fixnum, without the partition
having any consequences performance-wise.

However, this is not a practical concern at the moment :-)

Tamas
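A quick REPL check of the constants under discussion (the numeric
values in the comments are what one 64-bit implementation reports; they
are implementation-dependent, and the standard guarantees only that
(signed-byte 16) is a subtype of fixnum):

  ;; Inspect the implementation-dependent fixnum range.
  most-positive-fixnum                      ; e.g. 4611686018427387903
  most-negative-fixnum                      ; e.g. -4611686018427387904
  (subtypep '(signed-byte 16) 'fixnum)      ; => T in every conforming CL
  (typep (1+ most-positive-fixnum) 'fixnum) ; => NIL: that is a bignum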
From: Pascal J. Bourguignon on 4 Dec 2009 10:59

Pillsy <pillsbury(a)gmail.com> writes:

> On Dec 4, 3:57 am, Tamas K Papp <tkp...(a)gmail.com> wrote:
> [...]
>> I am curious: why are you worried about this?  Have you run into a
>> difficulty where, for example, you find fixnums on 64-bit SBCL
>> limiting?
>
> The fixnums on some popular implementations (like CLISP on my 32-bit
> XP machine) have a dramatically smaller range. 17 million element
> arrays really are too small for many applications.

Don't complain:

   Constant Variable ARRAY-TOTAL-SIZE-LIMIT

   Constant Value:
     A positive fixnum, the exact magnitude of which is
     implementation-dependent, but which is not less
     than 1024.
          ^^^^

And since fixnums are at least (signed-byte 16), it's all that is
needed.

--
__Pascal Bourguignon__
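Checking the array limits at the REPL makes Pascal's point concrete;
the forms below rely only on what the standard guarantees:

  ;; Both array limits are themselves required to be positive fixnums,
  ;; each at least 1024; the actual values are implementation-dependent.
  array-dimension-limit                    ; a positive fixnum, >= 1024
  array-total-size-limit                   ; a positive fixnum, >= 1024
  (typep array-total-size-limit 'fixnum)   ; => T, required by the spec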
From: Thomas A. Russ on 4 Dec 2009 12:53

Kaz Kylheku <kkylheku(a)gmail.com> writes:

> >> for ARRAY-DIMENSION-LIMIT says that it's a fixnum and "The upper
> >> exclusive bound on each individual dimension of an array". Why not
> >> make it an inclusive bound ?
....
> Why array-dimension-limit is an exclusive limit is probably for
> historic reasons.
>
> Maybe when Lisp was standardized, implementations did not agree
> whether or not this limit is inclusive or exclusive. If
> implementations don't agree about a parameter like this, what can
> you recommend to programmers? Treat it as exclusive, of course!
> That's the weaker, more portable assumption.
>
> Or maybe all of the implementations had the exclusive behavior, which
> would have been codified as-is.

Well, my guess is that it relates to how one typically uses a loop to
access the elements of an array.  Usually you iterate starting at 0 and
stop when the index is >= the dimension.  So, by making the maximum
dimension most-positive-fixnum minus 1, you can safely use fixnums to
iterate through the indices without worrying that the final increment
will create a bignum.

This, like the restriction to using only fixnums as indices, is in the
spec to facilitate efficient computation.

At least that is my take on this -- I wasn't there for any of the
standardization discussions.

--
Thomas A. Russ, USC/Information Sciences Institute
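A minimal sketch of the loop idiom Thomas describes (sum-vector is a
made-up name, purely for illustration):

  ;; Iterate with an exclusive upper bound.  LOOP's BELOW stops before
  ;; the length itself, and since every valid dimension is at most
  ;; most-positive-fixnum - 1, the index i stays a fixnum even at the
  ;; final increment that terminates the loop.
  (defun sum-vector (v)
    (let ((sum 0))
      (loop for i from 0 below (length v)
            do (incf sum (aref v i)))
      sum))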
From: Thomas A. Russ on 4 Dec 2009 13:01

Spiros Bousbouras <spibou(a)gmail.com> writes:

> > > 2. Why the requirement that the dimensions of an array are fixnums ?
> >
> > Why not? This is not a big restriction, if an implementation wants
> > bigger arrays, it only has to provide bigger fixnums.
>
> fixnums are supposed to be efficient , that's the criterion for
> choosing the size of fixnums not the maximum size of arrays.

Yes.  And you want to make sure array access is efficient.  Thus the
restriction to allowing only fixnums as indices.

> On the other hand the maximum size of arrays should be what the
> operating system can handle even if it means that part of the array
> would be in swap memory.

Um, the question of fixnum sizes and the physical/virtual memory sizes
are completely orthogonal.  As Pascal pointed out, most lisp systems
can generally index their entire address space (at a word level) using
their fixnums, so fixnums are sufficient to address all of memory.
Even if a lot of systems use 29 rather than 30 bits for fixnums, this
still gives you most of memory, and some copying GCs can only safely
use half of the virtual memory anyway, in order to guarantee enough
space to copy into....

Besides, it would be ludicrous to expect the fixnum size of a lisp
implementation to depend on the actual amount of physical RAM installed
in a computer, rather than on the processor (and software)
architecture.

> > Really, the reason is that Common Lisp is not purely a high level
> > programming language, it is specified in such a way that
> > implementations can easily be efficient. If you allowed bignums as
> > indexes for arrays, each AREF would imply a bignum multiplication,
> > which would be slow.
>
> Multiply what with what ?

How do you address a multi-dimensional array in RAM without
multiplying?  (An AREF on a rank-n array computes a row-major offset
from the n subscripts and the dimensions; see the sketch below.)

> > Moreover, there would be no point, because in general the size of
> > array slots is one word, and fixnum is all you need to index all
> > the words you can store in memory. (e.g. with a 32-bit processor,
> > a 30-bit fixnum is enough to index the whole memory.)
>
> Is it ?

Sure.  Why do you think it isn't?

> 5 , so ? In fact if the point for the limitation in the size of
> arrays is to never have a bignum as an array subscript then each
> dimension could be up to (1+ most-positive-fixnum) so the definition
> of ARRAY-DIMENSION-LIMIT should be different in order to allow for
> such a possibility.

Um, not really.  ARRAY-DIMENSION-LIMIT has to apply to all arrays,
including single-dimensional arrays that you might want to displace to
your multi-dimensional array.  I also suspect that ROW-MAJOR-AREF
plays into this limit.

There is a lot of interaction between the various parts of the
specification that you have to understand and consider when examining
the design decisions.  It isn't enough to focus on one small part
without considering how it impacts the language definition as a whole.

--
Thomas A. Russ, USC/Information Sciences Institute
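For concreteness, a sketch of that row-major arithmetic
(row-major-offset is a made-up name; the formula is the standard
row-major ordering, not any particular implementation's code):

  ;; Row-major offset of subscripts (s1 ... sn) in an array with
  ;; dimensions (d1 ... dn): ((...(s1*d2 + s2)*d3 + s3)...)*dn + sn.
  ;; One multiplication per dimension; with bignum subscripts or
  ;; dimensions, each could be a bignum multiplication.
  (defun row-major-offset (dimensions subscripts)
    (let ((offset 0))
      (loop for d in dimensions
            for s in subscripts
            do (setf offset (+ (* offset d) s)))
      offset))

  ;; (row-major-offset '(4 5) '(2 3)) => 13, matching
  ;; (array-row-major-index (make-array '(4 5)) 2 3).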
From: Tim Bradshaw on 4 Dec 2009 14:43
On 2009-12-04 18:01:06 +0000, tar(a)sevak.isi.edu (Thomas A. Russ) said:

> Yes. And you want to make sure array access is efficient. Thus the
> limitation on only allowing fixnums as indices

This is an important point.  People's expectations of arrays are that
they will be fast.  If array indexing involves consing bignums all over
the place, that is not likely to be true.

People using machines with relatively small word sizes where, by some
trick, a Lisp implementation can map much more memory than the machine
can address (I suppose this might be the case on some architecture with
long pointers but small integers) should still be able to assume that
array access is quick.  If they need much larger objects, they can use
arrays of arrays, for instance, as sketched below.

I can't imagine any case where restricting array indices to fixnums is
an actual limitation: are there any current systems where an individual
process can address more than a machine-word's worth of space?  (Note:
not where the system as a whole can have more physical memory than
that -- I think that was reasonably common on some later 32-bit
systems, wasn't it?)

--tim
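A minimal sketch of the arrays-of-arrays idea (make-big-vector,
big-ref, and +chunk-size+ are made-up names, purely for illustration):

  ;; A "big vector" stored as an array of fixed-size chunks.  TRUNCATE
  ;; splits a conceptually large index into (chunk, within-chunk)
  ;; parts, so each AREF only ever sees a small subscript.
  (defconstant +chunk-size+ (expt 2 20))  ; 1M elements per chunk

  (defun make-big-vector (n)
    (let* ((nchunks (ceiling n +chunk-size+))
           (chunks (make-array nchunks)))
      (dotimes (c nchunks chunks)
        (setf (aref chunks c)
              (make-array (min +chunk-size+ (- n (* c +chunk-size+))))))))

  (defun big-ref (big i)
    (multiple-value-bind (chunk offset) (truncate i +chunk-size+)
      (aref (aref big chunk) offset)))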