From: Arne Vajhøj on
On 09-07-2010 13:46, Lew wrote:
> Boris Punk wrote:
>> Is it not as simple as assigning int as 64 bit and long as 128 bit in newer
>> versions
>
> That depends. For whom would it be simple, and for whom not?
>
> How much legacy code are you willing to break?
>
> What about folks who suddenly experience performance degradation
> because their existing structures suddenly don't fit in on-chip cache
> any more, whose programs are suddenly twice as large, and for whom
> memory accesses overall suddenly take twice as long?
>
> Whether they need the extra capacity or not?

That type of optimization would break anyway over time.

Code lives a lot longer than hardware.

It is not realistic to assume identical or almost identical
performance characteristics over the lifetime of code.

That is one of the reasons why micro-optimization is rather pointless.

> Even today, as pointed out upthread, Java is perfectly capable of
> handling data structures that exceed 2 GB in size, just not in a
> single array or int-based structure (or in 32-bit mode). You have to
> go with multi-dimensional arrays or Collection<Collection> types of
> structures, but they will work.
>
> I predict that even when the need for multi-gigabyte arrays becomes
> widely acknowledged that the actual use cases for them will remain
> quite specialized. I venture to say further that the OP doesn't need
> one yet.

I agree with that one.

Such huge arrays seem mostly to be a scientific computing wish.
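
Lew's point upthread about multi-dimensional arrays and Collection<Collection> structures boils down to splitting one logical long index into two int indexes. A minimal sketch of that arithmetic; the chunk size here is arbitrary, chosen as a power of two purely for illustration:

```java
// Sketch of the array-of-arrays addressing Lew describes: a logical
// long index is split into an outer index (which chunk) and an inner
// index (offset within the chunk), both of which fit in an int.
public class ChunkedIndex {
    static final int CHUNK = 1 << 20; // elements per inner array

    static int outer(long index) { return (int) (index / CHUNK); }
    static int inner(long index) { return (int) (index % CHUNK); }
}
```

With int-indexed inner arrays of 2^20 elements each, an outer array of up to 2^31 chunks could in principle address about 2^51 elements, so the limit moves from the language to the hardware.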

Arne

From: Arne Vajhøj on
On 09-07-2010 02:05, Mike Schilling wrote:
> "Arne Vajhøj" <arne(a)vajhoej.dk> wrote in message
> news:4c3655fd$0$283$14726298(a)news.sunsite.dk...
>> On 08-07-2010 18:22, Boris Punk wrote:
>>> Is there no BigList/BigHash in Java?
>>
>> No.
>>
>> But you can have a List<List<X>> which can then
>> store 4*10^18 X'es.
>>
>
> Or you could pretty easily build a class like
>
> public class BigArray<T>
> {
>     T get(long index);
>     void set(long index, T value);
> }
>
> backed by a two-dimensional array.[1] The reason I prefer to say "array"
> rather than "List" is that random access into a sparse List is a bit
> dicey, while arrays nicely let you set index 826727 even if you haven't
> touched any of the earlier indices yet, and will tell you that the entry
> at 6120584 is null, instead of throwing an exception.

Sure.

But collections have the advantage of dynamic size.

I think the general trend is from arrays to collections.

With scientific computing as a major exception.
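
Mike's BigArray outline could be fleshed out roughly like this. It is only a sketch: the chunk size, the lazy allocation, and everything beyond the two methods he named are choices made here for illustration, not anything from the thread:

```java
// A rough sketch of Mike's BigArray idea, backed by a two-dimensional
// array. Chunk size and internals are illustrative only.
public class BigArray<T> {
    private static final int CHUNK = 1 << 20; // elements per inner array

    private final Object[][] chunks;

    public BigArray(long size) {
        int outer = (int) ((size + CHUNK - 1) / CHUNK);
        chunks = new Object[outer][];
    }

    @SuppressWarnings("unchecked")
    public T get(long index) {
        Object[] chunk = chunks[(int) (index / CHUNK)];
        // Like an array, an untouched entry reads as null
        // rather than throwing an exception.
        return chunk == null ? null : (T) chunk[(int) (index % CHUNK)];
    }

    public void set(long index, T value) {
        int outer = (int) (index / CHUNK);
        if (chunks[outer] == null) {
            chunks[outer] = new Object[CHUNK]; // allocate chunks lazily
        }
        chunks[outer][(int) (index % CHUNK)] = value;
    }
}
```

Because a chunk is only allocated when something in its range is set, you can set index 826727 without touching any earlier index, and reading the entry at 6120584 yields null instead of an exception, which matches the behavior Mike describes.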

Arne
From: Arne Vajhøj on
On 09-07-2010 10:31, Patricia Shanahan wrote:
> On 7/9/2010 5:15 AM, Eric Sosman wrote:
>> On 7/8/2010 9:11 PM, Patricia Shanahan wrote:
>>> Arne Vajhøj wrote:
>>>> On 08-07-2010 17:35, Boris Punk wrote:
>>>>> Integer.MAX_VALUE = 2147483647
>>>>>
>>>>> I might need more items than that. I probably won't, but it's nice to
>>>>> have
>>>>> extensibility.
>>>>
>>>> It is a lot of data.
>>>>
>>>> I think you should assume YAGNI.
>>>
>>> Historically, each memory size has gone through a sequence of stages:
>>>
>>> 1. Nobody will ever need more than X bytes.
>>>
>>> 2. Some people do need to run multiple jobs that need a total of more
>>> than X bytes, but no one job could possibly need that much.
>>>
>>> 3. Some jobs do need more than X bytes, but no one data structure could
>>> possibly need that much.
>>>
>>> 4. Some data structures do need more than X bytes.
>>>
>>> Any particular reason to believe 32 bit addressing will stick at stage
>>> 3, and not follow the normal progression to stage 4?
>>
>> None. But Java's int isn't going to grow wider, nor will the
>> type of an array's .length suddenly become non-int; too much code
>> would break. When Java reaches the 31-bit wall, I doubt it will
>> find any convenient door; Java's descendants may pass through, but
>> I think Java will remain stuck on this side.
>>
>> In ten years, we'll all have jobs converting "legacy Java code"
>> to Sumatra.
>
> I don't think the future for Java is anywhere near as bleak as you paint
> it.
>
> The whole collections issue could be handled by creating a parallel
> hierarchy based on java.util.long_collections (or something similar for
> those who don't like separating words in package names). It would
> replicate the class names in the java.util hierarchy, but with long
> replacing int wherever necessary to remove the size limits. It could be
> implemented, using arrays of arrays where necessary, without any JVM
> changes.
>
> To migrate a program to the new collections one would first change the
> import statements to pick up the new packages, and then review all int
> declarations to see if they should be long. Many of the ones that need
> changing would show up as errors.

The collections issue is certainly solvable.
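
Patricia's parallel hierarchy could indeed be prototyped without any JVM changes. A rough sketch of what one such class might look like; every name and detail here (LongArrayList, the chunk size) is invented for illustration, not a real or proposed API:

```java
import java.util.ArrayList;

// Hypothetical sketch of a long-indexed List built on nested
// ArrayLists, needing no JVM changes. The signatures mirror
// java.util.List but with long replacing int where sizes appear.
public class LongArrayList<E> {
    private static final int CHUNK = 1 << 20; // elements per inner list

    private final ArrayList<ArrayList<E>> chunks = new ArrayList<>();
    private long size;

    public long size() { return size; } // long, not int

    public void add(E element) {
        if (size % CHUNK == 0) {
            chunks.add(new ArrayList<>()); // start a new chunk
        }
        chunks.get((int) (size / CHUNK)).add(element);
        size++;
    }

    public E get(long index) {
        return chunks.get((int) (index / CHUNK))
                     .get((int) (index % CHUNK));
    }
}
```

Migrating code to such a hierarchy would then mostly be the import-and-declaration review Patricia describes, since calls like size() returning long instead of int would surface as compile errors.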

> Arrays are a worse problem, requiring JVM changes. The size field
> associated with an array would have to be long. There would also need to
> be a new "field" longLength. Attempts to use arrayRef.length for an
> array with more than Integer.MAX_VALUE elements would throw an
> exception. arrayRef.length would continue to work for small arrays for
> backwards compatibility.
>
> I suspect Eclipse would have "Source -> Long Structures" soon after the
> first release supporting this, and long before most programs would need
> to migrate.

It is not a perfect solution.

When calling a library, some array parameters would have to be marked
with something like @SmallArray to indicate that you cannot pass a
big array, because the method reads length.

There may be other problems that I cannot think of.
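
The @SmallArray marker mentioned above could, hypothetically, be an ordinary parameter annotation that tools check. A minimal sketch, with everything here invented for illustration; no such annotation exists in any real library:

```java
// Purely hypothetical sketch of a @SmallArray marker: a parameter
// annotation that a static checker could use to flag calls passing
// arrays longer than Integer.MAX_VALUE elements.
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
public @interface SmallArray { }
```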

Arne

From: Arne Vajhøj on
On 09-07-2010 08:15, Eric Sosman wrote:
> On 7/8/2010 9:11 PM, Patricia Shanahan wrote:
>> Arne Vajhøj wrote:
>>> On 08-07-2010 17:35, Boris Punk wrote:
>>>> Integer.MAX_VALUE = 2147483647
>>>>
>>>> I might need more items than that. I probably won't, but it's nice to
>>>> have
>>>> extensibility.
>>>
>>> It is a lot of data.
>>>
>>> I think you should assume YAGNI.
>>
>>
>> Historically, each memory size has gone through a sequence of stages:
>>
>> 1. Nobody will ever need more than X bytes.
>>
>> 2. Some people do need to run multiple jobs that need a total of more
>> than X bytes, but no one job could possibly need that much.
>>
>> 3. Some jobs do need more than X bytes, but no one data structure could
>> possibly need that much.
>>
>> 4. Some data structures do need more than X bytes.
>>
>> Any particular reason to believe 32 bit addressing will stick at stage
>> 3, and not follow the normal progression to stage 4?
>
> None. But Java's int isn't going to grow wider, nor will the
> type of an array's .length suddenly become non-int; too much code
> would break. When Java reaches the 31-bit wall, I doubt it will
> find any convenient door; Java's descendants may pass through, but
> I think Java will remain stuck on this side.
>
> In ten years, we'll all have jobs converting "legacy Java code"
> to Sumatra.

If Java gets 20 years as "it" and 20 years as "legacy", then
that would actually be more than OK.

Things evolve and sometimes it is better to start with a
blank sheet of paper.

64-bit array indexes, functions as a first-class type, bigint and
bigdecimal as language types, etc.

Arne
From: Arne Vajhøj on
On 09-07-2010 19:56, Roedy Green wrote:
> On Fri, 09 Jul 2010 08:49:27 -0400, Eric Sosman
> <esosman(a)ieee-dot-org.invalid> wrote, quoted or indirectly quoted
> someone who said :
>>> Arrays can only be indexed by ints, not longs. Even if they were,
>>> even Bill Gates could not afford enough RAM for an array of bytes, one
>>> for each possible long.
>>
>> True, but not especially relevant: You'll hit the int limit long
>> before running out of dollars. $50US will buy more RAM than a Java
>> byte[] can use.
>
> I don't think our two posts conflict.

They don't, if we assume that your post was irrelevant to the
thread.

Arne