From: Lew on
Wayne wrote:
> A reduction in the number of page faults.  There was an interesting article about
> this topic in this month's Communications of the ACM, by Poul-Henning Kamp, who
> was one of the lead developers of the FreeBSD kernel.  He applied his insight
> to a web proxy replacement for Squid called Varnish, and was able to replace
> 12 Squid machines with 3 Varnish ones.  It used a modified binary heap he called
> a B-heap, which respected the page size of memory.  The article was titled
> "You're Doing It Wrong".  The message I came away with was, don't ignore the
> fact that computers use paging when designing large data structures.  I was
> thinking that lesson might apply to the OP's situation.
>

How big is a page for a Java program?

Be sure to account for use on multiple architectures with and without
"-XX:LargePageSizeInBytes=??".

--
Lew
From: Lew on
Boris Punk wrote:
> Is it not as simple as assigning int as 64 bit and long as 128 bit in newer
> versions?
>

That depends. For whom would it be simple, and for whom not?

How much legacy code are you willing to break?

What about folks who suddenly experience performance degradation
because their existing structures suddenly don't fit in on-chip cache
any more, whose programs are suddenly twice as large, and for whom
memory accesses overall suddenly take twice as long?

Whether they need the extra capacity or not?

Much safer would be the suggestions upthread of LongCollection classes
and perhaps a new "longarray" type.

There's no question that larger structures will be needed sometimes,
but the need will not be as widespread as some here appear to think.
The pressure to have more RAM on a machine, where 640K used to be
thought plenty and is now laughably microscopic, had nothing to do
with the size of int or much to do with the size of individual data
structures, but with the number of distinct structures and logic
contained within a program and the number of programs that must run
concurrently. Even with the clear need for gigabytes to terabytes of
RAM on machines today, pretty much none of that yet is for any
building-block data structure to exceed a gigabyte or two.

Even today, as pointed out upthread, Java is perfectly capable of
handling data structures that exceed 2 GB in size, just not in a
single array or int-based structure (or in 32-bit mode). You have to
go with multi-dimensional arrays or Collection<Collection> types of
structures, but they will work.
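
A minimal sketch of that array-of-arrays idea, for illustration only. The
class name ChunkedIntArray, its constructor parameters, and the chunking
scheme are assumptions made for this example, not an existing library class.
It splits a long index into a chunk number (high bits) and an offset within
the chunk (low bits):

```java
/** Hypothetical long-indexed int array backed by an array of int[] chunks. */
final class ChunkedIntArray {
    private final int chunkBits;   // log2 of the chunk length
    private final int chunkMask;   // low-bit mask selecting the offset in a chunk
    private final long length;     // logical number of elements
    private final int[][] chunks;

    ChunkedIntArray(long length, int chunkBits) {
        if (length < 0 || chunkBits < 1 || chunkBits > 30) {
            throw new IllegalArgumentException("bad length or chunkBits");
        }
        this.length = length;
        this.chunkBits = chunkBits;
        this.chunkMask = (1 << chunkBits) - 1;
        long chunkSize = 1L << chunkBits;
        // Number of chunks, rounding up for a partial last chunk.
        long chunkCount = (length >>> chunkBits) + ((length & chunkMask) != 0 ? 1 : 0);
        if (chunkCount > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("too many chunks");
        }
        chunks = new int[(int) chunkCount][];
        for (int i = 0; i < chunks.length; i++) {
            long remaining = length - ((long) i << chunkBits);
            chunks[i] = new int[(int) Math.min(chunkSize, remaining)];
        }
    }

    long length() {
        return length;
    }

    int get(long index) {
        checkIndex(index);
        return chunks[(int) (index >>> chunkBits)][(int) (index & chunkMask)];
    }

    void set(long index, int value) {
        checkIndex(index);
        chunks[(int) (index >>> chunkBits)][(int) (index & chunkMask)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    public static void main(String[] args) {
        // Tiny chunks (4 elements each) so the demo needs almost no memory.
        ChunkedIntArray a = new ChunkedIntArray(10, 2);
        a.set(9, 42);
        System.out.println(a.get(9) + " of " + a.length());
    }
}
```

The demo uses 4-element chunks so it runs anywhere. With a larger chunkBits
and enough heap on a 64-bit JVM, the same addressing should allow more
elements than a single int-indexed array can hold, at the cost of one extra
indirection per access.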

I predict that even when the need for multi-gigabyte arrays becomes
widely acknowledged, the actual use cases for them will remain
quite specialized. I venture to say further that the OP doesn't need
one yet.

--
Lew
From: Tom Anderson on
On Thu, 8 Jul 2010, Eric Sosman wrote:

> Or, you could have BigList implement List but "lie" in its .size()
> method, in somewhat the same way TreeSet "lies" about the Set contract.

How does TreeSet lie about the Set contract?

tom

--
The world belongs to the mathematics and engineering. The world is as
it is. -- Luis Filipe Silva vs Babelfish
From: Patricia Shanahan on
On 7/9/2010 12:45 PM, Tom Anderson wrote:
> On Thu, 8 Jul 2010, Eric Sosman wrote:
>
>> Or, you could have BigList implement List but "lie" in its .size()
>> method, in somewhat the same way TreeSet "lies" about the Set contract.
>
> How does TreeSet lie about the Set contract?

The case I'm aware of involves a TreeSet with a Comparator, that is not
consistent with the .equals methods of the TreeSet elements. The TreeSet
always goes by the Comparator results. That means the TreeSet could
contain elements a and b such that a.equals(b).
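
A small sketch of that case, for illustration. The Item class and the by-id
Comparator are hypothetical names invented for this example; equals() looks
only at the name, while the Comparator looks only at the id, so the two
disagree:

```java
import java.util.Comparator;
import java.util.TreeSet;

final class TreeSetEqualsDemo {
    /** Hypothetical element type: equals() considers only the name. */
    static final class Item {
        final int id;
        final String name;

        Item(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Item && ((Item) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    /** Builds a TreeSet ordered by id, which is inconsistent with equals(). */
    static TreeSet<Item> build(Item... items) {
        Comparator<Item> byId = Comparator.comparingInt(it -> it.id);
        TreeSet<Item> set = new TreeSet<>(byId);
        for (Item it : items) {
            set.add(it);
        }
        return set;
    }

    public static void main(String[] args) {
        Item a = new Item(1, "x");
        Item b = new Item(2, "x");
        TreeSet<Item> set = build(a, b);
        // a.equals(b) is true, yet the set keeps both, because the
        // Comparator says they differ.
        System.out.println(a.equals(b) + " " + set.size());
    }
}
```

A HashSet given the same two items would keep only one of them, since it
goes by equals() and hashCode() rather than by the Comparator.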

Patricia
From: Roedy Green on
On Thu, 8 Jul 2010 23:22:03 +0100, "Boris Punk" <khgfhf(a)hmjggg.com>
wrote, quoted or indirectly quoted someone who said :

>
>Is there no BigList/BigHash in Java?

The tricky thing is, you have no giant arrays to use in
implementation.

If you want BigHash, I could write you one for a fee. It would use
arrays of arrays for addressing inside.
--
Roedy Green Canadian Mind Products
http://mindprod.com

You encapsulate not just to save typing, but more importantly, to make it easy and safe to change the code later, since you then need change the logic in only one place. Without it, you might fail to change the logic in all the places it occurs.