Snapshot of List [Was: Re: sync on local variable] [Java Help]


From: Eric Sosman on 26 Mar 2010 10:22

On 3/25/2010 3:28 PM, Daniel Pitts wrote:
> On 3/25/2010 11:06 AM, Eric Sosman wrote:
>> [... about iterating over a changing List ...]
>> to keep the
>> iteration self-consistent you might want to do something like
>> lock ALL_ROWS, grab a snapshot with toArray(), unlock, and run
>> the iteration on the (private, stable) snapshot.
>
> I agree except, don't use toArray, use new ArrayList<Row>();

Not confrontational, just curious: Why prefer a new ArrayList
to an array? To me, it appears that an ArrayList is just an array
wrapped up in extra machinery, and I can't see that the machinery
adds any value for this usage. So, why pay the extra freight?
What am I missing?

--
Eric Sosman
esosman(a)ieee-dot-org.invalid

From: Daniel Pitts on 26 Mar 2010 16:15

On 3/26/2010 7:22 AM, Eric Sosman wrote:
> On 3/25/2010 3:28 PM, Daniel Pitts wrote:
>> On 3/25/2010 11:06 AM, Eric Sosman wrote:
>>> [... about iterating over a changing List ...]
>>> to keep the
>>> iteration self-consistent you might want to do something like
>>> lock ALL_ROWS, grab a snapshot with toArray(), unlock, and run
>>> the iteration on the (private, stable) snapshot.
>>
>> I agree except, don't use toArray, use new ArrayList<Row>();
>
> Not confrontational, just curious: Why prefer a new ArrayList
> to an array? To me, it appears that an ArrayList is just an array
> wrapped up in extra machinery, and I can't see that the machinery
> adds any value for this usage. So, why pay the extra freight?
> What am I missing?
Array is a relatively primitive concept. Your question is akin to
asking why use a "Date object" instead of an "int representing the
milliseconds since Jan 1, 1970".

Either one "works". Yes, the non-primitive has more "machinery", but
that isn't automatically a bad thing. The primitive loses semantic
meaning outside of its context, where a properly designed abstraction
maintains its semantics regardless of context.

Lists are also easier to work with, and work in more places.

--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>

From: Arved Sandstrom on 26 Mar 2010 17:05

Daniel Pitts wrote:
> On 3/26/2010 7:22 AM, Eric Sosman wrote:
>> On 3/25/2010 3:28 PM, Daniel Pitts wrote:
>>> On 3/25/2010 11:06 AM, Eric Sosman wrote:
>>>> [... about iterating over a changing List ...]
>>>> to keep the
>>>> iteration self-consistent you might want to do something like
>>>> lock ALL_ROWS, grab a snapshot with toArray(), unlock, and run
>>>> the iteration on the (private, stable) snapshot.
>>>
>>> I agree except, don't use toArray, use new ArrayList<Row>();
>>
>> Not confrontational, just curious: Why prefer a new ArrayList
>> to an array? To me, it appears that an ArrayList is just an array
>> wrapped up in extra machinery, and I can't see that the machinery
>> adds any value for this usage. So, why pay the extra freight?
>> What am I missing?
> Array is a relatively primitive concept. Your question is akin to
> asking why use a "Date object" instead of an "int representing the
> milliseconds since Jan 1, 1970".
>
> Either one "works". Yes, the non-primitive has more "machinery", but
> that isn't automatically a bad thing. The primitive loses semantic
> meaning outside of its context, where a properly designed abstraction
> maintains its semantics regardless of context.
>
> Lists are also easier to work with, and work in more places.
>
I tend to use an array where the underlying structure I am representing
really is an array, and a collection where the underlying structure
really is a list or set or whatever.

Specifically, I am not going to use an ArrayList, for example, if the
thing being modelled is not resizable. Why would I want to write code to
ensure that the ArrayList will stay at a fixed size when an array will
already take care of that for me?
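
A minimal sketch of the fixed-size point, with made-up names (FixedSizeDemo, rejectsAdd, slots) that do not come from the thread. It also shows Arrays.asList, which nobody above mentions but which gives a fixed-size List view of an array, in case a List type is wanted without resizability:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
final class FixedSizeDemo {

    // Returns true if the list refuses to grow, false if add() succeeds.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // An array's length is fixed for its whole lifetime.
        String[] slots = new String[3];

        // An ArrayList copy of the same elements can grow freely.
        List<String> growable = new ArrayList<String>(Arrays.asList(slots));

        // Arrays.asList returns a fixed-size List backed by the array.
        List<String> fixedView = Arrays.asList(slots);

        System.out.println(slots.length);           // 3
        System.out.println(rejectsAdd(growable));   // false
        System.out.println(growable.size());        // 4
        System.out.println(rejectsAdd(fixedView));  // true
    }
}
```

With the array (or the fixed-size view) no extra code is needed to keep the size constant; with the ArrayList that guarantee would have to be enforced by convention or by a wrapper.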

AHS

From: Eric Sosman on 26 Mar 2010 17:33

On 3/26/2010 4:15 PM, Daniel Pitts wrote:
> On 3/26/2010 7:22 AM, Eric Sosman wrote:
>> On 3/25/2010 3:28 PM, Daniel Pitts wrote:
>>> On 3/25/2010 11:06 AM, Eric Sosman wrote:
>>>> [... about iterating over a changing List ...]
>>>> to keep the
>>>> iteration self-consistent you might want to do something like
>>>> lock ALL_ROWS, grab a snapshot with toArray(), unlock, and run
>>>> the iteration on the (private, stable) snapshot.
>>>
>>> I agree except, don't use toArray, use new ArrayList<Row>();
>>
>> Not confrontational, just curious: Why prefer a new ArrayList
>> to an array? To me, it appears that an ArrayList is just an array
>> wrapped up in extra machinery, and I can't see that the machinery
>> adds any value for this usage. So, why pay the extra freight?
>> What am I missing?
> Array is a relatively primitive concept. Your question is akin to
> asking why use a "Date object" instead of an "int representing the
> milliseconds since Jan 1, 1970".

I think you've over-generalized my question. I'm not
asking whether arrays or Lists (longs or Dates) are preferable
in all circumstances, nor even in most circumstances. Rather,
I'm asking why you prefer the "heavier" object in the particular
situation Roedy faces.

> Either one "works". Yes, the non-primitive has more "machinery", but
> that isn't automatically a bad thing.

Okay. I'd also say it's not automatically a good thing,
especially when the machinery is not going to be used.

How much extra machinery are we talking about, in this specific
case? I'm suggesting a call to toArray() followed by an iteration
over the array: One new object created. A new ArrayList involves
creating that same array (perhaps twice, but I don't know whether
that happens here; there's a bug number I haven't looked up), creating
the ArrayList itself, and creating an Iterator when you actually do
the traversal. Three (four?) new objects instead of one.

How heavy are the extra objects? The ArrayList carries two
fields, plus another inherited from AbstractList. The Iterator
carries four more fields (you'll only see three in the source,
but inner classes carry a hidden reference to their owner).
During the iteration, the Iterator carefully checks whether the
ArrayList has been modified; we know it will not have been (the
reason we made the snapshot was so no modifications would disturb
us). Each item retrieved takes three bounds checks (nominally)
instead of one: One in hasNext(), one in next(), and one in the
actual array fetch. And so on.
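
To make the traversal difference concrete, here is a rough sketch of what the two for-each loops expand to. It uses String elements in place of Row, and the class and method names (TraversalDemo, totalLengthArray, totalLengthList, iteratorNoticesChange) are invented for illustration. The last method only demonstrates that the modification check exists; it is not something the snapshot code would ever trigger.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; none of these names come from the thread.
final class TraversalDemo {

    // Approximately what "for (String s : items)" expands to for an array:
    // a plain index loop, with one bounds check on each items[i] fetch.
    static int totalLengthArray(String[] items) {
        int total = 0;
        for (int i = 0; i < items.length; i++) {
            total += items[i].length();
        }
        return total;
    }

    // Approximately what the same loop expands to for a List: an Iterator
    // object is created, and hasNext()/next() are called for every element.
    static int totalLengthList(List<String> items) {
        int total = 0;
        for (Iterator<String> it = items.iterator(); it.hasNext(); ) {
            total += it.next().length();
        }
        return total;
    }

    // Shows the modification check at work: adding to an ArrayList while
    // iterating over it makes the following call to next() fail fast.
    static boolean iteratorNoticesChange() {
        List<String> items = new ArrayList<String>(Arrays.asList("a", "bb"));
        try {
            for (String s : items) {
                items.add(s); // structural change during the traversal
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] array = {"a", "bb", "ccc"};
        List<String> list = new ArrayList<String>(Arrays.asList(array));
        System.out.println(totalLengthArray(array));  // 6
        System.out.println(totalLengthList(list));    // 6
        System.out.println(iteratorNoticesChange());  // true
    }
}
```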

Okay, okay, okay: Memory's cheap, CPUs are fast, only misers
count their change. But on the other hand, "Take care of the
pennies and the pounds will take care of themselves," and "Don't
use a cannon to kill a canary."

> The primitive loses semantic
> meaning outside of its context, where a properly designed abstraction
> maintains its semantics regardless of context.

The comparison is between an array and an ArrayList. The
only semantic difference I see between them is that the latter
can grow and shrink, while the former cannot. But in the case
at hand, the goal is to get a snapshot that will remain unchanged,
that is, to avoid growth and shrinkage. Again, it seems to me
we're paying for capabilities that will not be used.

> Lists are also easier to work with, and work in more places.

Once again, I return to the particular use in consideration.
The comparison is between

    Row[] rows;
    synchronized(ALL_ROWS) {
        rows = ALL_ROWS.toArray(new Row[ALL_ROWS.size()]);
    }
    for (Row r : rows) { ... }

and

    ArrayList<Row> rows;
    synchronized(ALL_ROWS) {
        rows = new ArrayList<Row>(ALL_ROWS);
    }
    for (Row r : rows) { ... }

There's only one "place" to consider, but still: Point to you
for being easier by five keystrokes. (With a simple change I
could get a fourteen-key swing and beat you by nine, but at the
cost of creating two arrays instead of one.)
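
For anyone who wants to run the comparison, here is a self-contained sketch of both snapshots. The Row class, the contents of ALL_ROWS, and the method names are stand-ins invented for the example. The third method is a guess at the "simple change" mentioned above: passing a zero-length array saves fourteen characters, but toArray() then has to allocate a second array of the right size.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; Row and ALL_ROWS are stand-ins for the thread's.
final class SnapshotDemo {

    // Minimal placeholder for the Row type, which the thread never shows.
    static final class Row {
        final int id;
        Row(int id) { this.id = id; }
    }

    // Assumed shared list. The synchronized wrapper locks on itself, so
    // synchronized (ALL_ROWS) excludes writers that go through the wrapper.
    static final List<Row> ALL_ROWS =
            Collections.synchronizedList(new ArrayList<Row>());

    // Array snapshot: one new array, sized up front.
    static Row[] snapshotAsArray() {
        synchronized (ALL_ROWS) {
            return ALL_ROWS.toArray(new Row[ALL_ROWS.size()]);
        }
    }

    // Shorter spelling (assumed to be the "simple change"): the empty
    // array is too small for a non-empty list, so a second one is created.
    static Row[] snapshotAsArrayShort() {
        synchronized (ALL_ROWS) {
            return ALL_ROWS.toArray(new Row[0]);
        }
    }

    // List snapshot: a new ArrayList holding a copy of the elements.
    static List<Row> snapshotAsList() {
        synchronized (ALL_ROWS) {
            return new ArrayList<Row>(ALL_ROWS);
        }
    }

    public static void main(String[] args) {
        ALL_ROWS.add(new Row(1));
        ALL_ROWS.add(new Row(2));

        Row[] array = snapshotAsArray();
        List<Row> list = snapshotAsList();

        // A later change to the shared list does not touch either snapshot.
        ALL_ROWS.add(new Row(3));
        System.out.println(array.length + " " + list.size() + " " + ALL_ROWS.size()); // 2 2 3
    }
}
```

Either snapshot can then be traversed outside the lock with the same for-each loop.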

Even though my first language was FORTRAN, I have no special
love for the array nor any special antipathy for the List. I'm
happy to use either. But I *do* have a preference for lighter-
weight gadgets when they're adequate for the purpose at hand, and
a Yankee's aversion to paying for unused extras. YM[*]MV.

[*] Motivation.

--
Eric Sosman
esosman(a)ieee-dot-org.invalid

From: Arne Vajhøj on 26 Mar 2010 20:59

On 26-03-2010 10:22, Eric Sosman wrote:
> On 3/25/2010 3:28 PM, Daniel Pitts wrote:
>> On 3/25/2010 11:06 AM, Eric Sosman wrote:
>>> [... about iterating over a changing List ...]
>>> to keep the
>>> iteration self-consistent you might want to do something like
>>> lock ALL_ROWS, grab a snapshot with toArray(), unlock, and run
>>> the iteration on the (private, stable) snapshot.
>>
>> I agree except, don't use toArray, use new ArrayList<Row>();
>
> Not confrontational, just curious: Why prefer a new ArrayList
> to an array? To me, it appears that an ArrayList is just an array
> wrapped up in extra machinery, and I can't see that the machinery
> adds any value for this usage. So, why pay the extra freight?
> What am I missing?

As a general rule:

1. know that more functionality than array will be needed in the future
   => pick ArrayList

2. know that more functionality than array will NOT be needed in the
   future => pick array

3. don't know if more functionality than array will be needed in the
   future => pick ArrayList

My assumption would be that in the real world those three cases split
roughly 10%-10%-80%.

Obviously you can argue that this case seems to fall in category #2.

Arne
