Prev: The future of Java
Next: weird issue with new lines
From: Thomas Pornin on 23 Nov 2009 09:25 According to Lew <noone(a)lewscanon.com>: > Jon Harrop wrote: > > I believe the JVM is doing orders of magnitude more allocations than .NET > > here due to its type erasure approach to generics. > > Type erasure has no effect on your original problem. > > Type erasure has no effect on the number of allocations, let alone > "orders of magnitude". He wrote "type erasure approach to generics". He did not wrote that type erasure itself was responsible. Here, the extra allocations are due to the non-specialization of code at runtime: the container code (e.g. HashMap) has no knowledge of the contained element type. The root decision is for backward compatibility and mixing of pre-generics source code and compiled code with newer code within the same application. A consequence is that code cannot be specialized at runtime depending on the type parameters, which forces the Set or Map implementation to handle the float values as 'Object' instances, implying boxing (which are dynamic allocations, regardless of how you wish to name them). Another consequence of that decision of backward compatibility is the use of type erasure as a compilation gimmick to make generic types available at the syntax level. Naming this decision of backward compatibility the "type erasure approach to generics" is not illegitimate; syntax is what is seen and type erasure was duly advertised as the core technology which enabled use of generic types while maintaining backward compatibility. As far as terminology goes, "type erasure approach" is not bad here. Now type non-erasure does not automatically imply code specialization. It allows it. .NET uses code specialization (well, at least some implementations of the CLR do). This is what is shown here: in some corner cases, code specialization may provide a measurable performance boost. --Thomas Pornin
From: Jon Harrop on 23 Nov 2009 10:58 Patricia Shanahan wrote: > My reasoning is that you never reuse a key, so every put call creates a > new Entry instance. Note that an "Entry instance" is a value type on the CLR, something that the JVM is incapable of expressing. > Creating a Double from a double is about as simple > as object creation can be, Note that there is no object creation on the CLR and, indeed, I believe that is precisely why it is so much faster. > so I don't see how the boxing could to more > than triple the time spent in object creation during an average put > call. That cannot, by itself, account for a 32x performance ratio. I believe the difference is that the CLR does no boxing whatsoever (not of doubles and not of entries) whereas the JVM boxes everything. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?u
From: Patricia Shanahan on 23 Nov 2009 09:45 Jon Harrop wrote: > Patricia Shanahan wrote: >> It might help to show your .NET program as well. If people ran both it >> and the Java version on a few different systems, we could see how much, >> if any, of the performance difference is configuration-specific. > > Sure, here's the F# code for .NET: > > let n = 10000000 > let m = System.Collections.Generic.Dictionary(n) > for i=1 to n do > m.[float i] <- 1.0 / float i > printf "%d %g\n" m.Count m.[100.0] > > That takes around 1s and the Java code takes around 32s here. > What is the size of an F# float? Patricia
From: Jon Harrop on 23 Nov 2009 11:00 Lew wrote: > Tom Anderson wrote: >>> I'd be *very* surprised if that was true. In this simple program, escape >>> analysis could eliminate the locking entirely - and current versions of >>> JDK 1.6 do escape analysis. > > The OP is not using a current version of Java 6. > Jon Harrop wrote: >>> $ java -version >>> java version "1.6.0_13" >>> Java(TM) SE Runtime Environment (build 1.6.0_13-b03) >>> Java HotSpot(TM) Server VM (build 11.3-b02, mixed mode) > > According to > <http://java.sun.com/javase/6/webnotes/6u14.html> >> Optionally available are two new features - >> escape analysis and compressed object pointers. > > Which implies strongly that escape analysis, being "new" in 6u14, was not > available in 6u13. Even then, as Marcin Rzeźnicki wrote: >> ... escape analysis is turned off by default. > ... >> Well, I am more inclined to believe that his VM somehow did not >> perform lock optimization. Am I correct in thinking that escape analysis might result in unboxing of local values in functions but it will never result in unboxed data structures on the heap? For example, it cannot unbox the keys and values in a hash table? -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?u
From: Patricia Shanahan on 23 Nov 2009 09:50
Jon Harrop wrote: > Martin Gregorie wrote: >> Did you try setting the initial capacity to 1000? > > I just tried setting it to the final size of the hash table (10M) and the > time actually got slightly worse. > >> I've had terrible performance from C programs that malloced for every one >> of a huge number of little items it wanted to put on the heap. I'd hope >> any JVM was better than that but you never know.... Besides, its quite >> possible NET allocates bigger chunks since Windows memory allocation was/ >> is quite slow. > > I believe the JVM is doing orders of magnitude more allocations than .NET > here due to its type erasure approach to generics. > I don't think type erasure or generics have anything to do with it. On a non-rehashing call, the JVM would do three allocations, one of which, for the Entry, would have been required regardless of the types. The other two are due the lack of primitive-based implementations of the java.util collections, causing two autobox conversions from double to Double. The .NET implementation may have a different way of storing the hash chains that avoids allocating an object for each put call. In addition, there is the issue of the number of rehashes. Each Java rehash call only involves one object allocation, but a lot of memory access and computation. That is, the issues I see as being candidates for the problem are the non-support of primitive collections, and other issues in the design of the F# and Java collections. Patricia |