From: Jon Harrop on 23 Nov 2009 15:20
Roedy Green wrote:
> Could you elaborate a bit on what value types would look like if they
> existed in Java and how you would use them for efficient hash tables?

Sure. They're basically just structs from C or C++ and they make the set of primitive types (that don't get boxed) user extensible. They're used for various things including hash table entries (hash, key and value) and vertex data (texture coords, vertex coords, normal vectors) in a GPU-friendly format.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
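Java has no user-defined value types, but the effect described above can be approximated by hand. The following is only a rough sketch, assuming a fixed-capacity open-addressing table with linear probing; the class name `FlatDoubleMap` and its methods are hypothetical and do not come from the thread. Keys and values live in flat primitive arrays, so inserting an entry allocates no `Double` or entry objects.

```java
/**
 * Sketch: a double -> double map backed by parallel primitive arrays.
 * Hypothetical illustration; fixed capacity, no resizing, no removal.
 */
final class FlatDoubleMap {
    private final double[] keys;
    private final double[] values;
    private final boolean[] used;   // marks occupied slots
    private final int mask;
    private int size;

    /** Capacity must be a power of two so that (hash & mask) is a valid index. */
    FlatDoubleMap(int capacityPowerOfTwo) {
        if (Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        keys = new double[capacityPowerOfTwo];
        values = new double[capacityPowerOfTwo];
        used = new boolean[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    /** Linear probing: returns the slot holding the key, or the first empty slot. */
    private int slotFor(double key) {
        int h = Double.hashCode(key);
        h ^= (h >>> 16);            // fold high bits into the index bits
        int i = h & mask;
        while (used[i] && Double.compare(keys[i], key) != 0) {
            i = (i + 1) & mask;
        }
        return i;
    }

    void put(double key, double value) {
        int i = slotFor(key);
        if (!used[i]) {
            // Always keep one slot free so that probing terminates.
            if (size == mask) {
                throw new IllegalStateException("table full");
            }
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    /** Returns the mapped value, or NaN if the key is absent. */
    double get(double key) {
        int i = slotFor(key);
        return used[i] ? values[i] : Double.NaN;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        FlatDoubleMap m = new FlatDoubleMap(1 << 11);
        for (int i = 1; i <= 1000; i++) {
            m.put(i, 1.0 / i);
        }
        System.out.println("hashtable(100.0) = " + m.get(100.0));
    }
}
```

The trade-off is that this is specialised to one key/value type pair; a generic `HashMap<Double, Double>` cannot be laid out this way because of erasure.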
From: Jon Harrop on 23 Nov 2009 15:23
Roedy Green wrote:
> On Mon, 23 Nov 2009 12:51:58 +0000, Jon Harrop <jon(a)ffconsultancy.com>
> wrote, quoted or indirectly quoted someone who said :
>>I believe the JVM is doing orders of magnitude more allocations than .NET
>>here due to its type erasure approach to generics.
>
> I think you mean the extra overhead of boxing and unboxing, or perhaps
> the extra overhead of allocating independent objects, rather than
> plopping them into a single array the way you would with primitives.

Yes.

> You could find out the boxing/allocation overhead by preboxing your
> elements, and saving the unboxing until you had waved the flag.

Good idea!

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
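The preboxing experiment suggested above might look roughly like the sketch below. It assumes the benchmark inserts i -> 1/i into a `HashMap<Double, Double>` (inferred from the `hashtable(100.0) = 0.01` output seen elsewhere in the thread); the class and method names are hypothetical. One method boxes inside the timed loop, the other boxes everything before the timer starts, so the difference between the two is an estimate of the boxing cost.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: compare inserts that box inside the timed loop with preboxed inserts. */
final class PreboxTiming {

    /** Boxes key and value inside the timed loop; returns elapsed nanoseconds. */
    static long timeBoxingInLoop(int n, Map<Double, Double> map) {
        long start = System.nanoTime();
        for (int i = 1; i <= n; i++) {
            double x = i;
            map.put(x, 1.0 / x);    // autoboxing happens here
        }
        return System.nanoTime() - start;
    }

    /** Boxes everything before the timer starts; returns elapsed nanoseconds. */
    static long timePreboxed(int n, Map<Double, Double> map) {
        Double[] keys = new Double[n];
        Double[] vals = new Double[n];
        for (int i = 0; i < n; i++) {
            double x = i + 1;
            keys[i] = x;            // boxed outside the timed region
            vals[i] = 1.0 / x;
        }
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            map.put(keys[i], vals[i]);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 1_000_000;          // arbitrary size chosen for this sketch
        long boxed = timeBoxingInLoop(n, new HashMap<>());
        long preboxed = timePreboxed(n, new HashMap<>());
        System.out.println("boxing in loop: " + boxed / 1_000_000 + " ms");
        System.out.println("preboxed:       " + preboxed / 1_000_000 + " ms");
    }
}
```

Single-shot timings like these are noisy because of JIT warm-up and GC scheduling, so the numbers are only indicative. Preboxing also does not remove the `HashMap` entry allocation, only the `Double` allocations.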
From: Marcin Rzeźnicki on 23 Nov 2009 14:11
On 23 Nov, 21:09, Jon Harrop <j...(a)ffconsultancy.com> wrote:
> Marcin Rzeźnicki wrote:
> > Oh yes, conclusions:
> > Taking Jon's 32s of the execution time he could have saved around 3-4s
> > had he preallocated HashMap.
>
> I already posted results using preallocation: it was no faster.
>
> > He actually did that in his F# so this
> > modification alone might have caused F# version to run in, let's say,
> > 28s.
>
> Are you speculating that F# would take 28x longer had I not preallocated?
>
> Measuring it, F# takes 1.24s if you don't preallocate and 0.945s if you do.
>

No, see my answer below.

> > He, of course, could not eliminate boxing which might have taken
> > around 10s of his original execution time. So subtracting costs of
> > boxing from implied theoretical F# version's execution time we end up
> > with conclusion that F# should have executed in ~18s (which is
> > erroneous procedure in itself because F# probably copies values from
> > stack). Roughly 1:2 in favor of F#.
>
> Eh?
>

I started from the 32s execution time you reported for your Java program and subtracted all the costs that were suggested not to affect the F# version (preallocation and boxing). This is quite inaccurate because of the different algorithms, environments, etc., but it let me estimate what F#'s execution time could have been IF Java's was 32s, as you observed. The result, as you have seen, is that even if F# paid no additional cost in execution time for value type copying etc., it should execute about 2 times faster than the Java version. So the only logical conclusion is that 32 times faster is something unusual.
From: Jon Harrop on 23 Nov 2009 15:30
markspace wrote:
> Marcin Rzeźnicki wrote:
>> That could well be hidden in GC/heap resizing costs if he did not
>> allocate Java heap properly. I prevented these effects mostly from
>> occurring by running this example with -Xms512m -Xmx512m.
>
> I ran my test (18 seconds for Jon's code on a 32 bit laptop)

This is a 2x quadcore 2.0GHz Intel Xeon E5405. What is your CPU?

> with -Xmx800m. Which is why I keep saying that Jon should look at his JVM
> flags before trying anything else.

Doesn't seem to make any difference:

$ time java Hashtbl
hashtable(100.0) = 0.01

real 0m35.749s
user 1m53.287s
sys 0m3.208s

$ time java Hashtbl -Xms512m -Xmx512m
hashtable(100.0) = 0.01

real 0m36.336s
user 2m5.896s
sys 0m3.216s

$ time java Hashtbl -Xmx800m
hashtable(100.0) = 0.01

real 0m34.676s
user 1m55.723s
sys 0m3.332s

$ time java -server -Xmx800m Hashtbl
hashtable(100.0) = 0.01

real 0m37.198s
user 1m55.967s
sys 0m3.220s

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
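A note on the command lines above: the `java` launcher takes its options before the class name (`java [options] mainclass [args...]`). Anything after the class name is handed to `main` as a program argument, so only the last of the four runs passes `-Xmx800m` to the JVM itself. A small check along the following lines can confirm which settings a given run actually received. It is a sketch only; the class name `HeapCheck` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Sketch: report the heap limit and options the running JVM actually received. */
final class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    /** Options passed to the JVM itself (not the arguments passed to main). */
    static List<String> jvmOptions() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("JVM options:    " + jvmOptions());
        System.out.println("program args:   " + String.join(" ", args));
    }
}
```

Running it as `java HeapCheck -Xmx800m` and then as `java -Xmx800m HeapCheck` should show the flag under "program args" in the first case and under "JVM options" in the second.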
From: Jon Harrop on 23 Nov 2009 15:31
Patricia Shanahan wrote:
> Marcin Rzeźnicki wrote:
>> Now, the interesting part is
>> 30.3% of time spent in java.lang.Double.valueOf(double) <--- that's
>> boxing
>> Furthermore, there were 2m + 1 calls to new Double meaning that no
>> caching occurred.
>
> Interesting results. That is about what I would expect. If we were
> trying to explain a 30% performance difference I would seriously
> consider autoboxing as cause. If creating an Entry takes about as much
> time as creating a Double, object creation could account for up to 45%
> of the Java time.
>
> The problem is explaining a 32x performance difference. The cause or
> causes have to account for over 96% of the Java time.

I think the bottleneck must be the GC by a long long way. Look at this:

$ time java Hashtbl -Xmx800m
hashtable(100.0) = 0.01

real 0m34.676s
user 1m55.723s
sys 0m3.332s

The CPU time taken is actually ~100x worse than for F#. Given that this is a serial benchmark, that parallelism could only have come from the GC.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
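In the timing above, user plus sys comes to about 119s against 34.7s of wall-clock time, so roughly 3.4 cores were busy on average. One way to test the GC hypothesis directly is to ask the JVM how much time its collectors report. The sketch below uses the standard `GarbageCollectorMXBean` interface; the class name `GcShare` and the workload (inserting i -> 1/i into a `HashMap<Double, Double>`) are assumptions made for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

/** Sketch: estimate how much of a run's elapsed time the collectors report. */
final class GcShare {

    /** Accumulated collection time in ms over all collectors (undefined values skipped). */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Assumed workload: insert i -> 1/i for i = 1..n; returns the map size. */
    static int fill(int n) {
        Map<Double, Double> m = new HashMap<>();
        for (int i = 1; i <= n; i++) {
            double x = i;
            m.put(x, 1.0 / x);
        }
        return m.size();
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        int size = fill(5_000_000);            // arbitrary size chosen for this sketch
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        long gcMs = totalGcMillis() - gcBefore;
        System.out.println("entries: " + size);
        System.out.println("elapsed: " + elapsedMs + " ms, reported GC time: " + gcMs + " ms");
    }
}
```

`getCollectionTime()` is an approximate elapsed time, not CPU time summed over GC threads, so it cannot by itself account for the user-time figure; it only indicates whether collection dominates the wall-clock time.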