Prev: The future of Java
Next: weird issue with new lines
From: Jon Harrop on 21 Nov 2009 13:33 I'm having trouble getting Java's hash tables to run as fast as .NET's. Specifically, the following program is 32x slower than the equivalent on .NET: import java.util.Hashtable; public class Hashtbl { public static void main(String args[]){ Hashtable hashtable = new Hashtable(); for(int i=1; i<=10000000; ++i) { double x = i; hashtable.put(x, 1.0 / x); } System.out.println("hashtable(100.0) = " + hashtable.get(100.0)); } } My guess is that this is because the JVM is boxing every floating point number individually in the hash table due to type erasure whereas .NET creates a specialized data structure specifically for a float->float hash table with the floats unboxed. Consequently, the JVM is doing enormously numbers of allocations whereas .NET is not. Is that correct? -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?u
From: Tom Anderson on 21 Nov 2009 12:43 On Sat, 21 Nov 2009, Jon Harrop wrote: > I'm having trouble getting Java's hash tables I note you're using java.util.Hashtable. That's an ancient class which has a few things wrong with it: i'd strongly suggest using java.util.HashMap instead. I wouldn't expect it to be significantly faster in this test, though. > to run as fast as .NET's. [...] My guess is that this is because the JVM > is boxing every floating point number individually in the hash table due > to type erasure whereas .NET creates a specialized data structure > specifically for a float->float hash table with the floats unboxed. > Consequently, the JVM is doing enormously numbers of allocations whereas > .NET is not. > > Is that correct? I can't comment on what the CLR is doing in a comparable situation, but that is certainly a reasonable description of what java is doing. Could you try changing the put line to: hashtable.put(Double.toString(x), Double.toString(1.0 / x)); And making the corresponding change in the C# or whatever version, and making the comparison again? That eliminates boxing in java, so if the difference is due to boxing, it will be significantly reduced, which will give you some clues as to what's going on. If you need optimal performance for a double->double map in java, you would need to write one specifically for that case. Or rather, someone would: you could have a google to see if someone already has. Also, since converting doubles to longs is cheap and reversible, you could look for a long->long hashmap, which strikes me as more likely to exist. I didn't find one in two minutes of googling, but i did find some long->Object hashmaps, which get you halfway there. tom -- Oh, well of course *everything* looks bad if you remember it
From: John B. Matthews on 21 Nov 2009 12:54 In article <T9GdnWzZ_sPfvJXWnZ2dnUVZ7vadnZ2d(a)brightview.co.uk>, Jon Harrop <jon(a)ffconsultancy.com> wrote: > I'm having trouble getting Java's hash tables to run as fast as .NET's. > Specifically, the following program is 32x slower than the equivalent > on .NET: > > import java.util.Hashtable; > > public class Hashtbl { > public static void main(String args[]){ > Hashtable hashtable = new Hashtable(); > > for(int i=1; i<=10000000; ++i) { > double x = i; > hashtable.put(x, 1.0 / x); > } > > System.out.println("hashtable(100.0) = " + hashtable.get(100.0)); > } > } > > My guess is that this is because the JVM is boxing every floating point > number individually in the hash table due to type erasure whereas .NET > creates a specialized data structure specifically for a float->float hash > table with the floats unboxed. Consequently, the JVM is doing enormously > numbers of allocations whereas .NET is not. > > Is that correct? In addition to using HashMap, you might also specify an appropriate initialCapacity and loadFactor. You may be seeing a lot of rehash operations. <http://java.sun.com/javase/6/docs/api/java/util/HashMap.html> -- John B. Matthews trashgod at gmail dot com <http://sites.google.com/site/drjohnbmatthews>
From: Arne Vajhøj on 21 Nov 2009 13:17 Jon Harrop wrote: > I'm having trouble getting Java's hash tables to run as fast as .NET's. > Specifically, the following program is 32x slower than the equivalent > on .NET: > > import java.util.Hashtable; > > public class Hashtbl { > public static void main(String args[]){ > Hashtable hashtable = new Hashtable(); > > for(int i=1; i<=10000000; ++i) { > double x = i; > hashtable.put(x, 1.0 / x); > } > > System.out.println("hashtable(100.0) = " + hashtable.get(100.0)); > } > } > > My guess is that this is because the JVM is boxing every floating point > number individually in the hash table due to type erasure whereas .NET > creates a specialized data structure specifically for a float->float hash > table with the floats unboxed. Consequently, the JVM is doing enormously > numbers of allocations whereas .NET is not. > > Is that correct? Yes. And why do you ask when it is so easy to test and verify? Arne
From: Arne Vajhøj on 21 Nov 2009 13:19
Tom Anderson wrote: > Could you try changing the put line to: > > hashtable.put(Double.toString(x), Double.toString(1.0 / x)); > > And making the corresponding change in the C# or whatever version, and > making the comparison again? That eliminates boxing in java, so if the > difference is due to boxing, it will be significantly reduced, which > will give you some clues as to what's going on. I would still consider it boxing - just to String instead of Double. Arne |