From: Arne Vajhøj on 10 Jun 2010 20:08 On 10-06-2010 17:13, Roedy Green wrote: > On Wed, 9 Jun 2010 21:00:08 -0700 (PDT), petek1976 > <pete.karousos(a)earthlink.net> wrote, quoted or indirectly quoted > someone who said : >> I have been having some trouble with the performance of Java. In my >> code I have narrowed it down to the sort: > > another optimisation is to use a native compiler. See Questionable. Today conversion to native is done to make decompilation harder not to improve performance. Arne
From: Arne Vajhøj on 10 Jun 2010 20:10 On 10-06-2010 16:51, Roedy Green wrote: > On Wed, 9 Jun 2010 21:00:08 -0700 (PDT), petek1976 > <pete.karousos(a)earthlink.net> wrote, quoted or indirectly quoted > someone who said : > >> final int vecSize = 1000000; >> ArrayList<Double> vec = new ArrayList<Double>(vecSize); > > This is a fair hunk of RAM to carve out of the heap. Not really. 4 MB / 8 MB is nothing today. Arne
From: Arne Vajhøj on 10 Jun 2010 20:24 On 10-06-2010 00:00, petek1976 wrote: > I have been having some trouble with the performance of Java. In my > code I have narrowed it down to the sort: > > For instance: > > > import java.util.*; > public class SortTest > { > public static void main(String args[]) > { > final int vecSize = 1000000; > ArrayList<Double> vec = new ArrayList<Double>(vecSize); > final int numItr = 10; > ArrayList<Double> times = new ArrayList<Double>(numItr); > for (int i=0; i<numItr; ++i) > { > for (int k=0; k<vecSize; ++k) > { > vec.add(k,Math.random()); > } > long startTime = System.nanoTime(); > java.util.Collections.sort(vec); > long endTime = System.nanoTime(); > times.add(i,(endTime-startTime)*1.e-9); > vec = new ArrayList<Double>(vecSize); > } > double avg=0.0; > for (Double val:times) > { > avg += val; > } > avg /= times.size(); > System.out.println("To sort " + vec.size() + " elements " + > numItr + " times took " + avg + " seconds on average"); > > } > } > > > > This prints about 0.58 seconds on average. How can I optimize this > reasonably? Note I am only timing the sort function. Using an array > instead of ArrayList is not an option. I tried Vector but it didn't > help. The equivalent C++ code (using vector) runs in about 0.37 > seconds when built with optimization (g++ version 4.2.1 using -03 > optimization ). I am running on a MacBook Pro 2.66 GHz Intel Core i7 > with 4 GB of ram. I also tried this example on Linux and saw no > difference. What is causing over 40% difference in speed? I must be > doing something wrong in my java code. > > > #include<iostream> > #include<vector> > #include<cstdlib> > #include<algorithm> > #include<sys/time.h> > #include<numeric> > using namespace std; > > template<class T> > void FillRand(T start, T end) > { > while (start != end) > { > *start++ = double(rand())/RAND_MAX; > } > } > > class TimeStamp > { > public: > void start() > { > gettimeofday(&st_val,0); > } > void stop() > { > gettimeofday(&end_val,0); > } > double elapsedSec() const > { > return end_val.tv_sec-st_val.tv_sec + > (end_val.tv_usec-st_val.tv_usec)*1.e-6; > } > private: > timeval st_val; > timeval end_val; > }; > > int main() > { > vector<double> vec(1000000); > const long num_itr = 10; > vector<double> time_stamps(num_itr); > TimeStamp ts; > for (long i=0; i<num_itr; ++i) > { > FillRand(vec.begin(),vec.end()); > ts.start(); > std::sort(vec.begin(),vec.end()); > ts.stop(); > time_stamps[i] = ts.elapsedSec(); > } > double avg=std::accumulate(time_stamps.begin(),time_stamps.end(), > 0.0)/time_stamps.size(); > cout<< "To sort "<< vec.size()<< " elements took "<< avg<< " > seconds on average\n"; > } It does not sound surprising to me. If we are using C terminology, then: - in Java you have an array of pointers to double - in C++ you have an array of double It sounds very plausible that this will cause a performance difference. Use double[] in Java. Or live with the performance. Arne
From: Thomas Pornin on 11 Jun 2010 04:45 According to Tom Anderson <twic(a)urchin.earth.li>: > In terms of the java vs C++ comparison: > > 1. Java's standard sort has to be stable, which means in practice it's a > mergesort, whereas STL's doesn't, so it can be a quicksort. Quicksort will > typically be faster than mergesort for random data. > > 2. Java handles the doubles as objects on the heap here, whereas C++, i > believe, will handle them as primitives (because it resolves templates at > compile time, essentially, whereas java just erases types and uses the > same code at runtime for all element types). Which means: > > (a) Java's data takes up more memory, and so requires more cache and > memory bandwidth to work with (you have a million objects, so you're > looking at 8 MB just for the raw numbers; the object overhead probably > about quadruples that). > > (b) Java has to do comparisons by making virtual (worse - interface!) > method calls, whereas C++ can just use a single machine instruction. Actually, for all those reasons plus a few others (Java array accesses are checked for boundaries; the JVM cannot optimize code as aggressively because it must work fast), really optimized C++ code ought to be widely faster than Java code. For that kind of work (no I/O, trivial memory allocation) a factor of 2 to 4 is typical. That Java achieves 60% of the C++ speed on this is impressive (it can happen, I have seen it, but not on code dominated by array accesses), or, more accurately, suspicious. I made a few measures on the machine I am currently using: an Intel Core2 Q8200, clocked at 2.33 GHz. GCC-4.3.2, JVM is Sun's 1.6.0_01 (an update is long past due, but that's not the point here). The JVM is launched with "-server". The CPU runs in 32-bit mode. A simplistic C code using qsort() sorts 1000000 random doubles in 0.37s. Java code which uses an ArrayList<Double> and Collections.sort() does the same job (after warm-up) in 0.83s. This looks more like what I would expect (a factor of 2.2). Please note that qsort() uses a function pointer for the comparison, and that function is not inlined in any way. If the OP has Java code which works in 0.58s but the C++ code needs 0.37s on the same machine, then either the C++ code, the C++ STL implementation, or the C++ compiler, is pathetic. The whole idea of templates with inlining is so that the C++ version is twice _faster_ than the old qsort() function, not twice _slower_. --Thomas Pornin
First
|
Prev
|
Pages: 1 2 3 4 5 6 Prev: NetBeans/StrutsTestCase - strange error message Next: encrypted javamail MimeMultipart |