From: petek1976 on
I have been having some trouble with the performance of Java. In my
code I have narrowed it down to the sort:

For instance:


import java.util.*;
public class SortTest
{
public static void main(String args[])
{
final int vecSize = 1000000;
ArrayList<Double> vec = new ArrayList<Double>(vecSize);
final int numItr = 10;
ArrayList<Double> times = new ArrayList<Double>(numItr);
for (int i=0; i<numItr; ++i)
{
for (int k=0; k<vecSize; ++k)
{
vec.add(k,Math.random());
}
long startTime = System.nanoTime();
java.util.Collections.sort(vec);
long endTime = System.nanoTime();
times.add(i,(endTime-startTime)*1.e-9);
vec = new ArrayList<Double>(vecSize);
}
double avg=0.0;
for (Double val:times)
{
avg += val;
}
avg /= times.size();
System.out.println("To sort " + vec.size() + " elements " +
numItr + " times took " + avg + " seconds on average");

}
}



This prints about 0.58 seconds on average. How can I optimize this
reasonably? Note I am only timing the sort function. Using an array
instead of ArrayList is not an option. I tried Vector but it didn't
help. The equivalent C++ code (using vector) runs in about 0.37
seconds when built with optimization (g++ version 4.2.1 using -03
optimization ). I am running on a MacBook Pro 2.66 GHz Intel Core i7
with 4 GB of ram. I also tried this example on Linux and saw no
difference. What is causing over 40% difference in speed? I must be
doing something wrong in my java code.


#include <iostream>
#include <vector>
#include <cstdlib>
#include <algorithm>
#include <sys/time.h>
#include <numeric>
using namespace std;

template <class T>
void FillRand(T start, T end)
{
while (start != end)
{
*start++ = double(rand())/RAND_MAX;
}
}

class TimeStamp
{
public:
void start()
{
gettimeofday(&st_val,0);
}
void stop()
{
gettimeofday(&end_val,0);
}
double elapsedSec() const
{
return end_val.tv_sec-st_val.tv_sec +
(end_val.tv_usec-st_val.tv_usec)*1.e-6;
}
private:
timeval st_val;
timeval end_val;
};

int main()
{
vector<double> vec(1000000);
const long num_itr = 10;
vector<double> time_stamps(num_itr);
TimeStamp ts;
for (long i=0; i<num_itr; ++i)
{
FillRand(vec.begin(),vec.end());
ts.start();
std::sort(vec.begin(),vec.end());
ts.stop();
time_stamps[i] = ts.elapsedSec();
}
double avg=std::accumulate(time_stamps.begin(),time_stamps.end(),
0.0)/time_stamps.size();
cout << "To sort " << vec.size() << " elements took " << avg << "
seconds on average\n";
}
From: markspace on
petek1976 wrote:

> This prints about 0.58 seconds on average. How can I optimize this
> reasonably? Note I am only timing the sort function. Using an array
> instead of ArrayList is not an option.


Unfortunately I think using an array of double is the only option. I
didn't see anything obviously wrong with your code.

I got about a 450% speed-up by switching to double[] instead of
ArrayList<Double>. Why is using double[] proscribed? If it's the need
for a List, you can roll your own.
From: Tom Anderson on
On Wed, 9 Jun 2010, petek1976 wrote:

> This prints about 0.58 seconds on average. How can I optimize this
> reasonably?Note I am only timing the sort function. Using an array
> instead of ArrayList is not an option. I tried Vector but it didn't
> help. The equivalent C++ code (using vector) runs in about 0.37
> seconds

In terms of the java vs C++ comparison:

1. Java's standard sort has to be stable, which means in practice it's a
mergesort, whereas STL's doesn't, so it can be a quicksort. Quicksort will
typically be faster than mergesort for random data.

2. Java handles the doubles as objects on the heap here, whereas C++, i
believe, will handle them as primitives (because it resolves templates at
compile time, essentially, whereas java just erases types and uses the
same code at runtime for all element types). Which means:

(a) Java's data takes up more memory, and so requires more cache and
memory bandwidth to work with (you have a million objects, so you're
looking at 8 MB just for the raw numbers; the object overhead probably
about quadruples that).

(b) Java has to do comparisons by making virtual (worse - interface!)
method calls, whereas C++ can just use a single machine instruction.
Java's runtime compiler stands a good chance of optimising that overhead
away; i have no idea if it will catch none, some, most, or all of it. You
could try dumping the compiled code to see what HotSpot is up to:

http://wikis.sun.com/display/HotSpotInternals/PrintAssembly

If performance is vital but you want a List interface, than as Mark
suggested, write a List<Double> implementation that wraps a double[]. You
can then use Arrays.sort to sort the content - for doubles, this is a
quicksort and uses direct comparisons, so it should be fast. AbstractList
makes this rather easy to implement. You could also try this guy's
DoubleArrayList:

http://pcj.sourceforge.net/

to which you could easily add internal sorting. It would be nice if there
was such a class in the Apache or Google collections libraries, but there
isn't.

tom

--
I'm angry, but not Milk and Cheese angry. -- Mike Froggatt
From: Jim Janney on
petek1976 <pete.karousos(a)earthlink.net> writes:

> I have been having some trouble with the performance of Java. In my
> code I have narrowed it down to the sort:
>
> For instance:
>
>
> import java.util.*;
> public class SortTest
> {
> public static void main(String args[])
> {
> final int vecSize = 1000000;
> ArrayList<Double> vec = new ArrayList<Double>(vecSize);
> final int numItr = 10;
> ArrayList<Double> times = new ArrayList<Double>(numItr);
> for (int i=0; i<numItr; ++i)
> {
> for (int k=0; k<vecSize; ++k)
> {
> vec.add(k,Math.random());
> }
> long startTime = System.nanoTime();
> java.util.Collections.sort(vec);
> long endTime = System.nanoTime();
> times.add(i,(endTime-startTime)*1.e-9);
> vec = new ArrayList<Double>(vecSize);
> }
> double avg=0.0;
> for (Double val:times)
> {
> avg += val;
> }
> avg /= times.size();
> System.out.println("To sort " + vec.size() + " elements " +
> numItr + " times took " + avg + " seconds on average");
>
> }
> }
>
>
>
> This prints about 0.58 seconds on average. How can I optimize this
> reasonably? Note I am only timing the sort function. Using an array
> instead of ArrayList is not an option. I tried Vector but it didn't
> help. The equivalent C++ code (using vector) runs in about 0.37
> seconds when built with optimization (g++ version 4.2.1 using -03
> optimization ). I am running on a MacBook Pro 2.66 GHz Intel Core i7
> with 4 GB of ram. I also tried this example on Linux and saw no
> difference. What is causing over 40% difference in speed? I must be
> doing something wrong in my java code.
>
>
> #include <iostream>
> #include <vector>
> #include <cstdlib>
> #include <algorithm>
> #include <sys/time.h>
> #include <numeric>
> using namespace std;
>
> template <class T>
> void FillRand(T start, T end)
> {
> while (start != end)
> {
> *start++ = double(rand())/RAND_MAX;
> }
> }
>
> class TimeStamp
> {
> public:
> void start()
> {
> gettimeofday(&st_val,0);
> }
> void stop()
> {
> gettimeofday(&end_val,0);
> }
> double elapsedSec() const
> {
> return end_val.tv_sec-st_val.tv_sec +
> (end_val.tv_usec-st_val.tv_usec)*1.e-6;
> }
> private:
> timeval st_val;
> timeval end_val;
> };
>
> int main()
> {
> vector<double> vec(1000000);
> const long num_itr = 10;
> vector<double> time_stamps(num_itr);
> TimeStamp ts;
> for (long i=0; i<num_itr; ++i)
> {
> FillRand(vec.begin(),vec.end());
> ts.start();
> std::sort(vec.begin(),vec.end());
> ts.stop();
> time_stamps[i] = ts.elapsedSec();
> }
> double avg=std::accumulate(time_stamps.begin(),time_stamps.end(),
> 0.0)/time_stamps.size();
> cout << "To sort " << vec.size() << " elements took " << avg << "
> seconds on average\n";
> }

Take a look at the source for Collections.sort:

public static <T extends Comparable<? super T>> void sort(List<T> list) {
Object[] a = list.toArray();
Arrays.sort(a);
ListIterator<T> i = list.listIterator();
for (int j=0; j<a.length; j++) {
i.next();
i.set((T)a[j]);
}
}

You're copying the list to an array, sorting the array, and then
copying the array back to the list. You might try coding up a simple
quicksort that operates directly on the ArrayList to see what kind of
times that gives you.

--
Jim Janney
From: Jim Janney on
Jim Janney <jjanney(a)shell.xmission.com> writes:

> You're copying the list to an array, sorting the array, and then
> copying the array back to the list. You might try coding up a simple
> quicksort that operates directly on the ArrayList to see what kind of
> times that gives you.

The evil approach would be to use reflection to get the elementData
field in ArrayList, and call Arrays.sort directly on that.

--
Jim Janney