From: aminer on

jarto wrote:
>Here's an updated version of my test program. In this one you
>can specify the number of names as a command line argument.
>It also saves the original names and results as text files as long
>as number of names does not exceed 100000.

>http://www.starsoft.fi/jarto/ParallelSortTest2.zip

>With my Dell Latitute E5500 with 2,40GHz P8600 I got this kind of results:

>Cores: 2
>4000000 names sorted in 00:13:360
>Cores: 1
>4000000 names sorted in 00:24:000

>But the biggest concern was, that the result files did not always match.
>One or a few names were not sorted properly in some of the test runs.

http://www.lazarus.freepascal.org/index.php/topic,9359.0.html


As i have noticed , you are using this:

Intel® Pentium® Processor E5500
(2M Cache, 2.80 GHz, 800 MHz FSB)

http://ark.intel.com/Product.aspx?id=42800

It has only *1* L2 cache of 2M.


I have used CPU-Z on my hardware:

http://www.cpuid.com/cpuz.php

and it gives me this:

Level 1 Data: 4 x 34 KBytes 8ways
Level 1 Inst.: 4 x 34 KBytes 8 ways
Level 2: 2 x 4096 KBytes 16 ways

So, mine has *2* L2 caches of 4M

So, when i am quicksorting a big array on 1 core ,
there is much higher cache misses than
on 2 threads/2 cores - cause i am breaking the array
in 4 parts and quicksorting in parallel, also ,
i have 2 x L2 caches of 4M - and you have only one
one on your E5500 - and this explains my results...

>But the biggest concern was, that the result files did not
>always >match. One or a few names were not sorted
>properly in some of the test runs.

As i have noticed, you are using this:

TestParallelSort(10000,1,4); //Tests 10000 names with 1 -> 4 cores.

The number of cores *MUST* be 2^N: 1,2,4 etc.

I have reproduced your problem with a number of cores
equal to 3.

So, you have to avoid *3*, example: 1,2,4,6,8,10,12,14,16 etc.

Try to use a number that is not 3 and that is 2^N,
example: 1,2,4,6,8 ,10,12,14,16 etc.


And you will be safe :)


Welcome: http://pages.videotron.com/aminer/


Sincerely,
Amine Moulay Ramdane.


From: aminer on


jarto wrote:
>Yes, I can reproduce the same problem with ctHeapSort.
> It can also be reproduced with Delphi 7.

http://www.lazarus.freepascal.org/index.php/topic,9359.msg46017/topicseen.html#new


Here it is jarto , i have solved the problem, you can
download version 2.11 from my website:

http://pages.videotron.com/aminer/


Please test it and tell me if all is ok...

Note: the problem was not on my algorithm , it was
very difficult to spot the problem ...


Sincerely,
Amine.






On May 11, 12:40 am, aminer <ami...(a)videotron.ca> wrote:
> jarto wrote:
> >Here's an updated version of my test program. In this one you
> >can specify the number of names as a command line argument.
> >It also saves the original names and results as text files as long
> >as number of names does not exceed 100000.
> >http://www.starsoft.fi/jarto/ParallelSortTest2.zip
> >With my Dell Latitute E5500 with 2,40GHz P8600 I got this kind of results:
> >Cores: 2
> >4000000 names sorted in 00:13:360
> >Cores: 1
> >4000000 names sorted in 00:24:000
> >But the biggest concern was, that the result files did not always match.
> >One or a few names were not sorted properly in some of the test runs.
>
> http://www.lazarus.freepascal.org/index.php/topic,9359.0.html
>
> As i have noticed , you are using this:
>
> Intel® Pentium® Processor E5500
> (2M Cache, 2.80 GHz, 800 MHz FSB)
>
> http://ark.intel.com/Product.aspx?id=42800
>
> It has only *1* L2 cache of 2M.
>
> I have used CPU-Z on my hardware:
>
> http://www.cpuid.com/cpuz.php
>
> and it gives me this:
>
> Level 1 Data: 4 x 34 KBytes  8ways
> Level 1 Inst.: 4 x 34 KBytes  8 ways
> Level 2: 2 x 4096 KBytes  16 ways
>
> So, mine has *2* L2 caches of 4M
>
> So, when i am quicksorting a big array on 1 core ,
> there is much higher cache misses than
> on 2 threads/2 cores - cause i am breaking the array
> in 4 parts and quicksorting in parallel, also ,
> i have 2 x L2 caches of 4M  - and you have only one
> one on your E5500 - and this explains my results...
>
> >But the biggest concern was, that the result files did not
> >always >match. One or a few names were not sorted
> >properly in some of the test runs.
>
> As i have noticed, you are using this:
>
> TestParallelSort(10000,1,4); //Tests 10000 names with 1 -> 4 cores.
>
> The number of cores *MUST* be 2^N: 1,2,4 etc.
>
> I have reproduced your problem with a number of cores
> equal to 3.
>
> So, you have to avoid *3*, example: 1,2,4,6,8,10,12,14,16 etc.
>
> Try to use a number that is not 3 and that is 2^N,
> example: 1,2,4,6,8 ,10,12,14,16 etc.
>
> And you will be safe :)
>
> Welcome:http://pages.videotron.com/aminer/
>
> Sincerely,
> Amine Moulay Ramdane.