From: aminer on

Skybuck wrote in alt.comp.lang.borland-delphi:

>> My Thread Pool Engine is not just an array of threads.
>
> To me it is.


You really don't know what you are talking about..


The principal threat to scalability in concurrent applications
is the exclusive resource lock.

And there are three ways to reduce lock contention:

1- Reduce the duration for which locks are held

2- Reduce the frequency with which locks are requested

or

3- Replace exclusive locks with coordination mechanisms that
permit greater concurrency.


Under low, moderate AND high contention, my ParallelQueue
offers better scalability - and I am using it inside my
Thread Pool Engine.


Because my ParallelQueue uses a hash-based method - lock striping -
and just a LockedInc(), I am REDUCING the duration for which locks
are held AND REDUCING the frequency with which locks are requested,
hence I am REDUCING the contention A LOT, so it is very efficient.
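
To illustrate the lock striping idea, here is a minimal sketch in
Delphi (it is only an illustration, NOT the actual ParallelQueue
source; the stripe count and the shared counters are just
placeholders): the shared data is split into stripes, each stripe is
guarded by its own critical section, and one interlocked increment
picks the stripe, so threads that land on different stripes never
wait on the same lock.

program LockStripingSketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils, SyncObjs;

const
  NumStripes = 16; // power of two, so "and" can replace "mod"

var
  Locks: array[0..NumStripes - 1] of TCriticalSection;
  Counts: array[0..NumStripes - 1] of Integer;
  Ticket: Integer = 0;

// One interlocked increment selects the stripe; only threads that
// hash to the SAME stripe contend on the same critical section, so
// each lock is requested less often and held for a shorter time.
procedure StripedAdd(Value: Integer);
var
  Stripe: Integer;
begin
  Stripe := InterlockedIncrement(Ticket) and (NumStripes - 1);
  Locks[Stripe].Acquire;
  try
    Inc(Counts[Stripe], Value);
  finally
    Locks[Stripe].Release;
  end;
end;

var
  i, Total: Integer;
begin
  for i := 0 to NumStripes - 1 do
    Locks[i] := TCriticalSection.Create;
  try
    for i := 1 to 1000 do
      StripedAdd(1);
    Total := 0;
    for i := 0 to NumStripes - 1 do
      Inc(Total, Counts[i]);
    WriteLn('Total = ', Total); // prints 1000
  finally
    for i := 0 to NumStripes - 1 do
      Locks[i].Free;
  end;
end.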

And as I stated before, this is a law or theorem to apply:

[3] If there is LESS contention THEN the algorithm will
scale better, because S (the serial part) becomes smaller
with less contention, and as N becomes bigger, the result of
Amdahl's equation 1/(S + (P/N)) - the speedup of the
program/algorithm - becomes bigger.
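
For example, with illustrative numbers: if the serial part is S = 0.1
and N = 16 cores, Amdahl's equation gives 1/(0.1 + 0.9/16) = 6.4,
but if less contention brings the serial part down to S = 0.05, the
same 16 cores give 1/(0.05 + 0.95/16), which is about 9.1, so the
program scales better.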


That is why my ParallelQueue has scored 7 million pop()
transactions per second... better than flqueue and RingBuffer.

look at: Http://pages.videotron.com/aminer/parallelqueue/parallelqueue.htm

Also, my Thread Pool uses efficient lock-free queues - for
example the lock-free ParallelQueue - one for each worker thread
- to reduce and minimize the contention - and it uses work-stealing,
so my Thread Pool Engine is very efficient...
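
To show the work-stealing idea, here is a minimal sketch in Delphi
(it is only an illustration, NOT the Thread Pool Engine source; the
queue type, the worker count and the jobs are just placeholders):
each worker owns its own queue, and a worker whose queue is empty
steals a job from another worker's queue instead of sitting idle.

program WorkStealingSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, SyncObjs, Contnrs;

const
  NumWorkers = 4;

var
  Queues: array[0..NumWorkers - 1] of TQueue;          // one job queue per worker
  Locks: array[0..NumWorkers - 1] of TCriticalSection; // one lock per queue

// Pop a job from the worker's own queue; if it is empty, scan the
// other workers' queues and steal one job from the first non-empty
// one, so no worker sits idle while another worker still has jobs.
function PopOrSteal(Worker: Integer): Pointer;
var
  Victim, i: Integer;
begin
  Locks[Worker].Acquire;
  try
    if Queues[Worker].Count > 0 then
    begin
      Result := Queues[Worker].Pop;
      Exit;
    end;
  finally
    Locks[Worker].Release;
  end;
  for i := 1 to NumWorkers - 1 do
  begin
    Victim := (Worker + i) mod NumWorkers;
    Locks[Victim].Acquire;
    try
      if Queues[Victim].Count > 0 then
      begin
        Result := Queues[Victim].Pop;
        Exit;
      end;
    finally
      Locks[Victim].Release;
    end;
  end;
  Result := nil; // no work anywhere
end;

var
  w, i: Integer;
  Job: Pointer;
begin
  for w := 0 to NumWorkers - 1 do
  begin
    Queues[w] := TQueue.Create;
    Locks[w] := TCriticalSection.Create;
  end;
  for i := 1 to 8 do
    Queues[0].Push(Pointer(i)); // all the jobs land on worker 0
  Job := PopOrSteal(3);         // worker 3 still finds work by stealing
  WriteLn('worker 3 stole job ', Integer(Job));
  for w := 0 to NumWorkers - 1 do
  begin
    Queues[w].Free;
    Locks[w].Free;
  end;
end.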

And it eases the work for you - you can 'reuse' the TThreadPool
class... - and it is very useful...


So, don't be stupid, Skybuck...


http://pages.videotron.com/aminer/


Sincerely
Amine Moulay Ramdane


From: aminer on

I wrote:
> Because my ParallelQueue uses a hash-based method - lock striping -
> and just a LockedInc(), I am REDUCING the duration for which locks
> are held AND REDUCING the frequency with which locks are requested,
> hence I am REDUCING the contention A LOT, so it is very efficient.

Under low, moderate AND high contention, my ParallelQueue
offers better scalability - and I am using it inside my
Thread Pool Engine.

And as you have noticed, I am using low to medium contention
in the following test:

http://pages.videotron.com/aminer/parallelqueue/parallelqueue.htm


But I predict that under HIGH contention the push() and pop() will
score even better than that...

Why ?

Because my ParallelQueue uses a hash-based method - lock striping -
and just a LockedInc(), I am REDUCING the duration for which locks
are held AND REDUCING the frequency with which locks are requested,
hence I am REDUCING the contention A LOT, so it is very efficient.


And as I stated before, this is a law or theorem to apply:


[3] If there is LESS contention THEN the algorithm will
scale better, because S (the serial part) becomes smaller
with less contention, and as N becomes bigger, the result of
Amdahl's equation 1/(S + (P/N)) - the speedup of the
program/algorithm - becomes bigger.



Sincerely,
Amine Moulay Ramdane.

From: aminer on

Hello again,


Now, as I have stated before:

[3] If there is LESS contention THEN the algorithm will
scale better, because S (the serial part) becomes smaller
with less contention, and as N becomes bigger, the result of
Amdahl's equation 1/(S + (P/N)) - the speedup of the
program/algorithm - becomes bigger.


And, as you have noticed, I have followed this theorem [3] when
I constructed my Thread Pool Engine, etc...



Now there is another theorem that I can state like this:


[4] You have latency and bandwidth, so, IF you use one or both
of them - latency and bandwidth - efficiently THEN your algorithm
will be more efficient.


That is why you should not start too many threads in my
Thread Pool Engine, so that you will not context switch a lot,
because when you context switch a lot, the latency will grow, and
this is not good for efficiency...
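
As a small illustration of that point, here is a sketch in Delphi
(illustrative only, using the plain Win32 GetSystemInfo call, not
the Thread Pool Engine itself): size the worker count from the
number of processors instead of starting too many threads, so the
workers mostly stay running and the scheduler does not have to
context switch between an excess of ready threads.

program WorkerCountSketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Ask Windows how many processors are available and use that as the
// default worker count for CPU-bound jobs.
function SuggestedWorkerCount: Integer;
var
  Info: TSystemInfo;
begin
  GetSystemInfo(Info);
  Result := Info.dwNumberOfProcessors;
end;

begin
  WriteLn('Suggested number of worker threads: ', SuggestedWorkerCount);
end.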


You have to be smart.


And as I have stated before:

IF you follow and base your reasoning on those theorems
- or laws, or true propositions, or good patterns, like theorems
[1], [2], [3], [4]... -
THEN you will construct a model that will be much more CORRECT
and EFFICIENT.


Take care...


Sincerely,
Amine Moulay Ramdane.

From: aminer on

Hello again,


Sorry for my English, but I will continue to explain - my ideas,
etc. - using logic and reasoning...


As you already know, we have those two notions:


'Time' - we have time because there is movement of matter -


and


'Space'


And we have those two notions that we call 'Correctness' and
'Efficiency'


And, as you have noticed, I have stated the following theorems...


[1] IF your algorithm exhibits much more data parallelism THEN
it will be much more efficient.


[2] IF two or more processes or threads use the same critical
sections THEN they - the processes or threads - must take
them in the same order to avoid deadlock in the system (see the
sketch after this list).


[3] If there is LESS contention THEN the algorithm will
scale better, because S (the serial part) becomes smaller
with less contention, and as N becomes bigger, the result of
Amdahl's equation 1/(S + (P/N)) - the speedup of the
program/algorithm - becomes bigger.


[4] You have latency and bandwidth, so, IF you use one or both
of them - latency and bandwidth - efficiently THEN your algorithm
will be more efficient.


etc.
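
As a small illustration of theorem [2], here is a sketch in Delphi
(illustrative only; the two critical sections and the work inside
them are just placeholders): every thread that needs both critical
sections takes them in the SAME order, A then B, so the circular
wait that causes a deadlock cannot happen.

program LockOrderingSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, SyncObjs;

var
  LockA, LockB: TCriticalSection;
  SharedA, SharedB: Integer;

// EVERY thread that needs both critical sections takes them in the
// same order: LockA first, then LockB.  If one thread took A then B
// and another took B then A, each could end up holding one lock and
// waiting forever for the other - a deadlock.
procedure UpdateBoth;
begin
  LockA.Acquire;
  try
    LockB.Acquire;
    try
      Inc(SharedA);
      Inc(SharedB);
    finally
      LockB.Release;
    end;
  finally
    LockA.Release;
  end;
end;

begin
  LockA := TCriticalSection.Create;
  LockB := TCriticalSection.Create;
  try
    UpdateBoth;
    WriteLn('SharedA = ', SharedA, ', SharedB = ', SharedB);
  finally
    LockB.Free;
    LockA.Free;
  end;
end.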


Why am I calling them theorems?


You can also call them rules, true propositions, laws...


Now I can 'classify' theorem [2] in the set that I call
'correctness', since it states something about correctness...


And theorems [1], [3] and [4] go in the set that I call
'efficiency', since they state something about efficiency.


But you have to be smart now..


As you may have noticed, theorems [2] and [3] are in fact
the same as theorem [4].


But why am I calling them theorems?


You can call them rules, laws... if you want.


And as I have stated before:

IF you follow and base your reasoning on those theorems - or laws,
or true propositions, or good patterns - like rules or theorems
[1], [2], [3], [4]... - THEN you will construct a model that will
be much more CORRECT and EFFICIENT.


It is one of my preferred methodologies in programming.


Sincerely,


Amine Moulay Ramdane.


From: aminer on

Hello,

I am still thinking and using logic...

I can also add the following rules:

[5] IF you are using a critical section or spinlock and there is
high contention on it - with many threads - THEN there is a
possibility of a lock convoy. This is because the thread entering
the spinlock or critical section may be context switched out, which
adds to the service time - and to the S (serial part) of Amdahl's
equation - and this raises the contention and creates the
possibility of a lock convoy and of bad scalability.

We can alleviate the problem in [5] by using a Mutex or a Semaphore
around the critical section or the spinlock...
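
As a small illustration of that remedy, here is a sketch in Delphi
(illustrative only, using the plain Win32 mutex API; the guarded
work is just a placeholder): the contended region is guarded by an
OS mutex, so a thread that cannot enter blocks in the kernel instead
of spinning, which keeps a convoy of spinning waiters from building
up.

program MutexGateSketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

var
  Gate: THandle;            // OS mutex: waiters block in the kernel
  SharedValue: Integer = 0;

// Threads that cannot enter the guarded region block on the mutex
// instead of spinning, so they do not burn CPU time feeding a convoy.
procedure GuardedWork;
begin
  WaitForSingleObject(Gate, INFINITE);
  try
    Inc(SharedValue);       // keep the guarded region short
  finally
    ReleaseMutex(Gate);
  end;
end;

begin
  Gate := CreateMutex(nil, False, nil);
  try
    GuardedWork;
    WriteLn('SharedValue = ', SharedValue);
  finally
    CloseHandle(Gate);
  end;
end.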

Another rule now..


[6] If there is contention on a lock - a critical section... -
and inside the locked section you are doing I/O - for example
logging a message to a file - this will lead the calling thread to
block on the I/O, and the operating system will deschedule
the blocked thread until the I/O completes; thus this situation
will lead to more context switching, and therefore to an increased
service time, and longer service times, in this case, mean
more lock contention, and more lock contention means bad
scalability.
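
As a small illustration of one common way to avoid the situation in
[6], here is a sketch in Delphi (illustrative only; the counter, the
message and the file name are just placeholders): prepare the log
message while holding the lock, but do the slow file I/O only AFTER
the lock has been released, so no thread ever blocks on I/O while it
still owns the critical section.

program LogOutsideLockSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, SyncObjs;

var
  Lock: TCriticalSection;
  Counter: Integer = 0;

procedure DoWorkAndLog;
var
  Msg: string;
  LogFile: TextFile;
begin
  Lock.Acquire;
  try
    Inc(Counter);                    // only fast, in-memory work under the lock
    Msg := Format('counter is now %d', [Counter]);
  finally
    Lock.Release;                    // the lock is released BEFORE the I/O
  end;
  AssignFile(LogFile, 'log.txt');    // the slow file I/O happens outside the lock
  if FileExists('log.txt') then
    Append(LogFile)
  else
    Rewrite(LogFile);
  try
    WriteLn(LogFile, Msg);
  finally
    CloseFile(LogFile);
  end;
end;

begin
  Lock := TCriticalSection.Create;
  try
    DoWorkAndLog;
  finally
    Lock.Free;
  end;
end.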

There is also false sharing, etc.


IF you follow and base your reasoning on those theorems - or laws,
or true propositions, or good patterns - like rules or theorems
[1], [2], [3], [4], [5], [6]... - THEN you will construct a model
that will be much more CORRECT and EFFICIENT.

And it is one of my preferred methodologies in programming.

I will try to add more of those rules, theorems, laws... next
time...



Sincerely,
Amine Moulay Ramdane.