From: Peter C. Chapin on 11 Jul 2010 20:47

As we all know there is currently a lot of "buzz" about parallel programming and about the "crisis in software development" that is associated with it. People using languages with weak support for concurrency are now wondering how they will use multi-core systems effectively and reliably. Of course Ada already has decent support for concurrency, so part of the problem is solved for those of us using Ada.

But not all of the problem...

I see a couple of difficulties with writing effective parallel programs for "ordinary" applications (that is, applications that are not embarrassingly parallel). One difficulty is load balancing: how can one decompose a problem to keep all processors reasonably busy? The other difficulty is scalability: how can one design a single program that can use 2, 4, 16, 128, or more processors effectively without knowing ahead of time exactly how many processors there will be? I'm not an expert in Ada tasking, but it seems like these questions are as big a problem for Ada as they are for any other language environment.

I'm not looking for a solution to all tasking problems here. But there is one feature that seems like a necessary prerequisite to such a solution. The language (or its standard library) needs to provide a portable way for the program to determine how many hardware threads are available.

I'm about to write a simple program that decomposes into parallel, compute-bound tasks quite nicely. How many such tasks should I create? I could ask the user to provide the number as a command line argument or in a configuration file. Yet it seems like the program should just be able to figure it out. Does Ada have a standard way of doing that? I didn't see anything in my (admittedly short) review.

Thanks!

Peter
From: Jeffrey R. Carter on 11 Jul 2010 22:31

On 07/11/2010 05:47 PM, Peter C. Chapin wrote:
> I'm about to write a simple program that decomposes into parallel,
> compute-bound tasks quite nicely. How many such tasks should I create? I
> could ask the user to provide the number as a command line argument or
> in a configuration file. Yet it seems like the program should just be
> able to figure it out. Does Ada have a standard way of doing that? I
> didn't see anything in my (admittedly short) review.

No, it doesn't, and I've long thought it should. GNAT has function System.Task_Info.Number_Of_Processors. But something standard would be better.

--
Jeff Carter
"I feel as though somebody stepped on my tongue with muddy feet."
Never Give a Sucker an Even Break
112
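For illustration, calling that GNAT-specific function might look roughly like the sketch below. System.Task_Info is not part of standard Ada; whether it provides Number_Of_Processors, and with what exact profile, depends on the GNAT run-time for the target, so the conversion to Positive here is an assumption.

   --  Sketch only: assumes a GNAT run-time whose System.Task_Info
   --  provides Number_Of_Processors (not available on every target,
   --  and its return type is assumed here).
   with Ada.Text_IO;
   with System.Task_Info;

   procedure Show_CPU_Count is
      N : constant Positive :=
        Positive (System.Task_Info.Number_Of_Processors);
   begin
      Ada.Text_IO.Put_Line ("Processors:" & Positive'Image (N));
   end Show_CPU_Count;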
From: tmoran on 11 Jul 2010 22:38

> I'm about to write a simple program that decomposes into parallel,
> compute-bound tasks quite nicely. How many such tasks should I create? I
> could ask the user to provide the number as a command line argument or
> in a configuration file. Yet it seems like the program should just be
> able to figure it out. Does Ada have a standard way of doing that?

In MS Windows you can call GetProcessAffinityMask and count the ones in your process's mask.

But knowing how many CPUs there are is just the beginning. Are there other programs running on the machine, and should you leave some CPUs for them, or grab them all for your program? What about cache? It tends to be small, and if you (or your program plus any others running on the system) actually try to use all the CPUs simultaneously, you may thrash. Thread overhead goes up with thread count. The best algorithm may not be the most straightforward.

In the early days of virtual memory some people thought "I have a giant memory, which I can access as I wish". They soon found they needed to consider access patterns or their program could thrash horribly. Similarly for multi-core CPUs.
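In Ada that might look roughly like the sketch below. The thin bindings to GetCurrentProcess and GetProcessAffinityMask are hand-written assumptions (a 64-bit affinity mask, i.e. 64-bit Windows); a real program would use the Win32Ada bindings instead.

   --  Rough sketch, not production code: count the 1 bits in the
   --  process affinity mask on Windows.  The imports below are
   --  hand-written and assume a 64-bit DWORD_PTR.
   with Ada.Text_IO;
   with Interfaces;
   with System;

   procedure Count_Affinity is
      use Interfaces;

      subtype Mask_Type is Unsigned_64;

      function GetCurrentProcess return System.Address;
      pragma Import (Stdcall, GetCurrentProcess, "GetCurrentProcess");

      function GetProcessAffinityMask
        (Process      : System.Address;
         Process_Mask : access Mask_Type;
         System_Mask  : access Mask_Type) return Unsigned_32;
      pragma Import
        (Stdcall, GetProcessAffinityMask, "GetProcessAffinityMask");

      Process_Mask : aliased Mask_Type := 0;
      System_Mask  : aliased Mask_Type := 0;
      Count        : Natural := 0;
   begin
      if GetProcessAffinityMask
           (GetCurrentProcess, Process_Mask'Access, System_Mask'Access) /= 0
      then
         declare
            M : Mask_Type := Process_Mask;
         begin
            while M /= 0 loop                 --  count the set bits
               Count := Count + Natural (M and 1);
               M := Shift_Right (M, 1);
            end loop;
         end;
      end if;
      Ada.Text_IO.Put_Line ("Usable CPUs:" & Natural'Image (Count));
   end Count_Affinity;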
From: Dmitry A. Kazakov on 12 Jul 2010 03:40

On Sun, 11 Jul 2010 20:47:41 -0400, Peter C. Chapin wrote:

> I see a couple of difficulties with writing effective parallel programs
> for "ordinary" applications (that is, applications that are not
> embarrassingly parallel). One difficulty is load balancing: how can one
> decompose a problem to keep all processors reasonably busy? The other
> difficulty is scalability: how can one design a single program that can
> use 2, 4, 16, 128, or more processors effectively without knowing ahead
> of time exactly how many processors there will be? I'm not an expert in
> Ada tasking but it seems like these questions are as big a problem for
> Ada as they are for any other language environment.

Back in the '90s, during the era of transputers, concurrent algorithms were decomposed knowing in advance the number of processors and the topology of the network. (Unlike multi-core CPUs, transputers did not share memory; they communicated over physically connected serial links.) At that time the consensus was that the problem is not solvable in general. So you designed both the algorithm and the topology up front.

> I'm not looking for a solution to all tasking problems here. But there
> is one feature that seems like a necessary prerequisite to such a
> solution. The language (or its standard library) needs to provide a
> portable way for the program to determine how many hardware threads are
> available.

Well, maybe, but I don't think it would bring much, especially because cores normally support multi-tasking. It would be more important for architectures with cores that do not (GPUs etc.). BTW, is a "hardware thread" a core? A processor? An ALU plus an independent memory channel? It is quite difficult to define, and an algorithm's performance may depend heavily on these subtleties. The ARG would say: look, it does not make sense for all platforms, forget it.

> I'm about to write a simple program that decomposes into parallel,
> compute-bound tasks quite nicely. How many such tasks should I create?

Back then it was popular to make this adaptive, i.e. you monitor the performance and adjust the size of the worker thread pool as you go. I remember some articles on this topic, but it was long, long ago...

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
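The non-adaptive core of that worker-pool idea might look roughly like the sketch below: a fixed number of worker tasks pulling jobs from a protected queue. Pool_Size, Work_Item, and the queue capacity are all placeholders; an adaptive version would grow or shrink the pool based on measured throughput.

   --  Minimal, non-adaptive sketch of a worker pool in Ada.  A real
   --  program would size the pool from a CPU count, bound the queue
   --  properly, and do real work in the worker body.
   procedure Pool_Sketch is

      Pool_Size : constant Positive := 4;   --  e.g. from a CPU query
      subtype Work_Item is Integer;         --  placeholder payload

      protected Queue is
         procedure Put (Item : in Work_Item);  --  no capacity check here
         entry Get (Item : out Work_Item; Finished : out Boolean);
         procedure Close;
      private
         Buffer : array (1 .. 100) of Work_Item;
         Count  : Natural := 0;
         Closed : Boolean := False;
      end Queue;

      protected body Queue is
         procedure Put (Item : in Work_Item) is
         begin
            Count := Count + 1;
            Buffer (Count) := Item;
         end Put;

         entry Get (Item : out Work_Item; Finished : out Boolean)
           when Count > 0 or Closed is
         begin
            if Count > 0 then
               Item := Buffer (Count);
               Count := Count - 1;
               Finished := False;
            else
               Item := 0;
               Finished := True;
            end if;
         end Get;

         procedure Close is
         begin
            Closed := True;
         end Close;
      end Queue;

      task type Worker;
      task body Worker is
         Item     : Work_Item;
         Finished : Boolean;
      begin
         loop
            Queue.Get (Item, Finished);
            exit when Finished;
            --  ... do the actual computation on Item here ...
         end loop;
      end Worker;

      Pool : array (1 .. Pool_Size) of Worker;

   begin
      for I in 1 .. 50 loop
         Queue.Put (I);   --  enqueue some dummy work
      end loop;
      Queue.Close;        --  workers drain the queue, then terminate
   end Pool_Sketch;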
From: anon on 12 Jul 2010 06:08

In <4c3a65d7$0$2405$4d3efbfe(a)news.sover.net>, "Peter C. Chapin" <pcc482719(a)gmail.com> writes:

> I'm about to write a simple program that decomposes into parallel,
> compute-bound tasks quite nicely. How many such tasks should I create? I
> could ask the user to provide the number as a command line argument or
> in a configuration file. Yet it seems like the program should just be
> able to figure it out. Does Ada have a standard way of doing that? I
> didn't see anything in my (admittedly short) review.

Normal programs should never need to know how many processors there are, or whether a given processor is on this machine or another, because the job-scheduler subsystem of today's OSes decides which processor each task is placed on and tries to maintain a balanced load. Some of these systems create anywhere from 1 to 32 or more virtual systems out of one or more processor cores, so it might take a little too much OS-specific research and programming for an initial parallel program. Plus, how do you deal with the other OS tasks and subsystems, and the other users' background tasks? So, unless you are designing a system that requires a particular number of processors, I would skip trying to control processor usage for now. Get the program working first.

As for the number of tasks and memory size: the number of tasks depends on your "Algorithm and Memory" requirements. Data memory for each task should be kept at about a system page size to reduce page swapping during task switching. Dividing the total memory requirement by the system page size should give you an idea of the number of tasks that you can use.

Note: back in the 80s and 90s there were a number of languages that would use the "Algorithm and Memory" requirements, as well as other factors, during compilation to build a balanced number of tasks. A lot of these languages were designed to be used on a Cray system with only a small OS task and a few subsystems running.