From: Warren on 25 Mar 2010 13:21

Maciej Sobczak expounded in
news:7794a413-34e9-4340-abcc-a6568246fc38@h18g2000yqo.googlegroups.com:

> On 24 Mar, 17:40, Warren <ve3...(a)gmail.com> wrote:
>
>> Another barrier I see to this is the high cost of
>> starting a new thread and stack space allocation.
>
>> Somehow you gotta make thread startup and shutdown
>> cheaper.
>
> Why?
>
> The problem of startup/shutdown cost and how many cores you have are
> completely orthogonal.
> I see no problem in starting N threads at the initialization time, use
> them throughout the application lifetime and then shut down at the end
> (or never).

Yes, I am aware of that option.

> If your favorite programming model involves lots of short-running
> threads that have to be created and torn down repeatedly, then it has
> no relation to multicore. It is just a bad resource usage pattern.

> Maciej Sobczak * http://www.inspirel.com

That's a rather sweeping statement to make ("bad resource usage
pattern"). Unless there are leaps in language design, I believe
that is what you will mostly get in automatic parallel thread
generation.

As humans we tend to think in sequential steps, and consequently
code things that way. The media seems to suggest that we shouldn't
have to change our mindset to do parallelism (i.e. the compilers
should arrange it for us). Certainly that would make a wish list item.

I don't know much about Intel's hyper-threads, but I believe
it was one approach to doing this (presumably largely without
compiler help).

So I can't buy into your conclusion on that.

Warren
From: Warren on 25 Mar 2010 13:30

Maciej Sobczak expounded in
news:7794a413-34e9-4340-abcc-a6568246fc38@h18g2000yqo.googlegroups.com:

> On 24 Mar, 17:40, Warren <ve3...(a)gmail.com> wrote:
>
>> Another barrier I see to this is the high cost of
>> starting a new thread and stack space allocation.
>
>> Somehow you gotta make thread startup and shutdown
>> cheaper.
>
> Why?
>
> The problem of startup/shutdown cost and how many cores you have are
> completely orthogonal.
> I see no problem in starting N threads at the initialization time, use
> them throughout the application lifetime and then shut down at the end
> (or never)...

I forgot to mention that the disadvantage of this approach is that
you have to "pre-allocate" stack space for each thread (whether
by default amount or by a specific designed amount).

If you use a true cactus stack, this is not an issue. But
with a traditional thread, you can choose stack requirements
at the point of thread creation. Not so, if you create them
all up front.

So there are downsides to this approach.

Warren
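As an illustration of the point about choosing stack requirements at the point of thread creation, here is a minimal sketch in Python (not taken from the thread, which concerns Ada tasks). It assumes a platform where `threading.stack_size` can be changed; the helper name `run_with_stack` and the 512 KiB figure are hypothetical choices made only for this example.

```python
import threading


def run_with_stack(size_bytes, target, *args):
    """Start one thread whose stack size is chosen when it is created.

    threading.stack_size() is a process-wide setting for threads created
    afterwards, so this helper is itself not safe to call concurrently.
    """
    # Set the size for the next thread; the call returns the old setting.
    previous = threading.stack_size(size_bytes)
    try:
        worker = threading.Thread(target=target, args=args)
        worker.start()  # the OS thread (and its stack) is created here
    finally:
        # Restore the earlier setting so later threads are unaffected.
        threading.stack_size(previous)
    return worker


if __name__ == "__main__":
    log = []
    # A thread known to need little stack gets a small one (512 KiB).
    t = run_with_stack(512 * 1024, log.append, "small-stack job finished")
    t.join()
    print(log)
```

With a pool started up front, the equivalent decision has to be made once, before the work each thread will do is known, which is the downside described above.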
From: Dmitry A. Kazakov on 26 Mar 2010 04:19

On Thu, 25 Mar 2010 17:30:05 +0000 (UTC), Warren wrote:

> Maciej Sobczak expounded in news:7794a413-34e9-4340-abcc-a6568246fc38
> @h18g2000yqo.googlegroups.com:
>
>> On 24 Mar, 17:40, Warren <ve3...(a)gmail.com> wrote:
>>
>>> Another barrier I see to this is the high cost of
>>> starting a new thread and stack space allocation.
>>
>>> Somehow you gotta make thread startup and shutdown
>>> cheaper.
>>
>> Why?
>>
>> The problem of startup/shutdown cost and how many cores you have are
>> completely orthogonal.
>> I see no problem in starting N threads at the initialization time, use
>> them throughout the application lifetime and then shut down at the end
>> (or never)...
>
> I forgot to mention that the disadvantage of this approach is that
> you have to "pre-allocate" stack space for each thread (whether
> by default amount or by a specific designed amount).

BTW, if this approach worked for an application, it should also do for the
OS, e.g. why not start all threads for all not yet running processes upon
booting? If that worked, the effective observed startup time of a thread
would be 0, and thus there would be nothing to care about.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Maciej Sobczak on 26 Mar 2010 05:30

On 26 Mar, 09:19, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
wrote:

> BTW, if this approach worked for an application, it should also do for the
> OS,

It is true: obtaining resources up-front requires more careful analysis
of the problem that is being solved and is not always possible.

The difference between application and OS is in the amount of
knowledge about what the software will do, and applications tend to
know more than the OS in this aspect.
That is why it is more realistic to have applications allocating their
resources during the initialization phase than to see that at the OS
level.

I'm not a big fan of programs that allocate and deallocate the same
resource repeatedly - this is an obvious candidate for caching and
object reuse, where the cost of allocation is amortized. Fortunately,
it is not even necessary for user code to do that - think about a
caching memory allocator, there are analogies. And the language
standard does not prevent implementations from reusing physical
threads, if they are used as implementation foundations for tasks.

--
Maciej Sobczak * http://www.inspirel.com

YAMI4 - Messaging Solution for Distributed Systems
http://www.inspirel.com/yami4
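The amortization argument above (create workers once, reuse them for many short jobs) can be sketched as follows. This is only an analogy in Python using the standard `concurrent.futures.ThreadPoolExecutor`, not a statement about how any Ada runtime implements tasks; the function name `run_jobs` and the job and worker counts are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_jobs(n_jobs, n_workers):
    """Run n_jobs short jobs on at most n_workers reused threads.

    Returns the name of the thread that executed each job, so the
    caller can see how few threads were actually created.
    """
    def job(_index):
        # Stand-in for a short-running piece of work.
        return threading.current_thread().name

    # The executor creates at most n_workers threads and keeps them
    # alive until the with-block exits; every job reuses one of them.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(job, range(n_jobs)))


if __name__ == "__main__":
    names = run_jobs(1000, 4)
    print(len(names), "jobs ran on", len(set(names)), "distinct threads")
```

The thread startup cost is paid at most `n_workers` times regardless of how many jobs run, at the price of keeping those threads and their stacks allocated for the lifetime of the pool.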
From: Warren on 26 Mar 2010 15:35

Maciej Sobczak expounded in
news:7b059d0f-791b-4ac9-bf64-c50448ec99f7(a)b30g2000yqd.googlegroups.com:

...
> The difference between application and OS is in the amount of
> knowledge about what the software will do and applications tend to
> know more than OS in this aspect.

Yes.

> That is why it is more realistic to have applications allocating their
> resources during initialization phase than to see that at the OS
> level.

I would generally agree with that, unless the cost of
resource management was cleverly reduced.

> I'm not a big fan of programs that allocate and deallocate the same
> resource repeatedly - this is an obvious candidate for caching and
> object reuse, where the cost of allocation is amortized.

As a general principle this is right. But memory is another
resource that sometimes needs careful management.

With only 1 thread, you have a heap growing up to the
stack and a stack that grows towards the heap. Either
stack or heap can be huge (potentially at least), as long
as both are not huge at the same time (overlapping).

The moment you add 1 [additional] thread, you've now
drawn the line in the sand for the lowest existing
stack, and put a smaller limit on it. This
disadvantage is ok for probably most threaded programs,
but perhaps not for a video rendering program that might
hog resources on both heap and stack sides at differing
times.

In the end, the application programmer must plan this
out, but this is a limitation that I dislike about our
current execution environments. I suppose just
increasing the size of your VM address space
postpones the problem until we hit limits again. ;-)

> Fortunately,
> it is not even necessary for a user code to do that - think about a
> caching memory allocator, there are analogies. And the language
> standard does not prevent implementations from reusing physical
> threads, if they are used as implementation foundations for tasks.

> Maciej Sobczak * http://www.inspirel.com

From an efficiency pov, this is all well and good. But
if you want maximum dynamic allocation of heap+stack, then
you might prefer fewer (if any) pre-allocated threads (implying
additional stacks).

Warren