From: Joseph M. Newcomer on 13 Apr 2010 13:39

See below...
On Mon, 12 Apr 2010 19:46:38 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>message news:4ot6s5lt9a53uocku5ga06pjc5sq2rc4ht(a)4ax.com...
>> See below...
>> On Sun, 11 Apr 2010 19:50:43 -0500, "Peter Olcott"
>> <NoSpam(a)OCR4Screen.com> wrote:
>
>>>> Watch carefully: 1 + 1 = 2. 2 + 2 = 4; 1 / 4 = 0.25.
>>>> Read the third-grade arithmetic that I used to demonstrate
>>>> that a SQMS architecture scales up quite well, and maximizes
>>>
>>>I don't think that it makes sense on a single-core machine,
>>>does it? It is reasonable to postulate that a quad-core
>>>machine might benefit from an adapted design. I will not
>>>have a quad-core machine. If I had a quad-core machine it
>>>might be likely that your SQMS would make sense.
>> ****
>> But it works better on a single-core machine because of
>> (and again, I'm going to violate my Sacred Vows of Secrecy)
>> "time slicing".
>
>So Linux thread time slicing is infinitely superior to Linux
>process time slicing?
****
I do not distinguish between threads in a single process and threads which
are in different processes. This distinction apparently exists only in your
own mind, probably caused by the fact that you have confused the
pseudo-threads library with real threads.
*****
>
>One of my two options for implementing priority scheduling
>was to simply have the OS do it by using nice to set the
>process priority of the process that does the high-priority
>jobs to a number higher than that of the lower-priority jobs.
****
This has system-wide implications, and can interfere with the correct
behavior of every other task the system is managing. This includes your Web
server, and any other process the system is running. You have to be
EXTREMELY careful how you muck around with thread priorities.
				joe
****
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
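A minimal sketch of the nice-based option Peter mentions, under the assumption that the work is split into separate high- and low-priority worker processes; the executable names and the nice value of 10 are made up for illustration and do not come from the thread. Lowering the priority of the non-urgent worker (a positive nice value) is the safer direction; raising priority (a negative nice value) requires root and carries exactly the system-wide risk Joe describes.

    // Sketch only: launch a low-priority worker with a higher nice value
    // (a larger nice number means LOWER scheduling priority). The worker
    // names and the value 10 are hypothetical.
    #include <sys/resource.h>   // setpriority, PRIO_PROCESS
    #include <unistd.h>         // fork, execlp
    #include <cstdio>

    int main()
    {
        if (fork() == 0) {
            // Child: demote itself, then become the low-priority worker.
            if (setpriority(PRIO_PROCESS, 0, 10) != 0)
                perror("setpriority");
            execlp("./low_priority_worker", "low_priority_worker", (char *)0);
            perror("execlp");            // reached only if exec fails
            return 1;
        }
        // Parent: becomes the high-priority worker at the default nice value
        // of 0, so the scheduler favors it whenever both are runnable.
        execlp("./high_priority_worker", "high_priority_worker", (char *)0);
        perror("execlp");
        return 1;
    }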
From: Peter Olcott on 13 Apr 2010 13:44

"Jerry Coffin" <jerryvcoffin(a)yahoo.com> wrote in message
news:MPG.262e563b90cdf67c989875(a)news.sunsite.dk...
> In article <012bc856-a14b-41ad-99a3-ae05393bbfc0
> @z4g2000yqa.googlegroups.com>, sant9442(a)gmail.com says...
>>
>> Good example Jerry. I would have done a few things
>> differently.
>
> The real question is which workload is a more accurate simulation of
> Peter's OCR engine. Mine simulates what Peter has *said* -- that it's
> completely CPU bound. In all honesty, that's probably not correct, so
> yours is probably a more accurate simulation of how it's likely to
> work in reality.

The Task Manager shows a solid 25% on my quad-core, so it seems that
unless the Task Manager is lying, it really is CPU bound.

>
>> Second, I would get the baseline count rates (msecs/count increment)
>> for equal-priority threads, then change the priority of one to
>> compare rate differences.
>
> Good point, and one I intended (but forgot) to mention --
> though I didn't post the code both ways, I actually ran what you
> suggest. If you simply comment out the "SetThreadPriority", the two
> threads, of course, run at the same priority. It probably wouldn't
> hurt, however, to run them once with the same priority, and once
> with the priorities adjusted.
>
> --
> Later,
> Jerry.
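Jerry's code itself is not quoted in this part of the thread, so what follows is only a reconstruction of the experiment he and Hector are discussing: two CPU-bound counting threads, one demoted with SetThreadPriority, with the counts compared after an arbitrary 5-second window. Commenting out the SetThreadPriority call gives the equal-priority baseline.

    // Reconstruction (not Jerry's original code) of the two-thread
    // priority experiment: both threads count as fast as they can; one is
    // demoted with SetThreadPriority.
    #include <windows.h>
    #include <cstdio>

    volatile LONGLONG counts[2] = { 0, 0 };
    volatile bool running = true;

    DWORD WINAPI counter(LPVOID param)
    {
        int index = (int)(INT_PTR)param;
        if (index == 1)   // demote the second thread; comment out for the baseline
            SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_BELOW_NORMAL);
        while (running)
            ++counts[index];
        return 0;
    }

    int main()
    {
        HANDLE threads[2];
        for (int i = 0; i < 2; ++i)
            threads[i] = CreateThread(NULL, 0, counter, (LPVOID)(INT_PTR)i, 0, NULL);

        Sleep(5000);                          // arbitrary 5-second measurement window
        running = false;
        WaitForMultipleObjects(2, threads, TRUE, INFINITE);

        printf("normal priority:       %lld counts\n", (long long)counts[0]);
        printf("below-normal priority: %lld counts\n", (long long)counts[1]);
        return 0;
    }

On a multi-core machine the two threads get separate cores and the counts come out nearly equal regardless of priority; to make the effect visible they have to be confined to a single core, for example with SetThreadAffinityMask.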
From: Joseph M. Newcomer on 13 Apr 2010 13:45

See below....
On Mon, 12 Apr 2010 23:22:21 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>"Jerry Coffin" <jerryvcoffin(a)yahoo.com> wrote in message
>news:MPG.262d7a6da11542c4989872(a)news.sunsite.dk...
>> In article
>> <pYidndO7AuRyI17WnZ2dnUVZ_rednZ2d(a)giganews.com>,
>> NoSpam(a)OCR4Screen.com says...
>>
>> [ ... ]
>>
>>> So Linux thread time slicing is infinitely superior to Linux
>>> process time slicing?
>>
>> Yes, from the viewpoint that something that exists and works (even
>> poorly) is infinitely superior to something that simply doesn't exist
>> at all.
>
>David Schwartz from the Linux/Unix groups seems to disagree.
>I can't post a google link because it doesn't exist in
>google yet. Here is the conversation.
****
I don't see anything here that matters; he explains that there is a very
complex scheduling mechanism (which, apparently, you can completely predict
the behavior of), and he does not mention the effects that playing with
priorities would have on the rest of the system (and if you have a
closed-form analytic model that lets you predict this perfectly, this is
another reason you should enroll in a PhD program, because nobody else has
such a methodology available, so you should get a PhD for being able to
show this). If you do not have such a closed-form solution, you have no
"sound reasoning" to base your decisions on.

What I see below is a detailed handwave on how the Linux scheduler works,
but nothing that is really useful to tell you what priorities to set, or
what will happen if you set them.
****
>
>> Someone told me that a process with a higher priority will
>> almost starve any other process of a lower priority; is this
>> true? If it is true, to what extent is it true?
>
>There are basically two concepts. First, the "priority" of a process
>is a combination of two things. First is the "static priority". That's
>the thing you can set with "nice". Second is the dynamic priority. A
>process that uses up its full timeslices has a very low dynamic
>priority. A process that blocks or yields will tend to have a higher
>dynamic priority. The process' priority for scheduling purposes is the
>static priority, adjusted by the dynamic priority.
>
>If a process becomes ready-to-run, it will generally pre-empt any
>process with a lower scheduling priority. However, if it keeps doing
>so, its dynamic priority will fall. (And then it will only continue to
>pre-empt processes with a lower static priority or processes that burn
>lots of CPU.)
>
>Among processes that are always ready-to-run, the old rule was to give
>the CPU to the highest-priority process that was ready-to-run.
>However, almost no modern operating system follows this rule (unless
>you specifically request it). They generally assign CPU time
>proportionately, giving bigger slices to higher-priority processes.
>
>DS
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
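The behaviour David Schwartz describes for always-runnable processes is easy to observe directly. The sketch below is not from the thread; the nice value of 10 and the 5-second window are arbitrary choices. It forks two CPU-bound children, demotes one, and has each report how many loop iterations it completed.

    // Sketch: two CPU-bound children at different nice values, each
    // counting for roughly 5 seconds of wall-clock time.
    #include <sys/resource.h>
    #include <sys/wait.h>
    #include <unistd.h>
    #include <ctime>
    #include <cstdio>

    static void spin_and_report(int nice_value)
    {
        if (setpriority(PRIO_PROCESS, 0, nice_value) != 0)
            perror("setpriority");

        unsigned long long count = 0;
        time_t end = time(NULL) + 5;          // arbitrary ~5-second window
        while (time(NULL) < end)
            ++count;

        printf("nice %2d: %llu iterations\n", nice_value, count);
    }

    int main()
    {
        const int nice_values[2] = { 0, 10 }; // illustrative values only
        for (int i = 0; i < 2; ++i) {
            if (fork() == 0) {
                spin_and_report(nice_values[i]);
                _exit(0);
            }
        }
        while (wait(NULL) > 0)                // wait for both children to finish
            ;
        return 0;
    }

On a multi-core machine each child gets its own core and the counts come out roughly equal; to see the proportional split, both must share one core (for example by running the program under taskset -c 0). Even then the nice-10 child is not starved; it simply gets a much smaller share, which is Schwartz's point.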
From: Joseph M. Newcomer on 13 Apr 2010 13:47

See below...
On Mon, 12 Apr 2010 19:39:54 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>message news:acs6s59011mhn54fbp4sbbttiegs2t6o4f(a)4ax.com...
>> See below...
>> On Mon, 12 Apr 2010 09:47:29 -0500, "Peter Olcott"
>> <NoSpam(a)OCR4Screen.com> wrote:
>>
>
>> How is a single-core 2-hyperthreaded CPU logically different from a
>> 2-core non-hyperthreaded system? (Hint: the hyperthreaded machine has
>> about 1.3x the performance of a single-core machine, but the
>> dual-processor system has about 1.8x the performance.)
>> But logically, they are identical! The reduction in performance is
>> largely due to cache/TLB issues
>
>There you go with sound reasoning. I didn't know that, but
>the reasoning makes sense.
****
You could have found all this out on your own. It has been known for years,
since the first hyperthreaded machines came out.
				joe
****
>
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
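The 1.3x and 1.8x figures Joe quotes are rules of thumb, but the kind of measurement behind them is easy to sketch: time a fixed amount of CPU-bound work split across 1, 2, and 4 threads and compare elapsed times. The work function below is a stand-in, not Peter's OCR code, and the iteration count is arbitrary.

    // Rough scaling sketch: fixed total work divided among N threads.
    #include <chrono>
    #include <cstdio>
    #include <initializer_list>
    #include <thread>
    #include <vector>

    // Stand-in for CPU-bound work; the volatile sink keeps the loop from
    // being optimized away.
    static void burn(unsigned long long iterations)
    {
        volatile unsigned long long sink = 0;
        for (unsigned long long i = 0; i < iterations; ++i)
            sink += i;
    }

    int main()
    {
        const unsigned long long total_work = 2000000000ULL;  // arbitrary amount

        for (int threads : { 1, 2, 4 }) {
            auto start = std::chrono::steady_clock::now();

            std::vector<std::thread> pool;
            for (int t = 0; t < threads; ++t)
                pool.emplace_back(burn, total_work / threads);
            for (auto &th : pool)
                th.join();

            double secs = std::chrono::duration<double>(
                              std::chrono::steady_clock::now() - start).count();
            printf("%d thread(s): %.2f s\n", threads, secs);
        }
        return 0;
    }

On a single hyperthreaded core the 2-thread run improves by well under 2x because the two hardware threads share the core's execution units and caches; on two real cores it gets much closer to 2x.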
From: Peter Olcott on 13 Apr 2010 14:15

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
news:2b99s5pkl1576hhf5pel2a90of7mka47q6(a)4ax.com...
> See below...
> On Mon, 12 Apr 2010 20:09:08 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>Four different types of jobs, one of these is to have (as
>>much as possible) absolute priority over all the others;
>>every job must be processed in strict FIFO order within its
>>priority. The whole system should be as efficient as possible.
> ****
> How does MQMS make this "as efficient as possible" and still scale to
> cover massive incoming sequences of requests?

First, "as efficient as possible" is a design goal; because of diminishing
returns, a reasonable approximation of this will be good enough.

A maximum of four processes each run at most one job, one from each of the
four queues. If any one or more of these job types is unavailable (its
queue is empty), then the remaining job types each get proportionally more
CPU. The relative scheduling of these four jobs (as much as possible)
provides one job type with approximately absolute priority over the others.
The remaining jobs absorb most of the context-switching overhead.

> ****
>>
>>I don't think that SQMS using threads can do that as well as
>>MQMS using processes because Linux threads are reported to
>>not work as well as Linux processes. I don't know a good way
>>to make SQMS work well with multiple processes. The whole
>>purpose of the MQ is to make communicating with multiple
>>processes simple.
> ***
> You seem to have some hangup where you think threads have to be in the
> same process; I have constantly said threads are threads whether they
> are in the same process or different processes. I just don't feel like
> making this elaborate artificial distinction every time

The semantics and syntax are entirely different, especially on Linux.

>>The cache spatial locality of reference will likely be ruined.
> ****
> Really? You have measurements that indicate this is an actual problem?
> Or did the Tarot cards reveal this? Did you account for the thousands
> of interrupts-per-second that real

The only enormously memory-bandwidth-intensive processes will be the OCR
processes, thus it is very reasonable to tentatively conclude (awaiting
empirical validation) that these could easily screw up each other's cache
locality of reference.

> operating systems deal with and their impact on cache management? Did
> you wonder about what happens when the Web server runs? You get this
> incredibly low-level performance buzzword-lock conditions, and think
> you have a solution based upon nothing at all except what you call
> "sound reasoning" and the rest of us know is merely augury, since we
> know what REALLY happens in REAL operating systems; I guess it is
> because PFOS takes no interrupts when an "important" process is
> running and therefore there can be no cache pollution from the
> interrupt handlers, file system operations, etc. Also, no processing
> of I/O, no handling of the mouse, no handling of network traffic,
> etc., which is going to be REAL interesting when that 3-minute job
> runs...no mouse movement, no keyboard response at all, for 3 minutes,
> then just MAYBE a brief flurry while the next 3-minute job is started,
> then no response for another 3 minutes...
>
> The last I saw a measure on one of my operating systems, it was
> processing 1400 interrupts/second while doing real work.

And how many of these needed constant access to multiple megabytes?
> GET SOME DAMNED DATA AND QUIT FIXATING ON TRIVIA!!!! In the Real World,
> we call decisions based on actual measurement "sound reasoning". We do
> not consider the I Ching, Ouija boards, or Tarot cards as "sound
> reasoning", yet you have provided no evidence you have a clue as to how
> you are making these various decisions. So we can only surmise, based on
> their disconnection from reality, that you are trying some kind of
> psychic methods.
> ****

I cannot test every possible combination (including absurd ones) because
this would take an eternity. I form a hypothesis, then see what the
consequences are if this hypothesis is true; when I find that there are no
significant consequences, there is no need to build a system to verify
whether or not it is true.

In the case where a true hypothesis would have significant consequences, I
also think through the alternatives in case this hypothesis is true. Often
one of these alternatives is better than the original design, so yet again
there is no sense building a system to test the hypothesis.

I do this because changing my mind is many thousands of times less
labor-intensive than changing code. Even building code to test unnecessary
hypotheses is a waste of time. Ultimately, near the end of design, some
alternatives will need to be empirically validated.
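For concreteness, the four-queue, four-process arrangement Peter describes in this post can be sketched as follows. Every concrete detail is an assumption made for illustration: the FIFO paths (assumed to already exist, created with mkfifo), the nice values, and the placeholder where the OCR work would go. One worker process per queue, strict arrival order within each queue, and the OS left to apportion the CPU among the four.

    // Sketch of an MQMS layout: one worker process per job-type queue.
    #include <sys/resource.h>
    #include <sys/wait.h>
    #include <unistd.h>
    #include <cstdio>
    #include <fstream>
    #include <string>

    // Larger nice value == lower priority. Queue 0 keeps nice 0, so under
    // the default scheduler it receives a much larger CPU share than the
    // others when all four are busy (not a hard guarantee of absolute
    // priority).
    static void worker(const char *queue_path, int nice_value)
    {
        if (setpriority(PRIO_PROCESS, 0, nice_value) != 0)
            perror("setpriority");

        std::ifstream queue(queue_path);    // opening the FIFO blocks until a writer appears
        std::string job;
        while (std::getline(queue, job)) {  // strict FIFO: jobs handled in arrival order
            // ... placeholder: run the OCR work for 'job' here ...
            printf("[%s] finished job: %s\n", queue_path, job.c_str());
        }
    }

    int main()
    {
        // Hypothetical queue paths, assumed to have been created with mkfifo.
        const char *queues[4]    = { "/tmp/q_high", "/tmp/q_a", "/tmp/q_b", "/tmp/q_c" };
        const int   nice_vals[4] = { 0, 10, 10, 10 };   // illustrative values only

        for (int i = 0; i < 4; ++i) {
            if (fork() == 0) {
                worker(queues[i], nice_vals[i]);
                _exit(0);
            }
        }
        while (wait(NULL) > 0)      // parent just waits on the four workers
            ;
        return 0;
    }

A real service would loop reopening each FIFO (or keep a persistent writer) so a worker does not exit when its last client closes the queue. Whether the nice-0 worker really gets "approximately absolute priority" is exactly the point David Schwartz makes earlier in the thread: under the default scheduler the nice-10 workers are not starved, they just receive a much smaller proportional share.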