From: Peter Olcott on 10 Apr 2010 10:55

"Jerry Coffin" <jerryvcoffin(a)yahoo.com> wrote in message
news:MPG.262a3ce2db0955b498985d(a)news.sunsite.dk...
> In article <2IydnTUFcvDQ2CLWnZ2dnUVZ_rmdnZ2d(a)giganews.com>,
> NoSpam(a)OCR4Screen.com says...
>
> [ ... ]
>
>> I am going to use four processes for the four different levels of
>> process priority. The High priority jobs will be assigned something
>> like 80% of the CPU (relative to the low priority jobs) and the low
>> priority jobs will each get something like 7% of the CPU. Each of
>> these processes will pull jobs from each of four FIFO queues. Some
>> sort of port access might be better than a named pipe because each
>> port access could be explicitly acknowledged when it occurs. Shoving
>> something in a named pipe does not provide this benefit.
>
> This is a really bad idea. What you almost certainly want is a
> priority queue to hold the incoming tasks, with the "priority" for
> the queue based on a combination of the input priority and the
> arrival time. This will let a new input move ahead of some lower
> priority items as long as they arrived recently enough (for some
> definition of "recently enough"), but also guarantees that a low
> priority task won't sit in the queue indefinitely -- at some point,
> it'll get to the front of the queue and be processed.
>
> This is simple, straightforward to implement, and fairly easy to be
> sure it works correctly. Despite its *seeming* simplicity, the
> method you've outlined is none of the above -- quite the contrary,
> it's a recipe for putting another 10 years (or more) of work into
> getting your process synchronization to work.
>
> --
> Later,
> Jerry.

I don't see why the processes would have to synchronize at all. Each one handles exactly one job, and there is no dependency that I am aware of, either between jobs or on any shared resource.

My other idea was to have the lower priority processes simply yield, putting themselves to sleep whenever any higher priority job arrives.

I have opinions on both sides of this, where each expert thinks that the other's idea is simply stupid without ever bothering to completely explain what is stupid about it. In both cases the local expert said that the other expert's idea was bad because of priority inversion.

When each of these experts has equal credibility, I need a much more detailed explanation as the tie breaker.
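[Editorial note: Jerry's suggestion can be illustrated with a short sketch. This is a minimal, hypothetical C++ example of a single priority queue whose ordering key combines the request's input priority with its arrival time, so an old low-priority request eventually overtakes newer high-priority ones and cannot starve. The Request fields, the 0-3 priority scale, and the MS_PER_LEVEL weighting are assumptions for illustration, not anything specified in the thread.]

    #include <chrono>
    #include <cstdint>
    #include <queue>
    #include <string>
    #include <vector>

    // One incoming request: application-level priority plus arrival timestamp.
    struct Request {
        int priority;            // 0 = lowest .. 3 = highest (assumed scale)
        int64_t arrival_ms;      // milliseconds since a fixed epoch, set at enqueue
        std::string payload;     // the work itself (placeholder)
    };

    // Each priority level is "worth" this many milliseconds of waiting, so a
    // low-priority request that has waited long enough overtakes a newer
    // high-priority one.  Pure tuning knob, not a value from the thread.
    constexpr int64_t MS_PER_LEVEL = 250;

    // Static ordering key: higher priority and earlier arrival both raise it.
    inline int64_t ranking(const Request& r) {
        return r.priority * MS_PER_LEVEL - r.arrival_ms;
    }

    struct RankLess {
        bool operator()(const Request& a, const Request& b) const {
            return ranking(a) < ranking(b);   // priority_queue pops the largest ranking
        }
    };

    using JobQueue = std::priority_queue<Request, std::vector<Request>, RankLess>;

    // Enqueue helper: stamps the arrival time so aging works automatically.
    inline void submit(JobQueue& q, int priority, std::string payload) {
        using namespace std::chrono;
        int64_t now_ms =
            duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count();
        q.push(Request{priority, now_ms, std::move(payload)});
    }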
From: Hector Santos on 10 Apr 2010 14:40

Peter Olcott wrote:

> "Jerry Coffin" <jerryvcoffin(a)yahoo.com> wrote in message
> news:MPG.262a3ce2db0955b498985d(a)news.sunsite.dk...
>
>> [ ... ]
>
> I don't see why the processes would have to synchronize at all. Each
> one handles exactly one job, and there is no dependency that I am
> aware of, either between jobs or on any shared resource.

So we are back to separate processes, which is really the only design you are capable of producing, because:

a) You would have to write pthreads code for Linux, and that's hard.
b) You would have to write pmaps for Linux, and that's hard.

Those are the only two things you need (threads and shared mapped files), whether it is for Windows or Linux. You can't produce them, so you go the only route you can produce.

> My other idea was to have the lower priority processes simply yield,
> putting themselves to sleep when any higher priority job arrives.

And how will it determine this? This is way too hard for you to write.

So why do you keep talking with the experts about synchronization and scaling designs, both practical and ideal, when you have neither the intention nor the capability of implementing them? You have got to be real with yourself. To me, it sticks out as obvious.

> I have got opinions on both sides of this where each expert thinks
> that the other's idea is simply stupid without ever bothering to
> completely explain what is stupid about it. In both cases the local
> expert said that the other expert's idea was bad because of priority
> inversion.
>
> When each of these experts has equal credibility, I must have a lot
> more detailed explanation as the tie breaker.

Oh stop.
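[Editorial note: Hector's "shared mapped files" point can be made concrete with a small sketch. This is a minimal, hypothetical Linux example (not code from the thread) that maps a large read-only data file with mmap(MAP_SHARED, PROT_READ), so several OCR processes share one copy of the pages instead of each loading its own multi-gigabyte copy. The file path is an invented placeholder.]

    #include <fcntl.h>      // open
    #include <sys/mman.h>   // mmap, munmap
    #include <sys/stat.h>   // fstat
    #include <unistd.h>     // close
    #include <cstdio>

    int main() {
        // Hypothetical path to the large read-only OCR meta data file.
        const char* path = "/var/ocr/metadata.bin";

        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); close(fd); return 1; }

        // MAP_SHARED + PROT_READ: every process mapping this file shares the
        // same physical pages, so N processes do not multiply the memory cost.
        void* base = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
        close(fd);  // the mapping stays valid after the descriptor is closed

        const unsigned char* data = static_cast<const unsigned char*>(base);
        std::printf("first byte of shared meta data: %u\n", data[0]);

        munmap(base, st.st_size);
        return 0;
    }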
With the way you mix things up, even experts can be confused as to what the hell you want advice on, and when it is provided in more ways than one, the reality is that it is all for nothing, because at the end of the day you have: pure C code, most likely not written by you, a single process, global variables, not thread safe, and you don't have the programming tenacity to do 1/8 of the things that are theoretically and practically necessary to get the job done.

You may say that this is not the truth. But between you, me, and that white picket fence post, you know it is very close to the truth; otherwise no one in their right mind would be exhibiting the behavior you have shown. Every concept discussed here can be explored by a normal designer/programmer, even by people who have never done it before, as long as they have the tenacity to explore. You have not done that, except when I pushed you hard to try the thread simulator. But that right there showed you don't know how to do threads, so you took the basic non-threaded part of it to test the loading of large arrays. And even at this simple level, you finally learned something about virtual memory on LINUX! So even though you didn't explore this with threads, you got some small piece of it.

You need to roll up your sleeves and explore the next level of ideas necessary. The "experts" gave you everything you needed to know. Implementations can differ, but here, for the most part, everyone was pretty consistent about the overall needs you have. It is you who are in denial of the fact that you can't program these ideas; if you had told people this, they could have given you even more ideas, like I did, for single-process designs.

So we are back to square one, four separate processes. I assume that at least now you know that you have requirements of:

1) Shared mapped file/memory for the 1.5 to 5GB read-only meta data?
2a) Four separate web servers? - 4 separate listening IPs? 4 domains?
2b) One web server (PROXY) with request/response HTTP IPC to 4 processors?
2c) One web server (PROXY) with request/response named-pipe IPC to 4 processors?
3) Handle all this with a 10ms turn-around time?

And that's just a small part, off the top of my head. I know you will say that you can deal with 1.5GB per process and you don't care about the 30-60 second load time. That's fine. But how do you handle the many-threads-to-one-thread queue processing? Can *you* do this with mongoose?

You have a very simple model for work load (ms) per worker:

    TPT = RPT + PPT = 1000*N/TPS

where

    PPT = your processor (OCR) work time per transaction
    RPT = your request work time per transaction (everything except PPT)
    TPT = RPT + PPT, the total work time per transaction
    TPS = your transactions per second
    N   = the number of workers required

You can use this for the request mix as a whole or for each request type individually. I put this into a spreadsheet in a Google doc:

http://spreadsheets.google.com/pub?key=tbvdBhqfu0HK04EjX1MIuEw&output=html

Take a look at the last work load table. This tells you, once you find out what TPT (the total work load) really is, how many machines you need for the target TPS. So if you see that TPT is 80 ms, then you need 8 workers for 100 TPS.

There is no way you can have a PPT with zero RPT work time. You can see that with the 2nd table. Even if you give the RPT a low 10ms under Linux (which you MUST admit is really unrealistic), you will need at least 2 machines for a TPS of 100.

--
HLS
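[Editorial note: Hector's sizing formula is simple enough to check with a few lines of code. This is a minimal sketch; the numbers are the examples used in the post (80 ms at 100 TPS, and 10 ms OCR plus an assumed 10 ms request handling), not measurements.]

    #include <cmath>
    #include <cstdio>

    // Workers needed so that N workers, each busy tpt_ms per transaction,
    // can sustain the target transactions per second:  N = TPT * TPS / 1000.
    int workers_needed(double tpt_ms, double target_tps) {
        return static_cast<int>(std::ceil(tpt_ms * target_tps / 1000.0));
    }

    int main() {
        // 80 ms total work per transaction at 100 TPS -> 8 workers.
        std::printf("TPT=80ms, 100 TPS -> %d workers\n", workers_needed(80.0, 100.0));
        // 10 ms OCR (PPT) + 10 ms request handling (RPT) at 100 TPS -> 2 workers.
        std::printf("TPT=20ms, 100 TPS -> %d workers\n", workers_needed(10.0 + 10.0, 100.0));
        return 0;
    }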
From: Hector Santos on 10 Apr 2010 15:20

Peter Olcott wrote:

> Also I will go ahead and use a transacted database for all persistent
> storage. SQLite is supposed to be good for up to 500 transactions per
> second, and I only need 100. I just have to make sure that the
> reliability caveats that SQLite mentions are covered. I am guessing
> that SQLite might even be smart enough to cover the one efficiency
> aspect that I was concerned about. It may be able to directly seek
> using record number and record size to derive the file byte offset.
> In any case I won't worry about this either.

Take a step back into the balcony.

1) 100 TPS means your *total work load* is 10 ms per request. You will not be able to get this done in 10 ms. If you don't believe this, then you are in for a big, heart-breaking surprise.

2) SQLITE locks the data file, hence the database, during updates, so all new incoming requests will be BLOCKED during updates from requests already in process. That right there gives you more delays and contention issues.

3) SQLITE is a SQL database system. Even though behind the scenes it uses an ISAM/BTREE system, YOU don't have access to it. You might be able to write some hooks using their virtual access API, but I sincerely doubt it; working at the record and BYTE level is off limits, and YOU would BE STUPID to try. You might as well use a pure ISAM file for this. Get another database management API for this. SQLITE3 is not what you want if you need file-level access.

Your problem is that you are stuck with a 100 TPS requirement, which is FAR too much for you. 100 TPS is 6,000 per minute, 360,000 per hour, 2,880,000 per 8-hour work day! You are OUT of your mind if you think your 10-year-outdated, low-demand OCR idea is that desirable today.

The point is not really whether you can reach this, but that it means you need a 10 ms total turn-around time per transaction! And that 10 ms is what you say is the OCR processing time alone - you totally ignored the time required for everything else - INCLUDING the SQL engine - pick one; there is NO WAY you can do all of that in less than one quantum. The hardware interrupts alone will break your 10 ms theory.

So you need to get REAL and stop this fantasy ideal design of 10 ms throughputs.

--
HLS
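[Editorial note: to make the locking point concrete, here is a minimal, hypothetical sketch (not code from the thread) using the public SQLite C API. A write transaction holds the database file's write lock until COMMIT finishes and the journal is synced, so other connections trying to write at the same time get SQLITE_BUSY once the busy timeout expires. The table, file name, and timeout value are invented for illustration.]

    #include <cstdio>
    #include <sqlite3.h>

    int main() {
        sqlite3* db = nullptr;
        if (sqlite3_open("jobs.db", &db) != SQLITE_OK) {          // hypothetical file name
            std::fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
            return 1;
        }

        // Wait up to 100 ms for another writer's lock before failing with SQLITE_BUSY.
        sqlite3_busy_timeout(db, 100);

        char* err = nullptr;
        // BEGIN IMMEDIATE grabs the write lock up front; every other writer
        // (and, while the commit is being flushed, readers too) is blocked or
        // busy until COMMIT completes.
        int rc = sqlite3_exec(db,
            "BEGIN IMMEDIATE;"
            "CREATE TABLE IF NOT EXISTS jobs(id INTEGER PRIMARY KEY, state TEXT);"
            "INSERT INTO jobs(state) VALUES('queued');"
            "COMMIT;",
            nullptr, nullptr, &err);

        if (rc != SQLITE_OK) {
            std::fprintf(stderr, "transaction failed (%d): %s\n", rc, err ? err : "?");
            sqlite3_free(err);
        }
        sqlite3_close(db);
        return 0;
    }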
From: Hector Santos on 10 Apr 2010 15:42

Peter Olcott wrote:

>> Ohh, so you NO LONGER CARE about either power failure or
>> operating system crash?
>
> Hector thinks it's a good idea and you don't think that it is a good
> idea, so I can't go by credibility, because although you have lots of
> experience and a PhD, Hector has more experience and more recent
> experience in this area. Because of this I weight your credibility
> equal with his, thus I need a tie breaker.
>
> The obvious tie breaker (the one that I always count on) is complete
> and sound reasoning. From the sound reasoning that you have provided,
> this would go against your point of view. I know from the SQLite
> design pattern how to make disk writes 100% reliable. I also know all
> about transactions. By knowing these two things a simple binary file
> can be made to protect against power failures.

You're mixing things up.

A shared file was brought up because honestly that is all YOU need, and I emphasize YOU. SQLITE3 is a shared file too, so it can work too. MySQL and other engines require you to go through some connector; it could be ODBC, or you can use an engine-specific API, like MySQL/Connector, which is faster than MySQL/ODBC. But SQLITE3 differs because it is only an API to a file. SQLITE3 does have a 3rd-party ODBC driver, but that somewhat defeats the purpose of using SQLITE3 for isolated applications with single-accessor needs.

You are trying to be a BIG BOY multi-accessor application while using SQLITE3 in a big boy manner - YOU CAN NOT - not without swallowing the delays and blocks that occur when multiple read/write accessors collide, which there is no way you can get around. Check the SQLITE3 user archives: this desire comes up again and again, only to end in the disappointment that the light-weight, free SQLITE3 does not give you BIG BOY power.

SQLITE3 is excellent for a SINGLE applet where it has single-source control of the I/O and there is NO concern about contention, because contention is not part of its design; even where multiple threads are allowed, the blocking delay issues are not a concern.

The idea for you: I said that even if you think SQLITE3 is good enough for you, where the blocking delays are not a concern, then you might as well use a straight shared direct-access ISAM file where you have 100% full control and STILL get record- and byte-level access. SQLITE3 will not give you direct record/byte-level buffered file access - not in the way you are thinking of it from the non-SQL database world. So a shared direct-access file is better for you.

--
HLS
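[Editorial note: what Hector calls a "shared direct access ISAM file" - and what Peter describes as deriving the file byte offset from record number times record size - can be sketched in a few lines. This is a minimal, hypothetical fixed-length-record example using pread/pwrite on Linux; the record layout and file name are assumptions, and it deliberately ignores locking and crash recovery, which the rest of the thread is arguing about.]

    #include <fcntl.h>    // open
    #include <unistd.h>   // pread, pwrite, close
    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    // Hypothetical fixed-length job record; the offset of record N is simply
    // N * sizeof(JobRecord), which is the "seek by record number" idea.
    struct JobRecord {
        uint32_t job_id;
        uint32_t state;        // e.g. 0 = queued, 1 = done (assumed encoding)
        char     result[56];   // pads the record to a fixed 64 bytes
    };

    bool read_record(int fd, uint64_t recno, JobRecord& out) {
        off_t offset = static_cast<off_t>(recno * sizeof(JobRecord));
        return pread(fd, &out, sizeof(out), offset) == static_cast<ssize_t>(sizeof(out));
    }

    bool write_record(int fd, uint64_t recno, const JobRecord& in) {
        off_t offset = static_cast<off_t>(recno * sizeof(JobRecord));
        return pwrite(fd, &in, sizeof(in), offset) == static_cast<ssize_t>(sizeof(in));
    }

    int main() {
        int fd = open("jobs.dat", O_RDWR | O_CREAT, 0644);   // hypothetical data file
        if (fd < 0) { std::perror("open"); return 1; }

        JobRecord r{};
        r.job_id = 42;
        r.state = 0;
        std::strncpy(r.result, "pending", sizeof(r.result) - 1);
        write_record(fd, 42, r);                 // record number doubles as the key here

        JobRecord back{};
        if (read_record(fd, 42, back))
            std::printf("job %u state %u\n", back.job_id, back.state);

        close(fd);
        return 0;
    }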
From: Joseph M. Newcomer on 10 Apr 2010 17:10
See below...

On Fri, 9 Apr 2010 20:36:34 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

> "Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
> news:4lmur51ea3nju0dnl7ms6vcurv9f0q9nlc(a)4ax.com...
>> See below...
>> On Thu, 08 Apr 2010 22:16:14 -0400, Hector Santos
>> <sant9442(a)nospam.gmail.com> wrote:
>>
>>>> Some of the above have fixed queue lengths don't they?
>>>
>>> No, because the question doesn't apply and I doubt you understand it,
>>> because you have a very primitive understanding of queuing concepts.
>>> No matter what is stated, you don't seem to go beyond basic layman
>>> abstract thinking - FIFO. And your idea of how this "simplicity" is
>>> applied is flawed because of the lack of basic understanding.
>> ***
>> Note that I agree absolutely with this! The concept that a fixed-sized
>> queue matters at all shows a total cluelessness.
>
> Bullshit. At least one of the queuing models discards input when queue
> length exceeds some limit.

***
So throwing away information makes sense? Note that using a third-party queue (as might be supplied by a named pipe, for example) does not absolve your design of coping with information loss, no matter who does it. You have to carefully evaluate the policies of every component to make sure your requirements will be satisfied.

Suppose your "growing" queue starts discarding information, e.g., rejecting requests because it is full. What are you going to do? And how does this behavior differ from a queue with a small fixed size (hint: it is the same problem, with the same solution, so queue capacity is no longer an issue)?
joe
****

>>> There were plenty of links where people had issues - even for LINUX
>> ****
>> If you ignore the issue of what happens if either side of the pipe
>> fails, or the operating system crashes. But hey, reliability is not
>> NEARLY as important as having buffer lengths that grow (if this is
>> actually true of linux named pipes). This is part of the Magic
>> Morphing Requirements, where "reliability" got replaced with "pipes
>> that don't have fixed buffer sizes".
>> ****
>
> The other issue is reentrancy. I remember from my MS-DOS (ISR)
> interrupt service routine development that some processes are
> occasionally in states that cannot be interrupted. One of these states
> is file I/O. Now the whole issue of critical sections and other
> locking issues has to be dealt with. A simple FIFO made using a named
> pipe bypasses these issues.

***
By definition, NO process is EVER in a non-interruptible state, in either linux or Windows. File I/O is ABSOLUTELY interruptible in Windows, NO EXCEPTIONS. If you believe otherwise, you are fooling yourself. But it is probably the fact that I have written a book on developing Windows device drivers, and teach courses in it, and have talked to linux device driver writers (who are often my students), that gives me the knowledge to know you are spouting nonsense.

And I should point out that a simple FIFO using a named pipe that has the magical property of growing indefinitely, or the failure mode of discarding data, is NOT the solution to this non-problem. If you don't understand why you have to deal with it, no matter what the FIFO characteristics are, you are not going to end up with an implementation that meets your requirements.
****
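[Editorial note: Joe's point is that whoever owns the queue, the design still needs an explicit policy for the moment the queue cannot accept more work. As a minimal, hypothetical illustration (not anything specified in the thread), here is a bounded job queue whose submit call reports overflow to the caller instead of silently discarding the request, so the web-facing code can return an explicit "server busy" response.]

    #include <condition_variable>
    #include <deque>
    #include <mutex>
    #include <string>

    // A bounded queue: capacity is fixed, and the caller is told when the
    // queue is full, so "what happens on overflow" is an explicit decision
    // (reject the request) rather than something hidden inside a pipe.
    class BoundedJobQueue {
    public:
        explicit BoundedJobQueue(std::size_t capacity) : capacity_(capacity) {}

        // Returns false on overflow; the caller can then reply "503 busy",
        // retry later, or shed load - but it cannot pretend the job was queued.
        bool try_submit(std::string job) {
            std::lock_guard<std::mutex> lock(mutex_);
            if (jobs_.size() >= capacity_) return false;
            jobs_.push_back(std::move(job));
            ready_.notify_one();
            return true;
        }

        // Worker side: block until a job is available, then take it.
        std::string take() {
            std::unique_lock<std::mutex> lock(mutex_);
            ready_.wait(lock, [this] { return !jobs_.empty(); });
            std::string job = std::move(jobs_.front());
            jobs_.pop_front();
            return job;
        }

    private:
        std::size_t capacity_;
        std::deque<std::string> jobs_;
        std::mutex mutex_;
        std::condition_variable ready_;
    };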
>>> For what you want to use it for, my engineering sense based on
>>> experience tells me you will have problems, especially YOU for this
>>> flawed design of yours. Now you have 4 Named Pipes that you have to
>>> manage. Is that under 4 threads? But you are not designing for
>>> threads.
>
> That's right, I discarded threads in favor of processes a long time
> ago.

****
Note that a process IS a thread. It happens to have a private address space, but the threads in those processes behave just like threads in a single process, a point you have obviously missed. You have to manage 4 named pipes with 4 threads; the fact that those threads are in separate processes does not change the nature of the problem, or change the fundamental weaknesses of the design.
*****

>>> the message yes, another no. Is the 1 OCR process going to handle
>>> all four pipes? Or 4 OCR processes? Does each OCR have their own
>>> Web Server? Did you work out how the listening servers will bind
>>> the IPs? Are you using virtual domains? sub-domains? Multi-home IP
>>> machine?
>
> One web server (by its own design) has multiple threads that
> communicate with four OCR processes (not threads) using some form of
> IPC, currently Unix/Linux named pipes.

****
You have missed so many critical issues here that I would be surprised if you could ever get this to work in a satisfactory manner. You are assuming the named pipes have infinite capacity, which they will not. You do not seem to have a plan in place to deal with the inevitable failure that will occur because the pipes do not have infinite capacity. You think a handwave that uses threads in separate processes somehow magically makes the multithreading concurrency issues disappear. They will not.

And your design for scheduling OCR processing sucks. It allows almost no concurrency, thus maximizing latency and severely underutilizing the available computing resources. This is because you are letting your misconceived implementation ideas drive the requirements, instead of writing requirements which allow multiple implementations, almost any one of which, randomly selected, would be better than the implementation you are proposing.
****

>> ****
>> The implementation proposals have so many holes in them that they
>> would be a putter's dream, or possibly a product of Switzerland.
>> This design guarantees maximum conflict for
>
> And yet you continue to fail to point out the nature of these holes
> using sound reasoning. I correctly address the possible reasoning,
> and you simply ignore what I say.

****
I largely ignore what you say because largely what you say is either bad design or completely nonsensical. I have tried to point out the reasoning, such as a multiqueue model failing to maximize concurrency, failing to minimize latency, and failing to utilize all of the computing resources. I have pointed out that a single-queue design has none of these failures, and that you can prevent priority inversion by any number of well-known schemes, which I leave it up to you to do a little basic research on.

You are relying on a queueing mechanism which you assume has properties no queueing system can ever possess (e.g., infinite queue size), while failing to have a plan in place to address what happens on queue overflow; and by hypothesizing that the infinite-queue model is going to be sufficient, you choose ONE implementation which is probably less than ideal in a number of ways, because of something you saw in some NG, or found in a poorly-written 'man' page, or heard about being in WikiPedia.
You have confused a requirements document with an implementation specification; you apparently let your bad ideas about implementation techniques drive your requirements document, which is a poor way to arrive at a design. You have failed to address the problem of failure points, something I've been telling you about for probably a week; instead you either select inherently unreliable mechanisms (linux named pipes, which are reliable ONLY IF no process ever fails and the system never crashes, and that's assuming they work correctly), reject known reliable mechanisms (transacted database systems), or think that certain APIs will magically possess properties they have never had (e.g., pwrite), and I'm supposed to PAY ATTENTION to this blather?

When I start seeing a coherent approach to design, I'll start paying attention.

>> resources and maximum unused resources, but what does maximum
>> resource utilization and minimum response time have to do with the
>> design? It guarantees priority inversion,
>
> Yeah it sure does when you critique your misconception of my design
> instead of the design itself. I use the term PROCESSES and you read
> the term THREADS. Please read what I actually say, not what you
> expect that I will say or have said.

****
When you say something like "I am using a named pipe" and give a set of silly reasons, when you start telling me about MS-DOS file I/O, when you spout nonsense about interruptibility, and when you systematically ignore the complex issues of workflow state management and disaster recovery, I think I have seen what you are saying. Of course, the fact that it is all nonsense anyway doesn't hurt my ability to see that it is not forming a good design. Good designs have certain easily recognized properties.

Since I've had to do realtime programming, I know that the multiqueue approach is actually a very bad one. A single queue with anti-priority-inversion (even a simple scheme such as not scheduling more than K < N long computations, for N the number of service threads) would be a tremendous improvement over the kludge you are proposing, allowing more concurrency and reducing latency, which I would think would be desirable goals.

The interesting thing is that you can write SIMULATIONS of these models and show beyond any shadow of a doubt that your proposed scheme will give some of the worst possible behavior, but that would actually require doing actual measurements, and not immediately deducing it from the Tarot cards or I Ching. Yet a simple closed-form queueing model will also show this is a bad idea! You want solid reasoning? Take a course in realtime scheduling issues! I do not plan to teach one in this forum.
joe
****

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
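[Editorial note: Joe's "simple" anti-priority-inversion scheme - never letting more than K of the N service threads run long computations at once - can be sketched roughly as follows. This is a minimal, hypothetical illustration of that one idea; the classification of a job as "long" and the usage pattern are invented, not a design from the thread.]

    #include <condition_variable>
    #include <mutex>

    // Admission control for N worker threads: at most K of them may be busy
    // with a "long" job at any moment, so short/high-priority jobs always have
    // at least N-K workers available and cannot be starved behind long work.
    class LongJobGate {
    public:
        explicit LongJobGate(int max_long_jobs) : k_(max_long_jobs) {}

        // Called by a worker before starting a job it has classified as long;
        // blocks while K long jobs are already running.
        void acquire_long_slot() {
            std::unique_lock<std::mutex> lock(mutex_);
            slot_free_.wait(lock, [this] { return running_long_ < k_; });
            ++running_long_;
        }

        // Called by the worker when the long job finishes.
        void release_long_slot() {
            {
                std::lock_guard<std::mutex> lock(mutex_);
                --running_long_;
            }
            slot_free_.notify_one();
        }

    private:
        const int k_;             // K: cap on concurrent long computations
        int running_long_ = 0;
        std::mutex mutex_;
        std::condition_variable slot_free_;
    };

    // Sketch of use inside a worker loop (job classification is assumed):
    //   if (job.is_long) { gate.acquire_long_slot(); run(job); gate.release_long_slot(); }
    //   else             { run(job); }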