From: Geoff on
On Sat, 20 Mar 2010 09:52:33 -0500, "Peter Olcott"
<NoSpam(a)OCR4Screen.com> wrote:

>Maximum total processing time is 1/10 second for a whole
>page of text. My initial implementation (for testing
>purposes) may simply refuse larger requests. The final
>implementation will place large requests in a separate lower
>priority queue.
>

Your "memory bandwidth intensive" requirement is the bottleneck to
multithreading or multiprocessing. If your big memory chunk is
read-only, your problem with the DFA is that it lacks locality of
reference to that data. You end up hitting the RAM instead of being
able to utilize the data in the CPU caches. Multiple threads end up
contending with each other for access to RAM memory, hence the
slowdown. Compute-intensive applications benefit from multi-threading
by being able to stay off the RAM bus and utilize the caches in each
core.

If a single thread can only do 10 pages per second then that will be
the maximum steady-state throughput of your server. You cannot avoid a
queue for the service. The problem is that web servers can typically
service 10,000 pages per second or more.

A straight FIFO queue would probably be the proper method of
service: one "page" at a time per customer. Multi-page
customers would be queued one page at a time, and the
intermix of sessions would allow each customer to experience
the same delays at peak times.
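One way to realize the per-page intermix described above is a round-robin drain over per-customer page queues. This is a sketch under invented names, not anything proposed verbatim in the thread:

```cpp
#include <deque>
#include <map>
#include <utility>

// Per-customer page queues drained round-robin: a 1-page job never sits
// behind an entire 30-page job, so everyone sees comparable delays.
struct PageScheduler {
    std::map<int, std::deque<int>> pending;   // customer id -> pending pages
    std::deque<int> turn;                     // customers waiting for service

    void submit(int customer, int pages) {
        bool was_idle = pending[customer].empty();
        for (int p = 1; p <= pages; ++p) pending[customer].push_back(p);
        if (was_idle) turn.push_back(customer);
    }

    // Pops the next unit of work as {customer, page}.
    std::pair<int, int> next() {
        int c = turn.front(); turn.pop_front();
        int p = pending[c].front(); pending[c].pop_front();
        if (!pending[c].empty()) turn.push_back(c);  // back of the line
        return {c, p};
    }

    bool empty() const { return turn.empty(); }
};
```

If customer 1 submits 3 pages and customer 2 then submits 1 page, the service order is (1,1), (2,1), (1,2), (1,3) rather than all of customer 1's pages first.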

Your application as presently specified is essentially asking
customers to share a 10 page per second scanner. Definitely not
scalable as a web application but possibly very useful as an office
appliance or a black box with a TCP/IP interface in an office
environment.
From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:urQtUiGyKHA.2012(a)TK2MSFTNGP04.phx.gbl...
> Peter Olcott wrote:
>
>> I am thinking that the web server part will spawn a
>> thread for each request, and append this request at the
>> end of a FIFO queue, then this thread dies. My OCR
>> process would read requests from the head of the queue,
>> and process them one at a time in a single thread.
>
> Terrible!
>
> I would like to hear what the client will be doing waiting
> a response where each one will have a linear incremental
> delay factor as the queue builds.
>
> --
> HLS

Each request takes a maximum of 100 ms. When I reach an
average of more than one request per second over the 8-hour
weekday, I will get another server. For scalability I will
eventually have clusters of servers.
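The design quoted above - many mongoose threads appending to one FIFO, a single OCR thread reading from its head - is a standard producer/consumer queue. A minimal C++11 sketch with invented names:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Many web-server threads call push(); the single OCR thread loops on
// pop(). The condition variable lets the OCR thread sleep when idle.
class RequestQueue {
    std::queue<std::string> q_;
    std::mutex m_;
    std::condition_variable cv_;
public:
    void push(std::string req) {              // producer: any mongoose thread
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(req)); }
        cv_.notify_one();
    }

    std::string pop() {                       // consumer: the one OCR thread
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        std::string req = std::move(q_.front());
        q_.pop();
        return req;
    }
};
```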


From: Peter Olcott on

"Geoff" <geoff(a)invalid.invalid> wrote in message
news:nq9aq51ofu8gpovunu40jf0dq6s8fhd5ps(a)4ax.com...
> On Sat, 20 Mar 2010 09:52:33 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>Maximum total processing time is 1/10 second for a whole
>>page of text. My initial implementation (for testing
>>purposes) may simply refuse larger requests. The final
>>implementation will place large requests in a separate
>>lower priority queue.
>>
>
> Your "memory bandwidth intensive" requirement is the
> bottleneck to multithreading or multiprocessing. If your
> big memory chunk is read-only, your problem with the DFA is
> that it lacks locality of reference to that data. You end
> up hitting the RAM instead of being able to utilize the
> data in the CPU caches. Multiple threads end up contending
> with each other for access to RAM memory, hence the
> slowdown. Compute-intensive applications benefit from
> multi-threading by being able to stay off the RAM bus and
> utilize the caches in each core.

Yes, that is what I was saying.

>
> If a single thread can only do 10 pages per second then
> that will be the maximum steady-state throughput of your
> server. You cannot avoid a queue for the service. The
> problem is that web servers can typically service 10,000
> pages per second or more.
>
> A straight FIFO queue would probably be the proper method
> of service: one "page" at a time per customer. Multi-page
> customers would be queued one page at a time, and the
> intermix of sessions would allow each customer to
> experience the same delays at peak times.
>
> Your application as presently specified is essentially
> asking customers to share a 10 page per second scanner.
> Definitely not scalable as a web application but possibly
> very useful as an office appliance or a black box with a
> TCP/IP interface in an office environment.

My business model provides for adding another server
whenever my average peak period (daytime weekday) load rises
above one request per second.


From: Hector Santos on
Peter Olcott wrote:

>>> My OCR process has the following requirements:
>>> (1) Must be a single thread on a machine dedicated to OCR
>>> processing.
>> ****
>> That's where Hector & I are questioning the basic
>> assumption of single-threadedness. He says that mongoose
>> requires a multithread-capable algorithm. Since your DFA
>> is static once loaded, it requires no locking to access
>> it. So why is your code not multithread-ready already?
>
> Mongoose will use its multiple threads to append to the end
> of a single FIFO queue. The OCR will read from the head of
> this queue.


We understand that is what you want to do. But it doesn't answer the
question - WHY? Why does the 2nd request have to WAIT twice as long,
the 3rd request, three times as long, and so on?

Look, if you don't want to SCALE UP - then SCALE OUT, get 3-4 machines
each dedicated to your SINGLE THREAD PROCESS and have the MONGOOSE
thread connect to each.
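The scale-out Hector suggests amounts to a front-end dispatcher that rotates requests across the dedicated OCR boxes. A minimal sketch, with invented host names and no failure handling:

```cpp
#include <atomic>
#include <cstddef>
#include <string>
#include <vector>

// Round-robin front end: each backend runs its own single-threaded OCR
// process; the web-server thread hands each incoming request to the
// next machine in rotation. The atomic counter keeps the rotation
// correct even when many mongoose threads call pick() concurrently.
class Dispatcher {
    std::vector<std::string> backends_;       // e.g. "host:port" per OCR box
    std::atomic<std::size_t> next_{0};
public:
    explicit Dispatcher(std::vector<std::string> b) : backends_(std::move(b)) {}

    const std::string& pick() {               // next backend in rotation
        return backends_[next_.fetch_add(1) % backends_.size()];
    }
};
```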

> If I get more than an average of one request per second, I
> will get another server. My process does not improve with
> multiple cores, but it does get faster with faster memory:
> 800% faster memory at the same core speed provided an 800%
> increase in speed. Two processes on a quad core resulted in
> half the speed for each.


Again, this has nothing to do with your machine; it has to do with the
poor design of your software - it was designed for a DOS, one process
at a time, SINGLE CPU machine!

--
HLS
From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:uZeQMvGyKHA.2012(a)TK2MSFTNGP04.phx.gbl...
> Peter Olcott wrote:
>
>>>> My OCR process has the following requirements:
>>>> (1) Must be a single thread on a machine dedicated to
>>>> OCR processing.
>>> ****
>>> That's where Hector & I are questioning the basic
>>> assumption of single-threadedness. He says that mongoose
>>> requires a multithread-capable algorithm. Since your DFA
>>> is static once loaded, it requires no locking to access
>>> it. So why is your code not multithread-ready already?
>>
>> Mongoose will use its multiple threads to append to the
>> end of a single FIFO queue. The OCR will read from the
>> head of this queue.
>
>
> We understand that is what you want to do. But it doesn't
> answer the question - WHY? Why does the 2nd request have
> to WAIT twice as long, the 3rd request, three times as
> long, and so on?

Geoff has explained this better than I have.
When my average peak period (daytime M-F) request rate rises
above 1/10 of my capacity, I get another server. In the
ideal case these extra servers form a cluster of N servers.
Also in the ideal case the cluster of servers will be within
some proximity of the customer, such as East Coast, West
Coast, Central, and Europe.

>
> Look, if you don't want to SCALE UP - then SCALE OUT, get
> 3-4 machines each dedicated to your SINGLE THREAD PROCESS
> and have the MONGOOSE thread connect to each.
>
>> If I get more than an average of one request per second,
>> I will get another server. My process does not improve
>> with multiple cores, but it does get faster with faster
>> memory: 800% faster memory at the same core speed
>> provided an 800% increase in speed. Two processes on a
>> quad core resulted in half the speed for each.
>
>
> Again, this has nothing to do with your machine; it has to
> do with the poor design of your software - it was designed
> for a DOS, one process at a time, SINGLE CPU machine!
>
> --
> HLS