From: Hector Santos on
Peter Olcott wrote:

>> I don't see where anybody who has a clue has proposed either of
>> these -- both of them are fairly poor. The idea of four separate
>> queues is pointless and stupid. As both Joe and I have pointed out,
>> what you want is a priority queue. With four separate queues, race
>> conditions are almost inevitable -- for example, you check the
>> top-priority queue for a job first and find it empty, so you check
>> the second priority and then the third, and (just for the sake of
>> argument, we'll assume you find a job there and start it) -- but you
>> didn't notice that while you were doing the other checking, a job
>> was inserted into the top-priority queue, so you end up doing a
>> lower-priority job ahead of a higher-priority one.
>
> One of four different processes only checks its own single
> queue.
>
> I think that it only seems stupid because you did not
> understand what I am saying. That may be my fault; I may not
> have explained it well enough.
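
For illustration, the single priority queue being recommended above
can be a handful of lists behind one lock: because dequeuing scans
the levels while holding the same lock the enqueuers use, a
high-priority job can never slip in unnoticed between the checks.
This is only a minimal sketch assuming POSIX threads in a single
process; the names and the four-level layout are illustrative, not
anyone's actual code. A cross-process version would need the same
structure placed in shared memory with process-shared mutex and
condition-variable attributes.

/* one lock, one condition variable, four priority levels */
#include <pthread.h>
#include <stddef.h>

#define NUM_LEVELS 4                    /* 0 = highest priority */

typedef struct Job { struct Job *next; /* ... payload ... */ } Job;

static Job *head[NUM_LEVELS], *tail[NUM_LEVELS];
static pthread_mutex_t lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

void enqueue(Job *job, int level)
{
    pthread_mutex_lock(&lock);
    job->next = NULL;
    if (tail[level])
        tail[level]->next = job;
    else
        head[level] = job;
    tail[level] = job;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
}

Job *dequeue(void)                      /* blocks until a job exists */
{
    pthread_mutex_lock(&lock);
    for (;;) {
        for (int i = 0; i < NUM_LEVELS; i++) {
            if (head[i]) {
                Job *job = head[i];
                head[i] = job->next;
                if (head[i] == NULL)
                    tail[i] = NULL;
                pthread_mutex_unlock(&lock);
                return job;
            }
        }
        pthread_cond_wait(&nonempty, &lock);    /* rechecked under lock */
    }
}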


Nobody understands you. That IS your fault, and your
explanations get worse each time.

> Alternative (a) There are four processes with four queues
> one for each process. These processes only care about
> executing the jobs from their own queue. They don't care
> about the jobs in any other queue. The high priority process
> is given a relative process priority that equates to 80% of
> the CPU time of these four processes. The remaining three
> processes get about 7% each. This might degrade the
> performance of the high priority jobs more than the next
> alternative.
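
As an aside, if the 80%/7%/7%/7% split in alternative (a) were
attempted with plain nice values on Linux (an assumption; the post
does not say which mechanism is meant), it would look roughly like
the sketch below. Under the kernel's CFS weight table each nice
step is about a 1.25x weight change, so a gap of roughly 11 nice
levels between the high-priority process and the other three
approximates that ratio, and only while all four processes are
CPU-bound; it is a scheduling bias, not a guarantee.

/* hypothetical: each of the three low-priority OCR processes calls
   this at startup, leaving the high-priority process at nice 0 */
#include <stdio.h>
#include <sys/resource.h>

int lower_my_priority(void)
{
    if (setpriority(PRIO_PROCESS, 0, 11) != 0) {  /* 0 = this process */
        perror("setpriority");
        return -1;
    }
    return 0;
}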


So how do your HTTP requests get delegated? Four separate IP
addresses, subdomains?

free.peter.com
1penny.peter.com
nickle.peter.com
peso.peter.com

What happens when cross-domain attempts occur?


> Alternative (b) each of the low priority jobs checks to see
> if a high priority job is in the queue or is notified by a
> signal that a high priority job is waiting. If a high
> priority job is waiting then each of these low priority jobs
> immediately sleeps for a fixed duration. As soon as they
> wake up these jobs check to see if they should go back to
> sleep or wake up.


So we have a store and forward model again!

> These processes could even simply poll a shared memory
> location that contains the number of high priority jobs
> currently in the queue.


Polling!? Oh my gawd - he used the P word!

> From what the hardware guys have
> told me memory writes and reads can not possibly garble each
> other.


Right, two different memory locations can not possibly overwrite each
other.

> Because of this, the shared memory location would not
> even need to be locked. One writer and three readers.


Right, append at the bottom, read from the top. As long as you never
have to delete from the queue, you should be fine.

But you have 4 possible writers. Or do you? Yes, hmmm? Yesterday you
did, today, you don't, or is that vice versa?

> I already figured out a way around that. Everyone must have
> their own user account that must be created by a live human.
> All users are always authenticated against this user
> account. I don't see any loopholes in this one single form
> of protection.


Cross domains? A nickel.peter.com user tries to get free.peter.com
stuff. A peso.peter.com tries to get 1penny.peter.com stuff.

IDEA: Require that users must have STATIC IP addresses. No dynamic IP
customers. That way you can delegate and firewall your users to one
of the four Web servers.

--
HLS
From: Hector Santos on
Joseph M. Newcomer wrote:

> See below...
> On Sat, 10 Apr 2010 15:20:51 -0400, Hector Santos <sant9442(a)nospam.gmail.com> wrote:
>
>> Peter Olcott wrote:
>>
>>
>>> Also I will go ahead and use a transacted database for all
>>> persistent storage. SQLite is supposed to be good for up to
>>> 500 transactions per second, and I only need 100. I just
>>> have to make sure that the reliability caveats that SQLite
>>> mentions are covered. I am guessing that SQLite might even
>>> be smart enough to cover the one efficiency aspect that I
>>> was concerned about. It may be able to directly seek using
>>> record number and record size to derive the file byte
>>> offset. In any case I won't worry about this either.
>> Take a step back into the balcony.
>>
>> 1) 100 TPS means your *total work load* is 10 ms per request
>>
>> You will not be able to get this done in 10 ms. If you don't believe
>> this, then you are in for a big heart-breaking surprise.
>>
>> 2) SQLITE locks the data file, hence the database, during updates, so
>> all new incoming requests will be BLOCKED by updates from requests
>> already in progress. That right there gives you more delays and
>> contention issues.
>>
>> 3) SQLITE is a SQL database system. Even though behind the scenes it
>> uses an ISAM/BTREE system, YOU don't have access to this. You might be
>> able to write some hooks using their virtual access API, but working
>> at the record and BYTE level is effectively prohibited, and YOU would
>> BE STUPID to try. You might as well use a pure ISAM file for this.
>> Get another database management API for this. SQLITE3 is not what you
>> want if you need file-level access.
>>
>> Your problem is that you are stuck with 100 TPS, which is FAR too
>> much for you.
>>
>> 100 TPS is 6,000 per minute, 360,000 per hour, 2,880,000 per 8-hour
>> work day! You are OUT of your mind if you think your 10-year-outdated,
>> low-desired OCR idea is that desirable today.
> ****
> This is the problem with many business plans: overoptimistic revenue projections.


He's now got it down to 50 TPS: 3,000 per minute, 180,000 per hour,
1,440,000 per work day, after he was told that he can't do his 10 ms
OCR with 0 ms OVERHEAD :)

>
> Frankly, I think the whole thing is being overengineered to handle the anticipated flood
> of usage that will never materialize, or will not be sustained independent of the response
> time. We did not even worry about performance of our server manager (which used
> transacted databases) until a customer came to us with a specific request they made as a
> condition of sale, specifically, being able to handle 400 tpm. Once I demonstrated that,
> even when I saturated my then-10-base-T network, I could handle 1300 tpm, we had a sale;
> also, that was the first need to actually have measured performance (in the past, the
> transacted database overhead was not even noticeable, and it was fast enough for practical
> server farms). Perhaps I should redo the experiment now that the entire office network
> backbone is 1Gb copper. But it doesn't matter.


Exactly. And remember he is basing all this on serialized equal work
loading - one request right after another. No concurrency or skewed
load distribution.

--
HLS
From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:%23MD7TBc2KHA.1016(a)TK2MSFTNGP02.phx.gbl...
> Peter Olcott wrote:
>
>>> I don't see where anybody who has a clue has proposed either of
>>> these -- both of them are fairly poor. The idea of four separate
>>> queues is pointless and stupid. As both Joe and I have pointed
>>> out, what you want is a priority queue. With four separate
>>> queues, race conditions are almost inevitable -- for example,
>>> you check the top-priority queue for a job first and find it
>>> empty, so you check the second priority and then the third, and
>>> (just for the sake of argument, we'll assume you find a job
>>> there and start it) -- but you didn't notice that while you were
>>> doing the other checking, a job was inserted into the
>>> top-priority queue, so you end up doing a lower-priority job
>>> ahead of a higher-priority one.
>>
>> One of four different processes only checks its own
>> single queue.
>>
>> I think that it only seems stupid because you did not
>> understand what I am saying. That may be my fault; I may
>> not have explained it well enough.
>
>
> Nobody understands you. That IS your fault, and your
> explanations get worse each time.
>
>> Alternative (a) There are four processes with four queues
>> one for each process. These processes only care about
>> executing the jobs from their own queue. They don't care
>> about the jobs in any other queue. The high priority
>> process is given a relative process priority that equates
>> to 80% of the CPU time of these four processes. The
>> remaining three processes get about 7% each. This might
>> degrade the performance of the high priority jobs more
>> than the next alternative.
>
>
> So how do your HTTP requests get delegated? Four separate
> IP addresses, subdomains?
>
> free.peter.com
> 1penny.peter.com
> nickle.peter.com
> peso.peter.com
>
> What happens when cross-domain attempts occur?

I would not be using the complex design that you are
referring to.
One domain, one web server, four OCR processes.

>
>
>> Alternative (b) each of the low priority jobs checks to
>> see if a high priority job is in the queue or is notified
>> by a signal that a high priority job is waiting. If a
>> high priority job is waiting then each of these low
>> priority jobs immediately sleeps for a fixed duration. As
>> soon as they wake up these jobs check to see if they
>> should go back to sleep or wake up.
>
>
> So we have a store and forward model again!
>
>> These processes could even simply poll a shared memory
>> location that contains the number of high priority jobs
>> currently in the queue.
>
>
> Polling!? Oh my gawd - he used the P word!
>
>> From what the hardware guys have told me memory writes
>> and reads can not possibly garble each other.
>
>
> Right, two different memory locations can not possibly
> overwrite each other.

One writer of a single unsigned 32-bit integer at a fixed
shared memory location and three readers.
if (NumberOfHighPriorityJobs != 0)
nanosleep(20);
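
For what it is worth, nanosleep() takes a struct timespec rather
than a plain integer, so a compilable version of that polling idea
would look something like the sketch below. The counter name and
the 20 (read here as 20 ms) come from the post above; everything
else is illustrative, and the pointer is assumed to map a
shared-memory segment that only the web server writes.

#include <stdint.h>
#include <time.h>

/* mapped from shared memory; one writer, three readers */
extern volatile uint32_t *NumberOfHighPriorityJobs;

void yield_to_high_priority(void)
{
    struct timespec delay = { 0, 20 * 1000 * 1000 };    /* 20 ms */
    while (*NumberOfHighPriorityJobs != 0)
        nanosleep(&delay, NULL);    /* back off while work is waiting */
}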

>
>> Because of this, the shared memory location would not
>> even need to be locked. One writer and three readers.
>
>
> Right, append at the bottom, read from the top. As long as
> you never have to delete from the queue, you should be fine.
>
> But you have 4 possible writers. Or do you? Yes, hmmm?
> Yesterday you did, today, you don't, or is that vice
> versa?
>
>> I already figured out a way around that. Everyone must
>> have their own user account that must be created by a
>> live human. All users are always authenticated against
>> this user account. I don't see any loopholes in this one
>> single form of protection.
>
>
> Cross domains? A nickel.peter.com user tries to get
> free.peter.com stuff. A peso.peter.com tries to get
> 1penny.peter.com stuff.
>
> IDEA: Require that users must have STATIC IP addresses.
> No dynamic IP customers. That way you can delegate and
> firewall your users to one of the four Web servers.
>
> --
> HLS


From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:O9Ba8Bc2KHA.1016(a)TK2MSFTNGP02.phx.gbl...
> Joseph M. Newcomer wrote:
>
>> See below...
>> On Sat, 10 Apr 2010 15:20:51 -0400, Hector Santos
>> <sant9442(a)nospam.gmail.com> wrote:
>>
>>> Peter Olcott wrote:
>>>
>>>
>>>> Also I will go ahead and use a transacted database for
>>>> all persistent storage. SQLite is supposed to be good
>>>> for up to 500 transactions per second, and I only need
>>>> 100. I just have to make sure that the reliability
>>>> caveats that SQLite mentions are covered. I am
>>>> guessing that SQLite might even be smart enough to
>>>> cover the one efficiency aspect that I was concerned
>>>> about. It may be able to directly seek using record
>>>> number and record size to derive the file byte offset.
>>>> In any case I won't worry about this either.
>>> Take a step back into the balcony.
>>>
>>> 1) 100 TPS means your *total work load* is 10 ms per
>>> request
>>>
>>> You will not be able to get this done in 10 ms. If you
>>> don't believe this, then you are in for a big
>>> heart-breaking surprise.
>>>
>>> 2) SQLITE locks the data file, hence the database, during
>>> updates, so all new incoming requests will be BLOCKED by
>>> updates from requests already in progress. That right
>>> there gives you more delays and contention issues.
>>>
>>> 3) SQLITE is a SQL database system. Even though behind
>>> the scenes it uses an ISAM/BTREE system, YOU don't have
>>> access to this. You might be able to write some hooks
>>> using their virtual access API, but working at the record
>>> and BYTE level is effectively prohibited, and YOU would BE
>>> STUPID to try. You might as well use a pure ISAM file
>>> for this. Get another database management API for this.
>>> SQLITE3 is not what you want if you need file-level
>>> access.
>>>
>>> Your problem is that you are stuck with 100 TPS, which
>>> is FAR too much for you.
>>>
>>> 100 TPS is 6,000 per minute, 360,000 per hour, 2,880,000
>>> per 8-hour work day! You are OUT of your mind if you
>>> think your 10-year-outdated, low-desired OCR idea is that
>>> desirable today.
>> ****
>> This is the problem with many business plans:
>> overoptimistic revenue projections.
>
>
> He's now got it down to 50 TPS: 3,000 per minute, 180,000
> per hour, 1,440,000 per work day, after he was told that he
> can't do his 10 ms OCR with 0 ms OVERHEAD :)

Bullshit again. I already said that I estimated the overhead
to not exceed 10 ms. You stupidly read this to mean that I
said that the overhead will take 0 ms. I did say that my
goal is to get as close to zero time as possible, and this
might still be possible because of hyperthreading. The
worst-case scenario is that it will double my processing time.

If you weren't so damn rude I would not be so harsh.
Although I could have written what I said more clearly, it
is not my fault that you got it wrong. All you had to do
to get it right was to read exactly what I said.

>
>>
>> Frankly, I think the whole thing is being overengineered
>> to handle the anticipated flood of usage that will never
>> materialize, or will not be sustained independent of the
>> response time. We did not even worry about performance of
>> our server manager (which used transacted databases) until
>> a customer came to us with a specific request they made as
>> a condition of sale, specifically, being able to handle
>> 400 tpm. Once I demonstrated that, even when I saturated
>> my then-10-base-T network, I could handle 1300 tpm, we had
>> a sale; also, that was the first need to actually have
>> measured performance (in the past, the transacted database
>> overhead was not even noticeable, and it was fast enough
>> for practical server farms). Perhaps I should redo the
>> experiment now that the entire office network backbone is
>> 1Gb copper. But it doesn't matter.
>
>
> Exactly. And remember he is basing all this on serialized
> equal work loading - one request right after another. No
> concurrency or skewed load distribution.
>
> --
> HLS


From: Hector Santos on
Peter Olcott wrote:

>> So how do your HTTP requests get delegated? Four separate
>> IP addresses, subdomains?
>>
>> free.peter.com
>> 1penny.peter.com
>> nickle.peter.com
>> peso.peter.com
>>
>> What happens when cross-domain attempts occur?
>
> I would not be using the complex design that you are
> referring to. One domain, one web server, four OCR processes.


So you are back to a many-to-one FIFO queue. And what happens with
the HTTP responses?

>>
>> Right, two different memory locations can not possibly
>> overwrite each other.
>
> One writer of a single unsigned 32-bit integer at a fixed
> shared memory location and three readers.
> if (NumberOfHighPriorityJobs != 0)
> nanosleep(20);


But the web server needs to do a +1 and one of the OCR processes has
to do a -1.

No conflicts, no reader/writer locks? No interlocked increments and
decrements?
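
For reference, the +1/-1 being asked about here is a
read-modify-write from two different processes (the web server
increments, an OCR process decrements), so it does need to be
atomic even though the three polling readers only do plain loads
of an aligned 32-bit value. A minimal sketch using GCC-style
builtins, purely illustrative and not code from either poster;
InterlockedIncrement/InterlockedDecrement would be the Win32
counterpart:

#include <stdint.h>

/* same shared-memory counter the readers poll */
extern volatile uint32_t *NumberOfHighPriorityJobs;

void high_priority_job_added(void)      /* web server side */
{
    __sync_fetch_and_add(NumberOfHighPriorityJobs, 1);
}

void high_priority_job_taken(void)      /* OCR process side */
{
    __sync_fetch_and_sub(NumberOfHighPriorityJobs, 1);
}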


--
HLS