From: Moi on
On Sat, 03 Apr 2010 21:25:52 -0500, Peter Olcott wrote:

> "Eric Sosman" <esosman(a)ieee-dot-org.invalid> wrote in message
> news:hp8flb$oa9$1(a)news.eternal-september.org...
>> On 4/3/2010 6:10 PM, Peter Olcott wrote:
>>> [...]
>>> So then it is not very good for event driven operations? What I am
>>> looking for is some sort of callback mechanism that can notify me when
>>> input is available.
>>
>> Most Unix things wait on "state" rather than on "events."
>> This is usually viewed as a Good Thing, because (examples):
>>
>> - Suppose the event occurs a moment before anyone waits
>> for it. Lacking infinite buffer space, the kernel just
>> throws the event away -- and *then* somebody waits, and
>> waits, and waits, because the train has already left.
>>
>> - Suppose the event callback takes a little longer than
>> expected (gets a page fault, say), and another event arrives
>> while the first is still being processed. What now?
>> A second simultaneous callback? Buffer the event
>> (in that infinite memory) and call again later?
>>
>> That said, event-based disciplines do have their place,
>> mostly in real-time programming. But for "ordinary user-land
>> stuff," you'll probably find state friendlier than events.
>
> I will be receiving up to 100 web requests per second (that is the
> maximum capacity) and I want to begin processing them as soon as they
> arrive hopefully without polling for them. If I have to I will poll
> every 10 ms, in a thread that sleeps for 10 ms.

No. If there is nothing to read, your process is blocked,
either in read() or in select() / poll().
There is no real difference, except that select()/poll() let you
limit the time you are blocked, so you can do other useful things in
between the reads.
A plain blocking read() waits indefinitely until data (or EOF) arrives.

Blocking does not cost CPU. Your process is just stuck in a system call
that has not returned yet.
So it will cost you a thread (which has nothing useful to do anyway).

HTH,
AvK

From: David Schwartz on
On Apr 4, 3:29 am, Moi <r...(a)invalid.address.org> wrote:

> No. If there is  nothing to read, your process is blocked.
> Either in read() or in select() / poll.
> There is no real difference; except for select/poll to allow you to
> limit the time you are blocked to do other useful things in between the reads.
> Read() blocks forever.

Well, also with 'select' or 'poll', you can wait for more than one
file descriptor to become ready.

> Blocking does not cost CPU. Your process is just stuck in a systemcall
> that has not returned yet.

Well, it does have CPU overhead, it just scales with the number of
events you wait for rather than with how long you wait.

For example, if you call 'poll' to check for readiness of, say, 100
file descriptors, the kernel will have to put your process on 100 wait
queues, and then when any one of those descriptors comes ready (or the
time you specified runs out), it will have to remove you from all 100
wait queues. This can cost significant CPU.
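
As a rough illustration of waiting on several descriptors at once, here
is a minimal sketch using poll(). The function name read_first_ready()
and the MAX_FDS limit are made up for this example; only poll() and
read() are real system interfaces.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_FDS 64  /* arbitrary limit chosen for this sketch */

/*
 * Sleep in the kernel until one of fds[0..nfds-1] is readable or
 * timeout_ms elapses (a negative timeout waits indefinitely), then
 * read from the first ready descriptor.
 *
 * Returns the byte count from read() (0 means end of file) and stores
 * the descriptor in *ready_fd; returns -1 on timeout and -2 on error.
 */
ssize_t read_first_ready(const int *fds, int nfds, int timeout_ms,
                         char *buf, size_t len, int *ready_fd)
{
    struct pollfd pfds[MAX_FDS];
    int i, n;

    if (nfds <= 0 || nfds > MAX_FDS)
        return -2;
    for (i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    n = poll(pfds, (nfds_t)nfds, timeout_ms);  /* the thread sleeps here */
    if (n < 0)
        return -2;      /* includes interruption by a signal (EINTR) */
    if (n == 0)
        return -1;      /* timed out, nothing became ready */

    for (i = 0; i < nfds; i++) {
        if (pfds[i].revents & (POLLIN | POLLHUP)) {
            ssize_t got = read(fds[i], buf, len);
            *ready_fd = fds[i];
            return got < 0 ? -2 : got;
        }
    }
    return -2;  /* only error conditions such as POLLERR or POLLNVAL */
}
```

The per-call setup loop over all descriptors is the part that scales
with the number of descriptors rather than with the waiting time.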

DS
From: David Given on
On 04/04/10 03:42, Peter Olcott wrote:
[...]
>> {
>> set_up_file_descriptors();
>> for (;;)
>> {
>> int fd = wait_for_file_descriptor_to_change_state();
>> do_something_with(fd);
>> }
>> }
>
> That looks like it would eat up too much CPU time for my CPU
> intensive process.

I think there's a misunderstanding in how this works ---
wait_for_file_descriptor_to_change_state() *blocks* until a file
descriptor changes state. That is, it uses no CPU whatsoever. This isn't
a busy loop, no polling is done.

(In general, this approach is the *most* efficient way of handling this
--- this is the mechanism that the very fast webservers like thttpd use.)
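
To make the loop concrete, here is one possible way to fill in the two
placeholder functions for a single pipe, using select(). This is only a
sketch: the bodies are guesses at what such functions might look like,
and do_something_with() merely counts bytes instead of doing real work.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for real request handling: counts the bytes. */
static long bytes_handled;

static void do_something_with(const char *data, ssize_t len)
{
    (void)data;
    bytes_handled += (long)len;
}

/*
 * Blocks in select() until fd is readable (or at end of file).
 * Returns 0 on success, -1 on error. fd must be below FD_SETSIZE.
 */
static int wait_for_file_descriptor_to_change_state(int fd)
{
    fd_set readable;

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    /* NULL timeout: sleep in the kernel until something happens. */
    return select(fd + 1, &readable, NULL, NULL, NULL) < 0 ? -1 : 0;
}

/*
 * Runs the loop until every writer has closed the pipe.
 * Returns the number of bytes handled, or -1 on error.
 */
long event_loop(int fd)
{
    char buf[4096];

    bytes_handled = 0;
    for (;;) {
        ssize_t n;

        if (wait_for_file_descriptor_to_change_state(fd) < 0)
            return -1;
        n = read(fd, buf, sizeof buf);
        if (n < 0)
            return -1;
        if (n == 0)
            break;              /* end of file: no writers remain */
        do_something_with(buf, n);
    }
    return bytes_handled;
}
```

Between requests the process sits inside select() and is not scheduled
at all, so there is no loop spinning in user space.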

[...]
> Perhaps I could wait for the
> file size to grow?

I think you're going to have to go into more details about what your
problem actually is. This is a pipe; it has no file size. What are you
trying to do?

--
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────

│ "In the beginning was the word.
│ And the word was: Content-type: text/plain" --- Unknown sage
From: Jens Thoms Toerring on
In comp.unix.programmer Peter Olcott <NoSpam(a)ocr4screen.com> wrote:

> "David Given" <dg(a)cowlark.com> wrote in message
> news:hp8gt5$38u$1(a)news.eternal-september.org...
> > On 03/04/10 23:10, Peter Olcott wrote:
> > [...]
> >> So then it is not very good for event driven operations?
> >> What I am looking for is some sort of callback mechanism
> >> that can notify me when input is available.
> >
> > Well, you can tell the kernel to send you a signal when the file
> > descriptor changes state --- see fcntl(F_SETFL, O_ASYNC).
> > Normally this is SIGIO but some Unixes (like Linux) allow you to
> > specify any signal.
> >
> > But, as I said, in my experience it's not very reliable, it's
> > certainly not portable, and it's not actually terribly useful
> > given how little you can do in a signal handler.
> >
> > The normal approach for doing this sort of thing is to structure
> > your program like this:
> >
> > {
> > set_up_file_descriptors();
> > for (;;)
> > {
> > int fd = wait_for_file_descriptor_to_change_state();
> > do_something_with(fd);
> > }
> > }

> That looks like it would eat up too much CPU time for my CPU
> intensive process.

What would eat up CPU time with that approach? If you use
poll() or select() for waiting for the file descriptor to
change then the thread will use nearly no CPU time at all
while waiting - all that's needed is to do the poll/select
system call. While there's nothing to read your thread will
sleep, doing absolutely nothing. So the response time is as
short as the machine can manage while the CPU consumption is
nearly zero. Perhaps the name of the poll() function is
giving you the wrong idea - it doesn't poll in the sense that
it would check things again and again in a tight
loop; instead the kernel will put your process to sleep and
only wake it up again when the file becomes readable (unless
you also requested a timeout). And the kernel doesn't have to
check for this repeatedly since it also manages all writes to
the file, so when it does a write to a file it just has to
check if there are any processes waiting for the file to
become readable and then mark these processes for rescheduling
at the earliest possible moment.
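
One way to convince yourself of this is to measure the CPU time your
process consumes while it sits in poll(). The following is a small
sketch (the function name is invented for this example): it waits on a
pipe that never receives data and compares clock() before and after.

```c
#include <poll.h>
#include <time.h>
#include <unistd.h>

/*
 * Blocks in poll() for timeout_ms on a pipe that never receives data
 * and returns the CPU time (in seconds) the process consumed meanwhile,
 * or -1.0 if the pipe could not be created.
 */
double cpu_seconds_while_blocked(int timeout_ms)
{
    int p[2];
    struct pollfd pfd;
    clock_t before, after;

    if (pipe(p) < 0)
        return -1.0;
    pfd.fd = p[0];
    pfd.events = POLLIN;
    pfd.revents = 0;

    before = clock();                 /* processor time used so far */
    (void)poll(&pfd, 1, timeout_ms);  /* sleeps until the timeout */
    after = clock();

    close(p[0]);
    close(p[1]);
    return (double)(after - before) / CLOCKS_PER_SEC;
}
```

With a timeout of a few hundred milliseconds the returned value should
be a small fraction of the wall-clock wait, since clock() counts only
time the process actually spent on the CPU.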

> > This gives you the event-driven model you're looking for, but in
> > a much more controlled manner.

Perhaps changing the name of the function in the example code
from wait_for_file_descriptor_to_change_state() to something
like wait_for_file_event() would make it even clearer.

And if you already have some kind of event-driven program
then you could make a thread out of the above function, where
you have the "do_something_with(fd);" part send an event to
the event loop. The CPU cost of that should be minimal,
combined with the fastest possible response time.

> A general approach that I know would work would be to do the
> same sort of thing in a thread that sleeps for 10 ms on every
> iteration. That way I get 10 ms response and eat very little CPU.

Well, you could make the file non-blocking and then do a loop
in which you try to read and then sleep for the 10 ms, using
e.g. nanosleep(). Or instead of making the file non-blocking
you also could use poll/select for checking, with the timeout
set so that the function returns immediately. But that will
probably use more CPU time than the approach above since you
are now actually polling in the classical sense of the word
instead of just having the kernel notify you once something
has changed about the file.
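
For comparison, the sleep-and-retry variant might look roughly like the
sketch below. The function name read_by_polling() and the max_tries
parameter are invented here; the 10 ms interval is the one from your
post.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Classical polling: switch fd to non-blocking mode, then try to read
 * and sleep 10 ms between attempts, at most max_tries times.
 *
 * Returns the byte count from read() (0 means end of file), -1 on
 * error, or -2 if nothing arrived within max_tries attempts.
 */
ssize_t read_by_polling(int fd, char *buf, size_t len, int max_tries)
{
    const struct timespec ten_ms = { 0, 10 * 1000 * 1000 };
    int flags = fcntl(fd, F_GETFL);
    int i;

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    for (i = 0; i < max_tries; i++) {
        ssize_t n = read(fd, buf, len);

        if (n >= 0)
            return n;                   /* data or end of file */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                  /* a real error */
        nanosleep(&ten_ms, NULL);       /* nothing yet: wake up later */
    }
    return -2;
}
```

Every wakeup that finds nothing to read is wasted work, and a request
that arrives just after a failed attempt waits for up to the full sleep
interval before it is noticed.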

> I am guessing that pthreads can sleep for 10 ms.

In principle yes, but you shouldn't rely on it being exactly
10 ms. If you go to sleep you can only specify a lower bound;
it may take longer if other tasks with higher priority require
the CPU. And the time resolution depends on how your machine
is configured; nowadays it's typically around a millisecond,
but be prepared for worse.
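
If you want to see how long a sleep really takes on your machine, a
quick measurement along these lines may help. It is only a sketch, and
measured_sleep_ms() is a name made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Sleeps for requested_ms using nanosleep() and returns the elapsed
 * wall-clock time in milliseconds, measured on the monotonic clock.
 * Returns -1.0 on failure, including interruption by a signal.
 */
double measured_sleep_ms(long requested_ms)
{
    struct timespec req, start, end;

    req.tv_sec = requested_ms / 1000;
    req.tv_nsec = (requested_ms % 1000) * 1000000L;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    if (nanosleep(&req, NULL) != 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return (end.tv_sec - start.tv_sec) * 1000.0
         + (end.tv_nsec - start.tv_nsec) / 1e6;
}
```

Calling measured_sleep_ms(10) a few times under load shows how far
above the requested 10 ms the actual delay can drift.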

> Perhaps I could wait for the file size to grow?

Sounds to me like the most costly way to do it...

Regards, Jens
--
\ Jens Thoms Toerring ___ jt(a)toerring.de
\__________________________ http://toerring.de
From: Peter Olcott on

"Mark Hobley" <markhobley(a)hotpop.donottypethisbit.com> wrote
in message news:v4nl87-ula.ln1(a)neptune.markhobley.yi.org...
> In comp.unix.programmer Peter Olcott
> <NoSpam(a)ocr4screen.com> wrote:
>> I will be receiving up to 100 web requests per second (that
>> is the maximum capacity) and I want to begin processing them
>> as soon as they arrive hopefully without polling for them.
>
> That will happen. If 100 web requests come down the pipe,
> the receiving process will get them.
>
> It is only when no requests come down the pipe that the
> receiving process will have to wait for a request to come in.
> This is no big deal.
>
> Mark.
>
> --
> Mark Hobley
> Linux User: #370818 http://markhobley.yi.org/
>

So it is in an infinite loop eating up all of the CPU time
only when there are no requests to process? I don't think
that I want this either because I will have two priorities
of requests. If there are no high priority requests I want
it to begin working on the low priority requests.