IPC based on named pipe FIFO and transaction log file [Unix Programming]

From: David Schwartz on 1 Apr 2010 19:51

On Apr 1, 4:23 pm, "Peter Olcott" <NoS...(a)OCR4Screen.com> wrote:

> --Appends are not guaranteed atomic. So each writer would have to have
> --its own transaction log file or you'd need some separate mechanism to
> --lock them.

> You may be correct, but, if you are then two different
> editions of Advanced Programming in the Unix Environment
> would be incorrect:

> First Edition Chapter 3 Section 3.11 Atomic Operations page
> 60-61 Appending to a File
> "Unix provides an atomic way to do this operation if we set
> the O_APPEND flag when a file is opened."

> Second Edition Chapter 3 Section 3.11 Atomic Operations page
> 74 Appending to a File
> "Unix provides an atomic way to do this operation if we set
> the O_APPEND flag when a file is opened."

You are confusing two different notions of atomicity. Sorry I wasn't
clearer. A write to a file opened with O_APPEND is atomic in the sense
that the file position pointer will not move between the notional seek
and the write. So if two processes each append an "A", this can't
happen:

1) Process 1 seeks to the end.
2) Process 2 seeks to the end.
3) Process 1 writes an A.
4) Process 2 writes an A on top of the first A.

The net effect is only one 'A'. That can't happen.

However, the issue is whether they're atomic in the sense that the
write operation itself cannot be interrupted by another file-
modification operation.

The standards say: "If the O_APPEND flag of the file status flags is
set, the file offset shall be set to the end of the file prior to each
write and no intervening file modification operation shall occur
between changing the file offset and the write operation."

This seems to guarantee the former atomicity but not the latter. But
your implementation would require the latter. If there's anything that
guarantees what you need, I'm not aware of it. I had hashed this out
in the past but am unable to recall for sure the final resolution. I
believe it was that it is not formally guaranteed, but that the
guarantee provided by the standard would be all but useless without
it.
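
To make the two notions concrete, here is a minimal sketch of the append path under discussion. The record layout and the names `txn_record`, `open_log` and `append_record` are assumptions made for illustration; the thread only says the records are fixed-length binary. The sketch relies on O_APPEND for the seek-plus-write guarantee quoted above, and treats anything other than a complete write as a failure, since the standard text does not clearly promise that the write itself is indivisible.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical 64-byte fixed-length record; the thread gives no layout. */
struct txn_record {
    uint64_t id;          /* transaction identifier */
    uint32_t status;      /* e.g. 0 = pending, 1 = done */
    char     payload[52]; /* request details */
};

/* Open (or create) the shared log for appending only. */
int open_log(const char *path)
{
    return open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
}

/*
 * Append one record with a single write() call.
 * fd is assumed to come from open_log(), so the move to end-of-file and
 * the write cannot be separated by another writer's append.
 * Returns 0 only if the whole record was written; a short write or an
 * error returns -1 and the caller must decide how to recover.
 */
int append_record(int fd, const struct txn_record *rec)
{
    assert(rec != NULL);
    ssize_t n = write(fd, rec, sizeof *rec);
    return n == (ssize_t)sizeof *rec ? 0 : -1;
}
```

Keeping each record to one small write() is what makes the stronger behaviour likely in practice, but the sketch does not turn that into a guarantee.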

> --Or are you suggesting there be one transaction log file and one named
>
> Yes one single transaction log file.
>
> --pipe for each possible thread-to-thread set? If so, how will they be
> --established in the first place?

> Two total pipes, one in each direction.

Okay, so a message comes in over a pipe, how does it get to the right
thread -- the one that's waiting for that message?

> --It's hard to analyze a solution without knowing what problem it's
> --supposed to solve. ;)

> I am trying to convert my proprietary OCR software into a
> web application. Initially there will be multiple threads,
> one for each web request, and a single threaded process
> servicing these web requests. Eventually there may be
> multiple threads servicing these web requests.

Seems kind of silly to have a thread for each request that spends most
of its time just waiting for another program. You don't need a thread
to wait. Just assign a thread to do what needs to be done when you're
notified that the request is finished being processed by the OCR
software.

I'm saying, don't have one thread waiting to do X when X is possible
and one waiting to do Y when Y is possible and so on. First, this
wastes a lot of threads. Second, it forces a lot of context switches
to get the "right thread for the job" running.

Instead, have one thread that waits until anything is possible. When
something is possible, it wakes another thread to wait for the next
thing to be possible and it does X, Y, Z, or whatever it was just told
is now possible to do.

This results in far fewer context switches and better utilization of
CPU code and data caches. (Of course, if the web part is an
insignificant fraction of resource usage, it might not matter.)
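
The waiting half of that design can be sketched with poll(), assuming every event source is a file descriptor (a FIFO, a socket and so on). The name `wait_for_any` and the `MAX_SOURCES` limit are hypothetical, and the handoff in which the woken thread promotes another thread to become the waiter is deliberately left out.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

#define MAX_SOURCES 16  /* arbitrary limit for this sketch */

/*
 * Block until any of the n descriptors has something to report.
 * Returns the index of the first ready descriptor, or -1 on timeout
 * or error. The caller then does whatever work that descriptor implies.
 */
int wait_for_any(const int *fds, int n, int timeout_ms)
{
    struct pollfd pfds[MAX_SOURCES];

    assert(n > 0 && n <= MAX_SOURCES);
    for (int i = 0; i < n; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t)n, timeout_ms) <= 0)
        return -1;
    for (int i = 0; i < n; i++)
        if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR))
            return i;
    return -1;
}
```

One thread sitting in `wait_for_any` replaces N threads each blocked on its own descriptor, which is where the saving in threads and context switches would come from.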

DS

From: Scott Lurndal on 1 Apr 2010 20:09

David Schwartz <davids(a)webmaster.com> writes:
>On Apr 1, 9:09 am, "Peter Olcott" <NoS...(a)OCR4Screen.com> wrote:
>
>> The first process appends (O_APPEND flag) transaction
>> records to a transaction log file, and then writes to a
>> named pipe to inform the other process that a transaction is
>> ready for processing. The transaction log file contains all
>> of the details of the transaction as fixed length binary
>> records. Any reads of this file use pread().
>
>Appends are not guaranteed atomic. So each writer would have to have
>its own transaction log file or you'd need some separate mechanism to
>lock them.

A single write or pwrite call on a file with O_APPEND
is required by the SUS to ensure that the write is performed
atomically with respect to other writes to the same file which also have
the O_APPEND flag set. The order, of course, is not
guaranteed.

scott

From: Peter Olcott on 1 Apr 2010 20:24

"David Schwartz" <davids(a)webmaster.com> wrote in message
news:d83fe7cb-609f-4456-a4de-66eca05c211f(a)i25g2000yqm.googlegroups.com...
On Apr 1, 4:23 pm, "Peter Olcott" <NoS...(a)OCR4Screen.com> wrote:

> --Appends are not guaranteed atomic. So each writer would have to have
> --its own transaction log file or you'd need some separate mechanism to
> --lock them.

> You may be correct, but, if you are then two different
> editions of Advanced Programming in the Unix Environment
> would be incorrect:

> First Edition Chapter 3 Section 3.11 Atomic Operations page
> 60-61 Appending to a File
> "Unix provides an atomic way to do this operation if we set
> the O_APPEND flag when a file is opened."

> Second Edition Chapter 3 Section 3.11 Atomic Operations page
> 74 Appending to a File
> "Unix provides an atomic way to do this operation if we set
> the O_APPEND flag when a file is opened."

--You are confusing two different notions of atomicity. Sorry I wasn't
--clearer. A write to a file opened with O_APPEND is atomic in the sense
--that the file position pointer will not move between the notional seek
--and the write. So if two processes each append an "A", this can't
--happen:

--1) Process 1 seeks to the end.
--2) Process 2 seeks to the end.
--3) Process 1 writes an A.
--4) Process 2 writes an A on top of the first A.

--The net effect is only one 'A'. That can't happen.

That may be all that I need.

--However, the issue is whether they're atomic in the sense that the
--write operation itself cannot be interrupted by another
--file-modification operation.

--The standards say: "If the O_APPEND flag of the file status flags is
--set, the file offset shall be set to the end of the file prior to each
--write and no intervening file modification operation shall occur
--between changing the file offset and the write operation."

So the complete append operation cannot be interrupted.

--This seems to guarantee the former atomicity but not the latter. But

I don't see how this does not guarantee all of the atomicity
that I need. Could you propose a concrete example that meets
the standard and causes problems? It seems like it is saying
the entire append must complete before any other file
modifications take place.

--your implementation would require the latter. If there's anything that
--guarantees what you need, I'm not aware of it. I had hashed this out
--in the past but am unable to recall for sure the final resolution. I
--believe it was that it is not formally guaranteed, but that the
--guarantee provided by the standard would be all but useless without
--it.

> --Or are you suggesting there be one transaction log file and one named
>
> Yes one single transaction log file.
>
> --pipe for each possible thread-to-thread set? If so, how will they be
> --established in the first place?

> Two total pipes, one in each direction.

--Okay, so a message comes in over a pipe, how does it get to the right
--thread -- the one that's waiting for that message?

Initially (on the request) there will be only one thread on
the other end, and multiple threads on the sending end.
Eventually there may be multiple threads on the other end,
and it won't matter which one picks it up. The response may
be a little trickier; maybe the thread ID is passed through
in the request.

> --It's hard to analyze a solution without knowing what problem it's
> --supposed to solve. ;)

> I am trying to convert my proprietary OCR software into a
> web application. Initially there will be multiple threads,
> one for each web request, and a single threaded process
> servicing these web requests. Eventually there may be
> multiple threads servicing these web requests.

--Seems kind of silly to have a thread for each request that spends most
--of its time just waiting for another program. You don't need a thread
--to wait. Just assign a thread to do what needs to be done when you're
--notified that the request is finished being processed by the OCR
--software.

This is a fundamental part of the web server that I will be
using. Also I want the request to be acknowledged
immediately. Processing may take quite a while.

--I'm saying, don't have one thread waiting to do X when X is possible
--and one waiting to do Y when Y is possible and so on. First, this
--wastes a lot of threads. Second, it forces a lot of context switches
--to get the "right thread for the job" running.

The web server queues up requests in order of arrival. I
think that this side has to be very responsive or the
connection might die. In any case I don't want the user to
wait until their request is acknowledged. I want at least
the acknowledgement to be as immediate as possible.

--Instead, have one thread that waits until anything is possible. When
--something is possible, it wakes another thread to wait for the next
--thing to be possible and it does X, Y, Z, or whatever it was just told
--is now possible to do.

All this stuff is already implemented in the web server. I
merely must interface with pre-existing code.

It is possible that I could get 1000 requests at once, and
take several minutes to process all of them.

--This results in far fewer context switches and better utilization of
--CPU code and data caches. (Of course, if the web part is an
--insignificant fraction of resource usage, it might not matter.)

Yes insignificant fraction, probably far less than 1%.

--DS

From: Peter Olcott on 1 Apr 2010 20:49

"Scott Lurndal" <scott(a)slp53.sl.home> wrote in message
news:nBatn.1653$OC1.680(a)news.usenetserver.com...
> David Schwartz <davids(a)webmaster.com> writes:
>>On Apr 1, 9:09 am, "Peter Olcott" <NoS...(a)OCR4Screen.com> wrote:
>>
>>> The first process appends (O_APPEND flag) transaction
>>> records to a transaction log file, and then writes to a
>>> named pipe to inform the other process that a transaction is
>>> ready for processing. The transaction log file contains all
>>> of the details of the transaction as fixed length binary
>>> records. Any reads of this file use pread().
>>
>>Appends are not guaranteed atomic. So each writer would have to have
>>its own transaction log file or you'd need some separate mechanism to
>>lock them.
>
> A single write or pwrite call on a file with O_APPEND
> is required by the SUS to ensure that the write is performed
> atomically with respect to other writes to the same file which also have
> the O_APPEND flag set. The order, of course, is not
> guaranteed.
>
> scott

To what extent is the order not guaranteed?
I envision that my second process will need to write to the
same record that was just appended almost immediately. Could
this be an issue?
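
For the in-place update described here, a minimal sketch with pread() and pwrite() on fixed-length records might look like the following. `RECORD_SIZE`, `read_record` and `update_record` are assumed names, and 64 bytes is an arbitrary record size. The updating process is assumed to open the log without O_APPEND: the Linux pwrite(2) manual page documents that on a descriptor opened with O_APPEND, pwrite() appends to the end of the file regardless of the offset given.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 64  /* hypothetical fixed record length in bytes */

/* Read record number `index` into buf (RECORD_SIZE bytes). 0 on success. */
int read_record(int fd, uint64_t index, void *buf)
{
    off_t off = (off_t)index * RECORD_SIZE;
    return pread(fd, buf, RECORD_SIZE, off) == RECORD_SIZE ? 0 : -1;
}

/* Overwrite record number `index` in place. 0 on success, -1 on failure. */
int update_record(int fd, uint64_t index, const void *buf)
{
    assert(buf != NULL);

    /* Refuse O_APPEND descriptors: on Linux pwrite() would append instead. */
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (flags & O_APPEND) {
        errno = EINVAL;
        return -1;
    }

    off_t off = (off_t)index * RECORD_SIZE;
    return pwrite(fd, buf, RECORD_SIZE, off) == RECORD_SIZE ? 0 : -1;
}
```

Because the order of concurrent appends is unspecified, the index passed to these functions has to come from somewhere reliable, for example from the notification sent over the pipe; the sketch does not address how the appender learns it.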

From: Ersek, Laszlo on 1 Apr 2010 21:16

On Thu, 1 Apr 2010, Peter Olcott wrote:

> I don't see how this does not guarantee all of the atomicity that I
> need. Could you propose a concrete example that meets the standard and
> causes problems? It seems like it is saying the entire append must
> complete before any other file modifications take place.

write() itself (with or without O_APPEND) is not required to write all
bytes at once, even to a regular file.

http://www.opengroup.org/onlinepubs/9699919799/functions/write.html

Consider a signal delivered to the thread, a file size limit reached, or
being temporarily out of space on the fs hosting the file; all after some
but not all bytes were written. (These would return -1 and set errno to
EINTR (without SA_RESTART), EFBIG, ENOSPC, respectively, if no data could
have been written before encountering the condition in question.) This
list is not exhaustive.

(Signal delivery is plausible -- suppose you submit a write request of 1G
bytes on a system with little buffer cache and a slow disk. If SSIZE_MAX
equals LONG_MAX, for example, the result of such a request is not even
implementation-defined.)

----v----

Write requests to a pipe or FIFO shall be handled in the same way as a
regular file with the following exceptions:

[...]

* Write requests of {PIPE_BUF} bytes or less shall not be interleaved with
data from other processes doing writes on the same pipe. [...]

* If the O_NONBLOCK flag is clear, a write request may cause the thread to
block, but on normal completion it shall return nbyte.

[...]

----^----

Both quoted guarantees (exclusion of interleaved writes, plus completeness
of writes) are exceptional behavior of pipes in relation to regular files.

Perhaps an interpretation request should be submitted: if write() returns
nbyte and O_APPEND was set, was the block written atomically then?

<http://www.kernel.org/doc/man-pages/online/pages/man2/write.2.html> does
say

----v----

If the file was open(2)ed with O_APPEND, the file offset is first set to
the end of the file before writing. The adjustment of the file offset and
the write operation are performed as an atomic step.

----^----

which seems to imply that the write operation itself is atomic. (... if it
returns "count", in my interpretation.)

lacos
