From: Danmath on 11 Aug 2010 20:31

On 11 Aug, 04:42, David Schwartz <dav...(a)webmaster.com> wrote:
> That won't solve your problem, since another application can open the
> file for write a split second after you open it.

There is only one application reading from the input directory as each application has its own input directory. This is a batch process reading always from the same directory. What you say could happen in a case of open/write/close/open/write/close, which I'm not taking into account, so I'm not particularly worried about this case. Although I did mention in my original post that I was also interested in knowing how to keep other applications away once I opened a file:

I quote myself:

"...although I would be interested in knowing how to keep other applications (not programmed by me) from opening the file once I opened it"

Spilling out a little bit more information.... the current process implements this by reading the last modification time, it waits for a few seconds, then it checks the last modification date again. Of course there is a validation done when the file is opened for processing that checks fopen() doesn't return error, in which case it carries on with the next file, after logging. Now, if I could get this fopen() call to return error if the file is being written to, or replace this call with whatever mechanism for file opening would do this, then the modification time check could be wiped from the program. It wouldn't fix the OWCOWC problem, but the current version doesn't either. I just don't like this modification time checking. If I could open the file knowing it will return error if some other application is no writing to it already, that would be good enough, being able to keep it from being opened in write mode by other applications would be great.

Talking about modification times, when are they set? every time data gets written even without closing the file? What if the program writing the file only writes every 1 minute.... I don't think the current algorithm is reliable if writes can be that sporadic, and waiting for larger amounts of time is a waste of time.

By the way, currently the process uses fread and fwrite to access files. If you were to replace fopen() with another function like open(), how could you instantiate a valid FILE structure associated with the new file descriptor so that the rest of the program doesn't have to be changed to read() and write(), in a secure way?

I agree with you that the only totally foolproof solution requires cooperation if exclusive file opening is not possible. It's just that this would require modifications of various processes and file transmission tools in use, and I'm not saying that it wouldn't be worth it, it's just that I know what happens in this company when I make suggestions that can make better something that isn't causing deaths... "yeah...mmm.. we will look into it...mmm...".
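For reference, the modification-time check described in this post might look roughly like the sketch below. The helper name `mtime_is_stable` and its `wait_seconds` parameter are hypothetical, not taken from the actual program, and the sketch shares the weakness discussed in the thread: a writer that pauses longer than the wait still looks finished.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the file's modification time did not
 * change across wait_seconds, 0 if it changed, and -1 if stat() failed.
 */
int mtime_is_stable(const char *path, unsigned wait_seconds)
{
    struct stat before, after;

    if (stat(path, &before) == -1)
        return -1;

    sleep(wait_seconds);            /* give a writer time to touch the file */

    if (stat(path, &after) == -1)
        return -1;                  /* e.g. the file vanished in between */

    return before.st_mtime == after.st_mtime ? 1 : 0;
}
```

A caller would skip the file for this pass whenever the result is not 1.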
From: Gordon Burditt on 11 Aug 2010 23:24

>There is only one application reading from the input directory as each
>application has its own input directory. This is a batch process

I strongly recommend that the application filter out the files ".", "..", directories, special files, sockets, FIFOs, and anything matching "*.core" as candidates for input files. While you're at it, you could filter out "*.tmp" as well.

>reading always from the same directory. What you say could happen in a
>case of open/write/close/open/write/close, which I'm not taking into
>account, so I'm not particularly worried about this case. Although I
>did mention in my original post that I was also interested in knowing
>how to keep other applications away once I opened a file:
>
>I quote myself:
>
>"...although I would be interested in knowing how to keep other
>applications (not programmed by me) from opening the file once I
>opened it"
>
>Spilling out a little bit more information.... the current process
>implements this by reading the last modification time, it waits for a
>few seconds, then it checks the last modification date again. Of

Assuming the files are being transferred in with FTP, this isn't enough. It is easy for a network hiccup, like one dropped packet, to cause the modification time to get a few seconds old. It could get a minute old before the TCP connection starts giving errors on either end.

Question: with your method of file transfer, imagine a file is halfway transferred, then the network cable is cut. Does the partial file get left there indefinitely, or does the (FTP, perhaps) daemon eventually detect that the transfer has failed and *DELETE* the partial incoming file? If not, can it be made to do so? How quickly do failed partial incoming files vanish? That's a ballpark figure for any timestamp age thresholds.
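A minimal sketch of the filtering recommended earlier in this reply, assuming a POSIX system. The helper name `is_candidate` is hypothetical, and using lstat() (so that symbolic links are skipped too) is an assumption of this sketch rather than something stated in the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <fnmatch.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if dir/name looks like a usable input file. */
int is_candidate(const char *dir, const char *name)
{
    char path[4096];
    struct stat st;
    int n;

    /* skip the directory's own entries */
    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return 0;

    /* skip core dumps and temporary files by name */
    if (fnmatch("*.core", name, 0) == 0 || fnmatch("*.tmp", name, 0) == 0)
        return 0;

    n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return 0;                   /* path would not fit: skip it */

    /* lstat() does not follow symlinks, so links are rejected as well */
    if (lstat(path, &st) == -1)
        return 0;

    /* rejects directories, devices, sockets and FIFOs */
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

The batch process would call this for each name returned by readdir() before attempting to open anything.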
>course there is a validation done when the file is opened for
>processing that checks fopen() doesn't return error, in which case it
>carries on with the next file, after logging. Now, if I could get this
>fopen() call to return error if the file is being written to, or
>replace this call with whatever mechanism for file opening would do
>this, then the modification time check could be wiped from the program.

"the file is being written to" is not the condition you are looking for, especially if the network is slower than the disk. It will "flicker" on and off while the file is being transferred.

>It wouldn't fix the OWCOWC problem, but the current version doesn't
>either. I just don't like this modification time checking. If I could
>open the file knowing it will return error if some other application
>is no writing to it already,

You want it to return error if some other application is *NOT* writing to it already?

>that would be good enough, being able to
>keep it from being opened in write mode by other applications would be
>great.
>Talking about modification times, when are they set? every time data
>gets written even without closing the file?

Yes. That's write() on the receiving end.

>What if the
>program writing the file only writes every 1 minute....

Then the modification time can easily get to be a minute old (watch the modification time on a log file sometime). With a network file transfer, this can easily happen before the sending process gets an error, if it ever gets an error. The network congestion could pass and the file transfer could eventually be completed, well after your process read what it thought was a complete file (but wasn't).
If the sending process inserts long pauses between sending data (perhaps it's reading data from something really slow, like a card reader manually loaded by a human operator who has to order the correct card deck from overseas, after getting budget approval to continue, and the operator has to be replaced often because they keep dying of boredom), the modification date could get really, really, really old, like days, months, years, or decades while the file is still open for write.

>I don't think
>the current algorithm is reliable if writes can be that sporadic, and
>waiting for larger amounts of time is a waste of time.

If it's unreliable, then waiting for longer amounts of time is not wasteful.

>By the way, currently the process uses fread and fwrite to access
>files. If you were to replace fopen() with another function like
>open(), how could you instantiate a valid FILE structure associated with
>the new file descriptor so that the rest of the program doesn't have to
>be changed to read() and write(), in a secure way?

Look up fdopen(). The point of this would most likely be to call open() with various exotic flags, then proceed with the file copy using stdio functions if the open succeeded. You can also go the other way with fileno() to get the underlying file descriptor number if, say, you want to put locks on it after fopen()ing it.

>I agree with you that the only totally foolproof solution requires
>cooperation if exclusive file opening is not possible. It's just that
>this would require modifications of various processes and file
>transmission tools in use, and I'm not saying that it wouldn't be
>worth it, it's just that I know what happens in this company when I
>make suggestions that can make better something that isn't causing
>deaths... "yeah...mmm.. we will look into it...mmm...".

*What* file transfer tools are in use?
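The fdopen() and fileno() routes mentioned above can be sketched as follows, assuming a POSIX system. The helper names `open_as_stream` and `try_read_lock` are hypothetical, and plain O_RDONLY stands in for whatever "exotic" open() flags the platform offers.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open() first, then wrap the descriptor in a FILE. */
FILE *open_as_stream(const char *path)
{
    int fd = open(path, O_RDONLY);  /* platform-specific flags would go here */
    FILE *fp;

    if (fd == -1)
        return NULL;

    fp = fdopen(fd, "r");           /* mode must be compatible with the open() flags */
    if (fp == NULL)
        close(fd);                  /* fdopen() failed: do not leak the descriptor */
    return fp;
}

/* Hypothetical helper: the other direction, locking an already opened stream. */
int try_read_lock(FILE *fp)
{
    struct flock fl = {0};

    fl.l_type = F_RDLCK;            /* shared lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                   /* 0 means "to end of file" */

    /* F_SETLK returns -1 at once instead of waiting if the lock is held */
    return fcntl(fileno(fp), F_SETLK, &fl);
}
```

The rest of the program can keep using fread() and fclose() on the returned stream; fclose() also closes the underlying descriptor. Note that fcntl() record locks are normally advisory, so they only help against writers that take locks themselves.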
I wonder if this could be handled by just modifying the FTP daemon *on the receiving end* (the receiving machine, only) to put exclusive locks on files while they are being transferred, *regardless* of what's on the other end of the connection. It could be as simple as adding O_EXLOCK to the open flags for opening the received file. Remember that only the receiving machine has to support that.

I still think the transfer-and-rename approach has a lot to be said for it. That could also include initially transferring the file into a subdirectory, then renaming it to the top-level directory.

Another approach, used by things like print spoolers and UUCP, is to transfer one or more data files, then transfer a "job" file which names the data files to use and what to do with them. The "job" file doesn't get created until the associated data files have been transferred successfully. The "job" file also tends to be very short (fits in one packet, contains a few lines mostly consisting of filename(s)).
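A minimal sketch of the transfer-and-rename idea from the writer's side. The helper name `write_then_rename` and the ".tmp" suffix are assumptions of this sketch. It relies on rename() replacing the name in one step when both paths are on the same filesystem, so a reader that ignores "*.tmp" names should never see a partial file.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write the data under a temporary name, then
 * rename it to its final name. Returns 0 on success, -1 on failure.
 */
int write_then_rename(const char *final_path, const void *data, size_t len)
{
    char tmp_path[4096];
    FILE *fp;
    int n;

    n = snprintf(tmp_path, sizeof tmp_path, "%s.tmp", final_path);
    if (n < 0 || (size_t)n >= sizeof tmp_path)
        return -1;

    fp = fopen(tmp_path, "wb");
    if (fp == NULL)
        return -1;

    if (fwrite(data, 1, len, fp) != len) {
        fclose(fp);
        remove(tmp_path);           /* do not leave a partial file behind */
        return -1;
    }
    if (fclose(fp) != 0) {          /* flush errors show up here */
        remove(tmp_path);
        return -1;
    }

    /* only now does the file appear under the name the reader looks for */
    if (rename(tmp_path, final_path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

The subdirectory variant works the same way, with the temporary path pointing into the subdirectory instead of carrying a suffix.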
From: Rainer Weikusat on 12 Aug 2010 10:32

Danmath <danmath06(a)gmail.com> writes:
> On 11 Aug, 04:42, David Schwartz <dav...(a)webmaster.com> wrote:
>> That won't solve your problem, since another application can open the
>> file for write a split second after you open it.
>
> There is only one application reading from the input directory as each
> application has its own input directory. This is a batch process
> reading always from the same directory. What you say could happen in a
> case of open/write/close/open/write/close, which I'm not taking into
> account, so I'm not particularly worried about this case. Although I
> did mention in my original post that I was also interested in knowing
> how to keep other applications away once I opened a file:
>
> I quote myself:
>
> "...although I would be interested in knowing how to keep other
> applications (not programmed by me) from opening the file once I
> opened it"
>
> Spilling out a little bit more information.... the current process
> implements this by reading the last modification time, it waits for a
> few seconds, then it checks the last modification date again. Of
> course there is a validation done when the file is opened for
> processing that checks fopen() doesn't return error, in which case it
> carries on with the next file, after logging. Now, if I could get this
> fopen() call to return error if the file is being written to, or
> replace this call with whatever mechanism for file opening would do
> this, then the modification time check could be wiped from the
> program.

I assume that you are targeting Linux since you are using it for posting. Have you considered taking really drastic measures such as 'consulting the reference documentation'? This can be done by trying to acquire a write lease on this file and the detailed description of how to do that is in the fcntl manpage. Excerpt:

    F_WRLCK
        Take out a write lease. This will cause the caller to be
        notified when the file is opened for reading or writing or is
        truncated. A write lease may be placed on a file only if there
        are no other open file descriptors for the file.

#define _GNU_SOURCE

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* report the failed call and exit; returns int so it can follow || */
static int die(char const *msg)
{
    perror(msg);
    exit(1);
    return -1;
}

int main(void)
{
    pid_t pid;
    int fd;

    /* parent: keep the file open for writing */
    fd = open("file", O_RDWR | O_CREAT, 0666);
    fd != -1 || die("open/ create");

    pid = fork();
    pid != -1 || die("fork");
    if (pid == 0) {
        /* child: drop the inherited descriptor, reopen read-only */
        close(fd);
        fd = open("file", O_RDONLY | O_NONBLOCK, 0);
        fd != -1 || die("open/ read");

        /* retry until the lease is granted, i.e. until the parent is gone */
        while (fcntl(fd, F_SETLEASE, F_RDLCK) == -1) {
            perror("fcntl");
            sleep(1);
        }

        write(1, "I kid you not!", sizeof("I kid you not!") - 1);
        _exit(0);
    }

    /* parent: hold the file open for 5 seconds, then exit */
    sleep(5);
    return 0;
}
From: Rainer Weikusat on 12 Aug 2010 10:34

Rainer Weikusat <rweikusat(a)mssgmbh.com> writes:

[...]

> by trying to acquire a write lease on this file

[...]

> while (fcntl(fd, F_SETLEASE, F_RDLCK) == -1) {

This should of course have been 'a read lease', as done in the sample code.
From: Danmath on 12 Aug 2010 11:24
On 12 Aug, 11:32, Rainer Weikusat <rweiku...(a)mssgmbh.com> wrote:
> I assume that you are targeting Linux since you are using it for
> posting.

No. I post from personal computers, running Windows in this case. This application runs on a server, AIX currently. But I don't want to program specifically for a single OS.