From: Chris Friesen on 12 May 2010 15:53

On 05/12/2010 01:33 PM, Rainer Weikusat wrote:
> Chris Friesen <cbf123(a)mail.usask.ca> writes:
>> On 05/10/2010 01:01 PM, yirgster wrote:
>>
>>> But, from a "legal" standpoint,
>>>
>>> (1) this isn't required behavior by posix, that is, that fsync(fd)
>>> sync all mmap'd(fd) memory too.
>>
>> Contrary to Rainer, I think it actually might be implied by posix, and
>> that's why the various OS's have changed their behaviour.
>>
>> The posix language reads "all data for the open file descriptor named by
>> fildes is to be transferred to the storage device associated with the
>> file described by fildes." Arguably, memory ranges mmap'd from a file
>> is "data for the open file descriptor".
>
> The situation isn't that simple, eg, it is legal to close a file
> descriptor after it was used to establish a memory mapping and to
> continue using the mapping. Assuming that the file is later reopened,
> is whatever the existing memory mapping contains necessarily 'data for
> the new file descriptor' (or only if the implementation happens to
> have a unified cache)?

I agree that the wording is a bit unclear, but they left it that way on
purpose. From the posix rationale:

"The fsync() function is intended to force a physical write of data from
the buffer cache, and to assure that after a system crash or other
failure that all data up to the time of the fsync() call is recorded on
the disk. Since the concepts of "buffer cache", "system crash",
"physical write", and "non-volatile storage" are not defined here, the
wording has to be more abstract."

Based on the above, I see no reason to treat data modified via memory
mappings any differently from data written by a write() syscall.

That said, if _POSIX_SYNCHRONIZED_IO is not defined, the spec explicitly
allows a null implementation of fsync()...but it must be documented in
the conformance document.

Chris
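
The _POSIX_SYNCHRONIZED_IO point can be probed at run time. The following
is a minimal sketch, not a definitive test: the function names and the
scratch file under /tmp are hypothetical, and a zero return from fsync()
says nothing by itself about whether the implementation is a null one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if the Synchronized Input and Output option is reported as
 * supported at run time, 0 otherwise. */
int have_synchronized_io(void)
{
#ifdef _SC_SYNCHRONIZED_IO
    return sysconf(_SC_SYNCHRONIZED_IO) > 0;
#else
    return 0;
#endif
}

/* Writes a few bytes to a scratch file and calls fsync() on it.
 * Returns 0 if every step succeeded, -1 otherwise. */
int write_then_fsync(void)
{
    char path[] = "/tmp/fsync_demo_XXXXXX";   /* hypothetical scratch file */
    int rc = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (write(fd, "data", 4) == 4 && fsync(fd) == 0)
        rc = 0;
    close(fd);
    unlink(path);
    return rc;
}
```

When have_synchronized_io() returns 0, the conformance document is the
only place that says what fsync() actually does on that system.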
From: yirgster on 12 May 2010 16:29

On May 12, 12:33 pm, Rainer Weikusat <rweiku...(a)mssgmbh.com> wrote:
> Chris Friesen <cbf...(a)mail.usask.ca> writes:
> > On 05/10/2010 01:01 PM, yirgster wrote:
>
> >> But, from a "legal" standpoint,
>
> >> (1) this isn't required behavior by posix, that is, that fsync(fd)
> >> sync all mmap'd(fd) memory too.
>
> > Contrary to Rainer, I think it actually might be implied by posix, and
> > that's why the various OS's have changed their behaviour.
>
> > The posix language reads "all data for the open file descriptor named by
> > fildes is to be transferred to the storage device associated with the
> > file described by fildes." Arguably, memory ranges mmap'd from a file
> > is "data for the open file descriptor".
>
> The situation isn't that simple, eg, it is legal to close a file
> descriptor after it was used to establish a memory mapping and to
> continue using the mapping. Assuming that the file is later reopened,
> is whatever the existing memory mapping contains necessarily 'data for
> the new file descriptor' (or only if the implementation happens to
> have a unified cache)?

Once msync(MS_SYNC) has returned, the data would have had to make it out
to disk, so it will be seen by any open() and file access that follows.

I've assumed all along that we've been talking about
mmap(... MAP_SHARED ...).
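
The scenario in this exchange (map the file, close the descriptor, modify
through the mapping, msync(MS_SYNC), then reopen and read) can be sketched
as below. This is an illustration under assumptions: the function name and
the scratch path are hypothetical, and on a system with a unified cache the
read() would likely see the change even without the msync() call, so the
sketch cannot prove that msync() was what made it visible.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a change made through a MAP_SHARED mapping, after the
 * original descriptor was closed, is seen by read() on a newly opened
 * descriptor once msync(MS_SYNC) has returned; 0 if it is not seen;
 * -1 on any setup error. */
int shared_change_seen_after_msync(void)
{
    char path[] = "/tmp/msync_demo_XXXXXX";   /* hypothetical scratch file */
    const size_t len = 4096;
    char buf[4];
    int result = -1;
    char *map = MAP_FAILED;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    if (ftruncate(fd, (off_t)len) == 0)
        map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */

    if (map != MAP_FAILED) {
        memcpy(map, "mmap", 4);     /* modify the file through the mapping */
        if (msync(map, len, MS_SYNC) == 0) {
            int fd2 = open(path, O_RDONLY);   /* a brand-new descriptor */
            if (fd2 >= 0) {
                if (read(fd2, buf, sizeof buf) == (ssize_t)sizeof buf)
                    result = (memcmp(buf, "mmap", 4) == 0);
                close(fd2);
            }
        }
        munmap(map, len);
    }
    unlink(path);
    return result;
}
```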
From: Ersek, Laszlo on 12 May 2010 18:12

On Wed, 12 May 2010, Rainer Weikusat wrote:

> Chris Friesen <cbf123(a)mail.usask.ca> writes:
>> On 05/10/2010 01:01 PM, yirgster wrote:
>>
>>> But, from a "legal" standpoint,
>>>
>>> (1) this isn't required behavior by posix, that is, that fsync(fd)
>>> sync all mmap'd(fd) memory too.
>>
>> Contrary to Rainer, I think it actually might be implied by posix, and
>> that's why the various OS's have changed their behaviour.
>>
>> The posix language reads "all data for the open file descriptor named
>> by fildes is to be transferred to the storage device associated with
>> the file described by fildes." Arguably, memory ranges mmap'd from a
>> file is "data for the open file descriptor".
>
> The situation isn't that simple, eg, it is legal to close a file
> descriptor after it was used to establish a memory mapping and to
> continue using the mapping. Assuming that the file is later reopened, is
> whatever the existing memory mapping contains necessarily 'data for the
> new file descriptor' (or only if the implementation happens to have a
> unified cache)?

I don't think so.

POSIX very carefully distinguishes file descriptor from file description
from file. The language quoted above is "all data for the open file
descriptor". Ie. the distinction is made on the most specific (least
shared) level. If you dup()licate a file descriptor, you get a new
descriptor referring to the same open file description [0] [1]. But
fsync() only needs to synchronize changes made through the exact file
descriptor that is passed to it.

If the spec went a single level deeper, ie. to file description, that
would require an fsync() call issued by process A to synchronize changes
made by process B with write(), for which B used a descriptor that it
inherited from A through a series of fork()s and exec()s, or one that it
received over a UNIX domain socket with SCM_RIGHTS.

(Btw, I found only one mention of "SCM_RIGHTS" in SUSv4 [2], and it only
"Indicates that the data array contains the access rights to be sent or
received." The Linux manual is more specific [3]: it not only mentions
that the "access rights" are file descriptors, but it also states that
SCM_RIGHTS is effectively a cross-process dup().)

Therefore, it seems to me, once you close a file descriptor, you may lose
any opportunity to fsync() the changes made through it.

I can't imagine that fsync() -- being permitted to ignore any changes made
through a different file descriptor -- would be *required* to care about
modifications performed through something that is not even a file
description.

In closing, if you don't mind, I'll quote myself; it seems relevant to
some extent.

----v----
Date: Fri, 2 Apr 2010 20:58:22 +0200
From: "Ersek, Laszlo" <lacos(a)caesar.elte.hu>
Newsgroups: comp.programming.threads, comp.unix.programmer,
     comp.os.linux.development.system, comp.os.linux.development.apps
Subject: Re: IPC based on name pipe FIFO and transaction log file
Message-ID: <Pine.LNX.4.64.1004021950500.19039(a)login01.caesar.elte.hu>

[snip]

Would anybody please validate the following table?

+-------------+----------------------------------------------------------------+
| change made | change visible via                                             |
| through     +----------------------------+-------------+---------------------+
|             | MAP_SHARED                 | MAP_PRIVATE | read()              |
+-------------+----------------------------+-------------+---------------------+
| MAP_SHARED  | yes                        | unspecified | depends on MS_SYNC, |
|             |                            |             | MS_ASYNC, or normal |
|             |                            |             | system activity     |
+-------------+----------------------------+-------------+---------------------+
| MAP_PRIVATE | no                         | no          | no                  |
+-------------+----------------------------+-------------+---------------------+
| write()     | depends on MS_INVALIDATE,  | unspecified | yes                 |
|             | or the system's read/write |             |                     |
|             | consistency                |             |                     |
+-------------+----------------------------+-------------+---------------------+
----^----

Cheers,
lacos

[0] http://www.opengroup.org/onlinepubs/9699919799/functions/dup.html
[1] http://www.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
[2] http://www.opengroup.org/onlinepubs/9699919799/basedefs/sys_socket.h.html
[3] http://www.kernel.org/doc/man-pages/online/pages/man7/unix.7.html
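
The descriptor versus description distinction drawn in this post shows up
most visibly in the file offset, which belongs to the open file
description. A minimal sketch, assuming hypothetical function names and a
scratch file under /tmp: a dup()licated descriptor shares the offset, while
a separate open() of the same file gets its own description and its own
offset. It illustrates only the sharing, not what fsync() is required to
flush.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if a descriptor obtained with dup() observes the offset
 * advanced by a write() on the original descriptor; -1 on setup error. */
int dup_shares_offset(void)
{
    char path[] = "/tmp/dup_demo_XXXXXX";     /* hypothetical scratch file */
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    int fd2 = dup(fd);              /* same open file description */
    if (fd2 >= 0) {
        if (write(fd, "abc", 3) == 3)
            result = (lseek(fd2, 0, SEEK_CUR) == 3);
        close(fd2);
    }
    close(fd);
    unlink(path);
    return result;
}

/* Returns 1 if a descriptor from a second open() of the same file still
 * sits at offset 0 after a write() on the first; -1 on setup error. */
int open_has_own_offset(void)
{
    char path[] = "/tmp/open_demo_XXXXXX";    /* hypothetical scratch file */
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    int fd2 = open(path, O_RDONLY); /* new, independent file description */
    if (fd2 >= 0) {
        if (write(fd, "abc", 3) == 3)
            result = (lseek(fd2, 0, SEEK_CUR) == 0);
        close(fd2);
    }
    close(fd);
    unlink(path);
    return result;
}
```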
From: yirgster on 12 May 2010 19:41

On May 12, 12:53 pm, Chris Friesen <cbf...(a)mail.usask.ca> wrote:
> On 05/12/2010 01:33 PM, Rainer Weikusat wrote:
>
> > Chris Friesen <cbf...(a)mail.usask.ca> writes:
> >> On 05/10/2010 01:01 PM, yirgster wrote:
>
> >>> But, from a "legal" standpoint,
>
> >>> (1) this isn't required behavior by posix, that is, that fsync(fd)
> >>> sync all mmap'd(fd) memory too.
>
> >> Contrary to Rainer, I think it actually might be implied by posix, and
> >> that's why the various OS's have changed their behaviour.
>
> >> The posix language reads "all data for the open file descriptor named by
> >> fildes is to be transferred to the storage device associated with the
> >> file described by fildes." Arguably, memory ranges mmap'd from a file
> >> is "data for the open file descriptor".
>
> > The situation isn't that simple, eg, it is legal to close a file
> > descriptor after it was used to establish a memory mapping and to
> > continue using the mapping. Assuming that the file is later reopened,
> > is whatever the existing memory mapping contains necessarily 'data for
> > the new file descriptor' (or only if the implementation happens to
> > have a unified cache)?
>
> I agree that the wording is a bit unclear, but they left it that way on
> purpose. From the posix rationale:
>
> "The fsync() function is intended to force a physical write of data from
> the buffer cache, and to assure that after a system crash or other
> failure that all data up to the time of the fsync() call is recorded on
> the disk. Since the concepts of "buffer cache", "system crash",
> "physical write", and "non-volatile storage" are not defined here, the
> wording has to be more abstract."
>
> Based on the above, I see no reason to treat data modified via memory
> mappings any different than data written by a write() syscall.
>
> That said, if _POSIX_SYNCHRONIZED_IO is not defined, the spec explicitly
> allows a null implementation of fsync()...but it must be documented in
> the compliance document.
>
> Chris

I still don't think it's proven, since "all data up to the time of the
fsync() call" seems to be conditioned on the preceding phrase "physical
write of data from the buffer cache."

So, we're back to the unified buffer cache issue.
From: yirgster on 12 May 2010 20:22

lacos writes:

> [ snip ]

> POSIX very carefully distinguishes file descriptor from file description
> from file. The language quoted above is "all data for the open file
> descriptor". Ie. the distinction is made on the most specific (least
> shared) level. If you dup()licate a file descriptor, you get a new
> descriptor referring to the same open file description [0] [1]. But
> fsync() only needs to synchronize changes made through the exact file
> descriptor that is passed to it.

I agree with this reading.

You know, looking at the discussion this issue has engendered, and
assuming yours is an absolutely correct reading based on the writing (as
I think it is), it still should have been clarified more explicitly in
the doc, e.g., "It doesn't apply to other fd's even in the same process."
I mean, the purpose is understanding and clarity, no? Not Talmudic
scholarship.

> [snip socket stuff -- I have no idea]

> Therefore, it seems to me, once you close a file descriptor, you may lose
> any opportunity to fsync() the changes made through it.

Yes, I agree with this too.

> I can't imagine that fsync() -- being permitted to ignore any changes made
> through a different file descriptor -- would be *required* to care about
> modifications performed through something that is not even a file
> description.

Sounds correct. But it's not relevant to the question of whether fsync()
on an fd implies syncing the memory mmap()'d from that same fd.

> In closing, if you don't mind, I'll quote myself; it seems relevant to
> some extent.

I rarely mind advertisements for myself. Even from others. I do it all
the time.

> Would anybody please validate the following table?

Validate your table? Am I sufficiently trustworthy (forget
knowledgeable)?

> +-------------+----------------------------------------------------------------+
> | change made | change visible via                                             |
> | through     +----------------------------+-------------+---------------------+
> |             | MAP_SHARED                 | MAP_PRIVATE | read()              |
> +-------------+----------------------------+-------------+---------------------+
> | MAP_SHARED  | yes                        | unspecified | depends on MS_SYNC, |
> |             |                            |             | MS_ASYNC, or normal |
> |             |                            |             | system activity     |
> +-------------+----------------------------+-------------+---------------------+
> | MAP_PRIVATE | no                         | no          | no                  |
> +-------------+----------------------------+-------------+---------------------+
> | write()     | depends on MS_INVALIDATE,  | unspecified | yes                 |
> |             | or the system's read/write |             |                     |
> |             | consistency                |             |                     |
> +-------------+----------------------------+-------------+---------------------+

Well, I'm not sure I understand your table completely. But here goes:

Under MAP_PRIVATE, 2nd row, I don't understand the qualifications. It
simply seems to me: unspecified. From the mmap() page: "It is unspecified
whether modifications to the underlying object done after the MAP_PRIVATE
mapping is established are visible through the MAP_PRIVATE mapping." So
what would MS_SYNC and MS_ASYNC have to do with it?

MS_INVALIDATE: there's a reality problem here, I believe. From reading
other posts on this subject from around 2002-2004, it's basically a no-op
in some of the OSes (Linux? - I can't look at the source now.) Also, it
would be pretty hard to test, no? Isn't it the same race condition as the
one between, say, the store buffers and memory cache consistency, recently
discussed in the threads group?

Speaking of reality (but why should this interfere with our thinking), I
know of a case (i.e., I actually saw it, I'm not talking theoretically)
in which Linux (at that time at least) did not meet the posix spec in one
instance. I keep thinking it was in zeroing out the last page of the file
correctly. But this seems too obvious. Whatever it was, it worked properly
on Solaris, AIX, and HP. I saw it.

Well, got to show some motion at work. Hope you're not so unfortunate.
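
The MAP_PRIVATE row of the table ("change made through MAP_PRIVATE" is not
visible via read()) can be checked with a short sketch. The function name
and scratch path are hypothetical; the sketch covers only that one cell and
says nothing about the "unspecified" cells or about MS_INVALIDATE.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a store through a MAP_PRIVATE mapping leaves the bytes
 * returned by pread() on the file unchanged; 0 if the file changed;
 * -1 on any setup error. */
int private_change_hidden_from_read(void)
{
    char path[] = "/tmp/private_demo_XXXXXX"; /* hypothetical scratch file */
    char buf[4];
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (write(fd, "orig", 4) == 4) {
        char *map = mmap(NULL, 4, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE, fd, 0);
        if (map != MAP_FAILED) {
            memcpy(map, "priv", 4); /* modifies the private copy only */
            if (pread(fd, buf, sizeof buf, 0) == (ssize_t)sizeof buf)
                result = (memcmp(buf, "orig", 4) == 0);
            munmap(map, 4);
        }
    }
    close(fd);
    unlink(path);
    return result;
}
```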