From: Phil on 21 Jun 2008 11:46

Dear Experts,

I have a program which mmap()s its read-mostly data file.

If I run two instances of the program concurrently, I want the changes made by one to be visible to the other. So I call mmap with the MAP_SHARED flag.

After I make a change to the data I call msync(). Since I'm using MAP_SHARED I don't believe that msync() should be necessary in order for the other instance to see the changes; however, I also want to ensure that the change is stored on the disk so that it won't be lost if the program terminates and msync() seems to be the right way to do this.

According to the Linux man page for msync(), there's a flag MS_INVALIDATE that I can pass to it that "asks to invalidate other mappings of the same file (so that they can be updated with the fresh values just written)". [I should say that I'm running this on Linux, but portable code is always better.] This seems to be suggesting that if I don't set this flag, other mappings of the file (i.e. presumably mappings in other processes) won't see the new values. But that's not what MAP_SHARED is supposed to do, is it?

Here's the POSIX description of MS_INVALIDATE:

"When MS_INVALIDATE is specified, msync() shall invalidate all cached copies of mapped data that are inconsistent with the permanent storage locations such that subsequent references shall obtain data that was consistent with the permanent storage locations sometime between the call to msync() and the first subsequent memory reference to the data."

What seems to happen in practice is that, after I msync(MS_INVALIDATE) the small range of pages that I've changed, all of the rest of the file's pages are lost from the cache; when I subsequently read the file it brings it in from the disk again [and this is how I first noticed that something was wrong, as I experienced a peculiar pause while it did this].

Does anyone know what's going on with these calls?

Thanks,

Phil.
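For readers following along, the pattern described in the post above (map the file MAP_SHARED, change a few bytes, then msync() only the affected range) might look roughly like the sketch below. This is not Phil's code: the names `sync_range` and `update_and_sync_demo` and the scratch file under `/tmp` are made up for illustration. The one detail worth noting is that msync() expects a page-aligned start address, so the helper rounds down before calling it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: flush the pages covering [p, p + len) of a shared
 * mapping to the file. msync() wants a page-aligned start address, so the
 * start is rounded down to a page boundary and the length extended to match. */
static int sync_range(const void *p, size_t len)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)p & ~(page - 1);
    size_t span = (size_t)((uintptr_t)p - start) + len;
    return msync((void *)start, span, MS_SYNC);
}

/* Hypothetical demo: create a scratch file, map it MAP_SHARED, change a few
 * bytes in the middle, sync just that range, then read the bytes back
 * through the file descriptor. Returns 0 on success, -1 on any failure. */
int update_and_sync_demo(void)
{
    char path[] = "/tmp/msync_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */

    size_t size = 4 * (size_t)sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    const char msg[] = "changed";
    size_t off = size / 2 + 10; /* deliberately not page-aligned */
    memcpy(map + off, msg, sizeof msg);
    int rc = sync_range(map + off, sizeof msg);

    char check[sizeof msg] = {0};
    if (rc == 0 &&
        pread(fd, check, sizeof check, (off_t)off) != (ssize_t)sizeof check)
        rc = -1;
    if (rc == 0 && memcmp(check, msg, sizeof msg) != 0)
        rc = -1;

    munmap(map, size);
    close(fd);
    return rc;
}
```

Calling `update_and_sync_demo()` from a small `main()` is enough to try it. MS_SYNC is used here because the stated goal is durability; MS_ASYNC would only schedule the write.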
From: Moi on 21 Jun 2008 12:43

On Sat, 21 Jun 2008 16:46:03 +0100, Phil wrote:

> Dear Experts,
>
> I have a program which mmap()s its read-mostly data file.
>
> If I run two instances of the program concurrently, I want the changes
> made by one to be visible to the other. So I call mmap with the
> MAP_SHARED flag.
>
> After I make a change to the data I call msync(). Since I'm using
> MAP_SHARED I don't believe that msync() should be necessary in order for
> the other instance to see the changes; however, I also want to ensure
> that the change is stored on the disk so that it won't be lost if the
> program terminates and msync() seems to be the right way to do this.

The msync should not be necessary. Both programs have the *same* buffer mapped into their address space. They should see the same data.

> According to the Linux man page for msync(), there's a flag
> MS_INVALIDATE that I can pass to it that "asks to invalidate other
> mappings of the same file (so that they can be updated with the fresh
> values just written)". [I should say that I'm running this on Linux,
> but portable code is always better.] This seems to be suggesting that
> if I don't set this flag, other mappings of the file (i.e. presumably
> mappings in other processes) won't see the new values. But that's not
> what MAP_SHARED is supposed to do, is it?
>
> Here's the POSIX description of MS_INVALIDATE:
>
> "When MS_INVALIDATE is specified, msync() shall invalidate all cached
> copies of mapped data that are inconsistent with the permanent storage
> locations such that subsequent references shall obtain data that was
> consistent with the permanent storage locations sometime between the
> call to msync() and the first subsequent memory reference to the data."
>
> What seems to happen in practice is that, after I msync(MS_INVALIDATE)
> the small range of pages that I've changed, all of the rest of the
> file's pages are lost from the cache; when I subsequently read the file
> it brings it in from the disk again [and this is how I first noticed
> that something was wrong, as I experienced a peculiar pause while it did
> this].

As I understand it, the MS_INVALIDATE flag just means:

if there is a buffer present for the affected pages:
    if (this buffer is dirty) {
        - mark this buffer as NON_DIRTY
        - mark this buffer as NON_VALID.
    } else {
        -- do nothing
    }

Marking the buffer invalid will cause it to be read in from disk whenever it is referenced again. Marking it as non-dirty will cause any changes to the page to be lost. (But they *might* have been written to disk *before* the msync(..., MS_INVALIDATE) call; the system may write to backing storage whenever it wishes.)

HTH,
AvK
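The claim that both programs see the same buffer can be checked with a small experiment. The sketch below is an illustration only, under one simplifying assumption: two separate MAP_SHARED mappings of the same file inside a single process stand in for the two program instances. The function name `shared_mappings_agree` and the scratch file path are hypothetical. No msync() is called between the store and the load.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: map one file twice with MAP_SHARED, store through the
 * first mapping and read through the second, with no msync() in between.
 * Returns 1 if the second mapping sees the store, 0 if it does not,
 * and -1 if setup fails. */
int shared_mappings_agree(void)
{
    char path[] = "/tmp/shared_map_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    /* Writer's view and reader's view of the same file page. */
    char *a = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mappings stay valid after the descriptor is closed */

    int result = -1;
    if (a != MAP_FAILED && b != MAP_FAILED) {
        strcpy(a, "hello");
        result = (strcmp(b, "hello") == 0);
    }

    if (a != MAP_FAILED)
        munmap(a, size);
    if (b != MAP_FAILED)
        munmap(b, size);
    return result;
}
```

With two real processes the same visibility should hold, but the writer and reader would then also need some form of synchronisation (a semaphore or similar) to agree on when the data is ready.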
From: guenther on 21 Jun 2008 17:55

On Jun 21, 9:46 am, Phil <spam_from_usene...(a)chezphil.org> wrote:

> If I run two instances of the program concurrently, I want the
> changes made by one to be visible to the other. So I call
> mmap with the MAP_SHARED flag.
>
> After I make a change to the data I call msync(). Since I'm
> using MAP_SHARED I don't believe that msync() should be
> necessary in order for the other instance to see the changes;

Agreed.

> however, I also want to ensure that the change is stored on
> the disk so that it won't be lost if the program terminates
> and msync() seems to be the right way to do this.

Actually, I believe that should be unnecessary. Writes to mappings obtained via mmap(MAP_SHARED) should be visible in other processes under the same rules that cover when writes made in one thread are visible to another thread in the same process, ala:

http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_10

msync(MS_ASYNC) and msync(MS_SYNC) are for making sure that changes are written to permanent storage, ala fsync() or fdatasync().

> According to the Linux man page for msync(), there's a flag
> MS_INVALIDATE that I can pass to it that "asks to invalidate
> other mappings of the same file (so that they can be updated
> with the fresh values just written)". <...> This seems to be
> suggesting that if I don't set this flag, other mappings of the
> file (i.e. presumably mappings in other processes) won't see
> the new values. But that's not what MAP_SHARED is supposed
> to do, is it?

The Linux description seems 'off' to me. My expectation from the SUSv3 description and from experience is that for a MAP_SHARED mapping of a normal file, msync(MS_INVALIDATE) is only necessary if changes are being made to the file using interfaces other than a shared mapping. I.e., if process A uses write() to change the file contents, then process B might not see the change in its shared mapping until it invalidates the 'cached contents' using msync(). Changes made to the file using other shared mappings should be visible immediately.

At least that's been my experience across the following platforms: Linux, Solaris, AIX, OpenBSD, FreeBSD. Most of those have shared buffer caches such that msync(MS_INVALIDATE) is never actually needed for shared mappings: changes made via write() are instantly visible via shared mappings.

If my memory serves, the real exception was HP-UX on PA-RISC, where there was NO SUPPORTED METHOD to make a change made via write() visible to a shared mapping. Completely unmapping the file and remapping it could leave you with old data in the mapping! There were other horrible restrictions on shared mappings (they had to have the same VM address *in all processes*) that made using mmap() there completely unworkable for files that needed to grow while being shared.

Philip Guenther
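The write()-versus-mapping case described above can be sketched as follows. This is only an illustration: the function name `write_visible_in_mapping` and the scratch file path are hypothetical, and pwrite() on the same descriptor stands in for "process A uses write()". The msync(MS_SYNC | MS_INVALIDATE) call is the portable precaution suggested by the SUSv3 wording; on systems with a unified buffer cache it should make no observable difference.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: change a file with pwrite() and then look at it through
 * an existing MAP_SHARED mapping, invalidating cached copies in between.
 * Returns 1 if the mapping shows the new bytes, 0 if it shows stale data,
 * and -1 if any call fails. */
int write_visible_in_mapping(void)
{
    char path[] = "/tmp/write_vs_map_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Touch the page first so the mapping holds a copy of the old contents. */
    int result = (map[0] == '\0') ? 0 : -1;

    static const char msg[] = "via write()";
    if (result == 0 && pwrite(fd, msg, sizeof msg, 0) != (ssize_t)sizeof msg)
        result = -1;

    /* Ask the system to drop cached copies that disagree with the file. */
    if (result == 0 && msync(map, size, MS_SYNC | MS_INVALIDATE) != 0)
        result = -1;

    if (result == 0)
        result = (memcmp(map, msg, sizeof msg) == 0);

    munmap(map, size);
    close(fd);
    return result;
}
```

On the platforms listed above as having shared buffer caches the function would be expected to return 1 with or without the msync() call; a return of 0 would indicate the kind of incoherence described for HP-UX.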
From: Phil on 22 Jun 2008 10:24

guenther(a)gmail.com wrote:

> On Jun 21, 9:46 am, Phil <spam_from_usene...(a)chezphil.org> wrote:
>> According to the Linux man page for msync(), there's a flag
>> MS_INVALIDATE that I can pass to it that "asks to invalidate
>> other mappings of the same file (so that they can be updated
>> with the fresh values just written)". <...> This seems to be
>> suggesting that if I don't set this flag, other mappings of the
>> file (i.e. presumably mappings in other processes) won't see
>> the new values. But that's not what MAP_SHARED is supposed
>> to do, is it?
>
> The Linux description seems 'off' to me. My expectation from the
> SUSv3 description and from experience is that for a MAP_SHARED
> mappings of a normal file, msync(MS_INVALIDATE) is only necessary if
> changes are being made to the file using interfaces other than a
> shared mapping. I.e., if process A uses write() to change the file
> contents, then process B might not see the change in its shared
> mapping until it invalidates the 'cached contents' using msync().
> Changes made to the file using other shared mapping should be visible
> immediately.

Right. Agreed.

I've had a look at the Linux source code (in mm/msync.c) and I believe that MS_INVALIDATE actually does nothing at all. (What's more, it uses the msync address range to determine which mapped files are affected, but then syncs the *whole* of each of those files!)

Now I have the further challenge of working out how much of this still works if the mmap()ed files are on NFS (or CIFS, or whatever). I think that msync() is necessary to ensure that the changes have reached the server; then, you have to wait for maybe a 30 second timeout before another client will re-validate and discover the changes; but unfortunately it doesn't know about the address range that changed, so it invalidates all of its pages for that file. Has anyone ever tried this sort of thing?

Cheers,

Phil.
From: phil-news-nospam on 26 Jun 2008 13:41

On Sat, 21 Jun 2008 14:55:27 -0700 (PDT) guenther(a)gmail.com <guenther(a)gmail.com> wrote:

| If my memory serves, the real exception was HP-UX on PA-RISC, where
| there was NO SUPPORTED METHOD to make a change visible via write()
| visible to a shared mapping. Completely unmapping the file and
| remapping it could leave you with old data in the mapping! There were
| other horrible restrictions on shared mappings (they had to have the
| same VM address *in all processes*) that made using mmap() there
| completely unworkable for files that needed to grow while being
| shared.

My understanding of these limitations is that it is a hardware issue with regard to cache line size being larger than a page size. It would be possible to map a given page at a different virtual memory address, whether in the same process or a different one, provided that its offset relative to the cache line size was the same. Linux solved this for ARM architecture (cache line size 16K, page size 4K) by means of enforcements in mmap() that would refuse to do the mapping if the offset was wrong for a requested address. The OS on the HP-UX and PA-RISC architectures may have elected to not do this at all. I believe the cache line size for HP-UX is 1M.

I'd have to go find old notes, but there is a macro symbol for Linux that tells an application what the mapping alignment base is, which would be the larger of the page size and the cache line size. I encountered this because glibc and uClibc both at one time had this value set wrong for ARM while Linux had it set right and was enforcing it. I only encountered it because of the double mapping in my VRB library ( http://vrb.slashusr.org/ ). I have not yet tested this at all on HP-UX or PA-RISC (anyone have a portable emulator for these?).

--
|WARNING: Due to extreme spam, googlegroups.com is blocked. Due to ignorance |
| by the abuse department, bellsouth.net is blocked. If you post to |
| Usenet from these places, find another Usenet provider ASAP. |
| Phil Howard KA9WGN (email for humans: first name in lower case at ipal.net) |