From: Noob on 11 Jan 2010 06:04

Hello everyone,

I'm working on a digital television receiver which can record a transport stream to an external USB hard disk drive.

Whenever I write to the HDD, whether a few bytes or an entire program, if I don't call sync() before I pull the USB plug out, the OS needs to run fsck when I re-plug the drive.

(NB: filesystem is FAT32, OS is OS21/OS+)

I'm aware that the OS must be performing some form of caching, and marks the drive as dirty once I write something to it.

The documentation states:

"""
File system state

Many subordinate file systems perform caching to improve performance and therefore need to track whether the on-disk structure is in a consistent state. When the volume is first mounted, with vfs_mount(), it is clean. Any partial writes or unwritten cached data means that it may be dirty or inconsistent until the cache is flushed. The vfs_sync() and vfs_sync_all() functions flush the cache and mark the volume as clean. When the volume is unmounted, using vfs_umount(), the cache is flushed and the volume is marked as clean.

Many file systems record the current state in the on-disk structures and this value is one of the things checked by the vfs_mount() function. If the volume is dirty the vfs_mount() fails with errno set to EAGAIN to signify the volume needs vfs_fsck() to be run to restore it to a consistent state.
"""

If I'm streaming large (770 kB) blocks to the disk, I don't care about caching, as I'm not going to read these data for a long time.

I would need some way to turn caching off, which means every write should commit to the disk. Would this be equivalent to calling sync after every write?

When I'm done writing my file, I do call fflush and close on the file descriptor (the fflush should be redundant) but that does not seem sufficient to convince the OS to flush the file's data and metadata to the disk. (Perhaps the OS keeps a copy of the FAT in memory, and only commits that on sync?)

What happens if two threads call sync "at the same time"? Is sync supposed to be reentrant? Or thread-safe? (I suppose it will depend on the OS?)

Regards.
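For readers following the fflush/close question above, here is a minimal sketch in POSIX terms, not OS21/OS+ code. In POSIX, fflush() only hands the stdio buffer to the OS, and fsync() is the call that asks the OS to commit a file's cached data to the device. The helper name write_and_commit is hypothetical, and whether the OS21/OS+ vfs layer offers a per-file equivalent (the quoted documentation only mentions vfs_sync() and vfs_sync_all()) is an assumption that has to be checked against its manual.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path, then push them through
 * both cache layers before closing.
 * Returns 0 on success, -1 if any step failed. */
int write_and_commit(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    int ok = (fwrite(buf, 1, len, fp) == len);

    /* Step 1: move data from the stdio buffer into the OS cache. */
    if (ok && fflush(fp) != 0)
        ok = 0;

    /* Step 2: ask the OS to commit its cached data for this file. */
    if (ok && fsync(fileno(fp)) != 0)
        ok = 0;

    /* Always close, even after an earlier failure. */
    if (fclose(fp) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A call such as write_and_commit("/mnt/usb/rec.ts", block, block_len) (path and variable names made up for illustration) would return only after the fsync() step has completed or failed.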
From: Boudewijn Dijkstra on 11 Jan 2010 07:52

On Mon, 11 Jan 2010 12:04:40 +0100, Noob <root(a)127.0.0.1> wrote:

> Hello everyone,
>
> I'm working on a digital television receiver which can record a
> transport stream to an external USB hard disk drive.
>
> Whenever I write to the HDD, whether a few bytes or an entire program,
> if I don't call sync() before I pull the USB plug out, when I re-plug
> the drive, the OS needs to run fsck.
>
> (NB : filesystem is FAT32, OS is OS21/OS+)
>
> I'm aware that the OS must be performing some form of caching, and marks
> the drive as dirty once I write something to the drive.
>
> [...]
>
> If I'm streaming large (770 kB) blocks to the disk, I don't care about
> caching, as I'm not going to read these data for a long time.

Read caching and write caching are two different uses.

> I would need some way to turn caching off, which means every write
> should commit to the disk. Would this be equivalent to calling sync
> after every write?

Or use open() with O_SYNC.

> When I'm done writing my file, I do call fflush and close on the file
> descriptor (the fflush should be redundant) but that does not seem
> sufficient to convince the OS to flush the file's data and metadata to
> the disk. (Perhaps the OS keeps a copy of the FAT in memory, and only
> commits that on sync?)

Your documentation should state which calls affect data only and which affect also metadata.

> What happens if two threads call sync "at the same time"?
> Is sync supposed to be reentrant? or thread-safe?
> (I suppose it will depend on the OS?)

POSIX calls should be reentrant/thread-safe unless specified otherwise.

--
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
(remove the obvious prefix to reply by mail)
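The O_SYNC suggestion can be sketched as follows on a POSIX system. This is an illustration only: the function name write_synchronously is hypothetical, and nothing here is an OS21/OS+ call. With O_SYNC, each write() returns only after the data (and the metadata needed to retrieve it) has been transferred to the device, so no separate sync step is needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes from buf to path with O_SYNC set,
 * so every write() is committed before it returns.
 * Returns 0 on success, -1 on error. */
int write_synchronously(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            int saved = errno;     /* keep the write() error code */
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;                    /* short write: advance and loop */
        len -= (size_t)n;
    }
    return close(fd);
}
```

The trade-off is throughput: every write() now waits for the device, which matters for a streaming recorder.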
From: Didi on 11 Jan 2010 08:28

On Jan 11, 1:04 pm, Noob <r...(a)127.0.0.1> wrote:

> ...
> If I'm streaming large (770 kB) blocks to the disk, I don't care about
> caching, as I'm not going to read these data for a long time.
>
> I would need some way to turn caching off, which means every write
> should commit to the disk. Would this be equivalent to calling sync
> after every write?

Assuming kB is a typo and you mean MB - 770 kB is small nowadays - I am still not sure you will be better off if you turn the cache off. Depending on the OS, this may result in more disk overhead than you are happy with - e.g. if the OS updates the directory entry every time you write to the file (last modified); it can get even worse if it decides to do that recursively for the directory entries of the entire path (I don't know how many OSs would do the latter; I know DPS would do it if the directories are explicitly of that type, which I don't think I ever really used after I implemented it.... :-) ).

> What happens if two threads call sync "at the same time"?
> Is sync supposed to be reentrant? or thread-safe?
> (I suppose it will depend on the OS?)

Well this will depend on the OS but one which works won't have a problem with it :-). The process which first gains access will do the "sync", the second one will find it has nothing to do etc.

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments
http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Original message: http://groups.google.com/group/comp.arch.embedded/msg/b651aba83b7587d9?dmode=source
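The behaviour described in the last paragraph (the first caller does the sync, the second finds nothing to do) can be sketched with a mutex and a dirty flag. This is a guess at the general pattern, not the actual OS21/OS+ implementation. All names (mark_dirty, sync_once, flush_cache, flush_count) are hypothetical, and flush_cache is a stand-in that only counts calls instead of touching a disk.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
static bool cache_dirty = false;   /* protected by sync_lock */
static int flush_count = 0;        /* how often the stand-in flush ran */

/* Stand-in for the real cache flush; only ever called while dirty. */
static void flush_cache(void)
{
    assert(cache_dirty);
    flush_count++;
}

/* Called by writers after they put data into the cache. */
void mark_dirty(void)
{
    pthread_mutex_lock(&sync_lock);
    cache_dirty = true;
    pthread_mutex_unlock(&sync_lock);
}

/* Safe to call from several threads at once. The thread that gets the
 * lock first flushes; a later caller sees a clean cache and returns.
 * Returns true if this call performed the flush. */
bool sync_once(void)
{
    bool flushed = false;

    pthread_mutex_lock(&sync_lock);
    if (cache_dirty) {
        flush_cache();
        cache_dirty = false;
        flushed = true;
    }
    pthread_mutex_unlock(&sync_lock);
    return flushed;
}
```

Holding the lock during the flush means a second caller blocks until the first flush is finished, so when it returns the volume really is clean.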
From: Noob on 11 Jan 2010 08:36

Boudewijn Dijkstra wrote:

> Noob wrote:
>
>> Hello everyone,
>>
>> I'm working on a digital television receiver which can record a
>> transport stream to an external USB hard disk drive.
>>
>> Whenever I write to the HDD, whether a few bytes or an entire program,
>> if I don't call sync() before I pull the USB plug out, when I re-plug
>> the drive, the OS needs to run fsck.
>>
>> (NB : filesystem is FAT32, OS is OS21/OS+)
>>
>> I'm aware that the OS must be performing some form of caching, and
>> marks the drive as dirty once I write something to the drive.
>>
>> [...]
>>
>> If I'm streaming large (770 kB) blocks to the disk, I don't care about
>> caching, as I'm not going to read these data for a long time.
>
> Read caching and write caching are two different uses.

You're right. I had write-caching in mind.

>> I would need some way to turn caching off, which means every write
>> should commit to the disk. Would this be equivalent to calling sync
>> after every write?
>
> Or use open() with O_SYNC.

Unfortunately, vfs_open does not seem to accept an O_SYNC flag.

>> When I'm done writing my file, I do call fflush and close on the file
>> descriptor (the fflush should be redundant) but that does not seem
>> sufficient to convince the OS to flush the file's data and metadata to
>> the disk. (Perhaps the OS keeps a copy of the FAT in memory, and only
>> commits that on sync?)
>
> Your documentation should state which calls affect data only and which
> affect also metadata.

I would assume that most writes require an update of the FAT, which one might consider metadata?

>> What happens if two threads call sync "at the same time"?
>> Is sync supposed to be reentrant? or thread-safe?
>> (I suppose it will depend on the OS?)
>
> POSIX calls should be reentrant/thread-safe unless specified otherwise.

I don't think OS21/OS+ claims POSIX conformance. I will have to check.

Regards.
From: Boudewijn Dijkstra on 13 Jan 2010 06:18

On Mon, 11 Jan 2010 14:36:43 +0100, Noob <root(a)127.0.0.1> wrote:

> Boudewijn Dijkstra wrote:
>> Noob wrote:
>>
> [...]
>
>>> When I'm done writing my file, I do call fflush and close on the file
>>> descriptor (the fflush should be redundant) but that does not seem
>>> sufficient to convince the OS to flush the file's data and metadata to
>>> the disk. (Perhaps the OS keeps a copy of the FAT in memory, and only
>>> commits that on sync?)
>>
>> Your documentation should state which calls affect data only and which
>> affect also metadata.
>
> I would assume that most writes require an update of the FAT, which one
> might consider metadata?

Yes, sorry: most writes require a metadata update at some point, but there are only a few ways of forcing the system to do it immediately.

>>> What happens if two threads call sync "at the same time"?
>>> Is sync supposed to be reentrant? or thread-safe?
>>> (I suppose it will depend on the OS?)
>>
>> POSIX calls should be reentrant/thread-safe unless specified otherwise.
>
> I don't think OS21/OS+ claims POSIX conformance. I will have to check.

In fact, all OS calls should be reentrant/thread-safe unless specified otherwise.