From: Noob on
Hello everyone,

I'm working on a digital television receiver which can record a
transport stream to an external USB hard disk drive.

Whenever I write to the HDD, whether a few bytes or an entire program,
if I don't call sync() before I pull the USB plug out, when I re-plug
the drive, the OS needs to run fsck.

(NB: the filesystem is FAT32, the OS is OS21/OS+)

I'm aware that the OS must be performing some form of caching, and marks
the drive as dirty once I write something to the drive.

The documentation states:

"""
File system state

Many subordinate file systems perform caching to improve performance and
therefore need to track whether the on-disk structure is in a consistent
state.

When the volume is first mounted, with vfs_mount(), it is clean. Any
partial writes or unwritten cached data means that it may be dirty or
inconsistent until the cache is flushed. The vfs_sync() and
vfs_sync_all() functions flush the cache and mark the volume as clean.
When the volume is unmounted, using vfs_umount(), the cache is flushed
and the volume is marked as clean.

Many file systems record the current state in the on-disk structures and
this value is one of the things checked by the vfs_mount() function. If
the volume is dirty the vfs_mount() fails with errno set to EAGAIN to
signify the volume needs vfs_fsck() to be run to restore it to a
consistent state.
"""

If I'm streaming large (770 kB) blocks to the disk, I don't care about
caching, as I'm not going to read these data for a long time.

I would need some way to turn caching off, which means every write
should commit to the disk. Would this be equivalent to calling sync
after every write?

When I'm done writing my file, I do call fflush and close on the file
descriptor (the fflush should be redundant) but that does not seem
sufficient to convince the OS to flush the file's data and metadata to
the disk. (Perhaps the OS keeps a copy of the FAT in memory, and only
commits that on sync?)

What happens if two threads call sync "at the same time"?
Is sync supposed to be reentrant? or thread-safe?
(I suppose it will depend on the OS?)

Regards.
From: Boudewijn Dijkstra on
On Mon, 11 Jan 2010 12:04:40 +0100, Noob <root(a)127.0.0.1> wrote:
> Hello everyone,
>
> I'm working on a digital television receiver which can record a
> transport stream to an external USB hard disk drive.
>
> Whenever I write to the HDD, whether a few bytes or an entire program,
> if I don't call sync() before I pull the USB plug out, when I re-plug
> the drive, the OS needs to run fsck.
>
> (NB : filesystem is FAT32, OS is OS21/OS+)
>
> I'm aware that the OS must be performing some form of caching, and marks
> the drive as dirty once I write something to the drive.
>
> [...]
>
> If I'm streaming large (770 kB) blocks to the disk, I don't care about
> caching, as I'm not going to read these data for a long time.

Read caching and write caching are two different uses.

> I would need some way to turn caching off, which means every write
> should commit to the disk. Would this be equivalent to calling sync
> after every write?

Or use open() with O_SYNC.
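
As a rough illustration of what that looks like on a POSIX system (an
assumption here: OS21/OS+ may not provide open() or O_SYNC at all, and
the helper name open_synchronous is made up for this sketch):

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) a file for writing with O_SYNC,
 * so that each write() returns only after the data has been handed to
 * the storage device. Returns a file descriptor, or -1 on error.
 * Assumes a POSIX environment; not an OS21/OS+ API. */
int open_synchronous(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
}
```

Every write() on the returned descriptor then behaves roughly like a
write followed by a flush of that file, at the cost of throughput.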

> When I'm done writing my file, I do call fflush and close on the file
> descriptor (the fflush should be redundant) but that does not seem
> sufficient to convince the OS to flush the file's data and metadata to
> the disk. (Perhaps the OS keeps a copy of the FAT in memory, and only
> commits that on sync?)

Your documentation should state which calls affect data only and which
also affect metadata.

> What happens if two threads call sync "at the same time"?
> Is sync supposed to be reentrant? or thread-safe?
> (I suppose it will depend on the OS?)

POSIX calls should be reentrant/thread-safe unless specified otherwise.


--
Made with Opera's revolutionary e-mail program:
http://www.opera.com/mail/
(remove the obvious prefix to reply by mail)
From: Didi on
On Jan 11, 1:04 pm, Noob <r...(a)127.0.0.1> wrote:
> ...
> If I'm streaming large (770 kB) blocks to the disk, I don't care about
> caching, as I'm not going to read these data for a long time.
>
> I would need some way to turn caching off, which means every write
> should commit to the disk. Would this be equivalent to calling sync
> after every write?

Assuming kB is a typo and you mean MB - 770 kB is small nowadays - I
am still not sure you will be better off if you turn the cache off.
Depending on the OS, this may result in more disk overhead than you
are happy with - e.g. if the OS updates the directory entry every
time you write to the file (last modified); it can get even worse
if it decides to do that recursively for the directory entries of
the entire path (I don't know how many OSs would do the latter; I
know DPS would do it if the directories are explicitly of that type,
which I don't think I ever really used after I implemented it.... :-) ).

> What happens if two threads call sync "at the same time"?
> Is sync supposed to be reentrant? or thread-safe?
> (I suppose it will depend on the OS?)

Well, this will depend on the OS, but one which works won't have
a problem with it :-). The process which first gains access will
do the "sync"; the second one will find it has nothing to do.
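
A minimal sketch of that behaviour, assuming POSIX threads are
available. The names mark_dirty, sync_if_dirty and flush_cache are
hypothetical; flush_cache is only a placeholder that counts calls, and
on a real system it would wrap whatever flush call the OS offers
(sync(), or presumably vfs_sync_all() on OS21/OS+).

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
static bool cache_dirty = false;   /* protected by sync_lock */
static int flush_count = 0;        /* protected by sync_lock */

/* Placeholder for the real flush call (an assumption for this sketch). */
static void flush_cache(void)
{
    flush_count++;
}

/* Record that there is unwritten data in the cache. */
void mark_dirty(void)
{
    pthread_mutex_lock(&sync_lock);
    cache_dirty = true;
    pthread_mutex_unlock(&sync_lock);
}

/* Flush if anything is pending. Returns true if this caller performed
 * the flush, false if it found nothing left to do. */
bool sync_if_dirty(void)
{
    bool flushed = false;

    pthread_mutex_lock(&sync_lock);
    if (cache_dirty) {
        flush_cache();
        cache_dirty = false;
        flushed = true;
    }
    pthread_mutex_unlock(&sync_lock);
    return flushed;
}
```

Two threads calling sync_if_dirty() at the same time are serialized by
the mutex: whichever gets the lock first flushes, and the other sees a
clean cache and returns immediately.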

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Original message: http://groups.google.com/group/comp.arch.embedded/msg/b651aba83b7587d9?dmode=source


From: Noob on
Boudewijn Dijkstra wrote:

> Noob wrote:
>
>> Hello everyone,
>>
>> I'm working on a digital television receiver which can record a
>> transport stream to an external USB hard disk drive.
>>
>> Whenever I write to the HDD, whether a few bytes or an entire program,
>> if I don't call sync() before I pull the USB plug out, when I re-plug
>> the drive, the OS needs to run fsck.
>>
>> (NB : filesystem is FAT32, OS is OS21/OS+)
>>
>> I'm aware that the OS must be performing some form of caching, and
>> marks the drive as dirty once I write something to the drive.
>>
>> [...]
>>
>> If I'm streaming large (770 kB) blocks to the disk, I don't care about
>> caching, as I'm not going to read these data for a long time.
>
> Read caching and write caching are two different uses.

You're right. I had write-caching in mind.

>> I would need some way to turn caching off, which means every write
>> should commit to the disk. Would this be equivalent to calling sync
>> after every write?
>
> Or use open() with O_SYNC.

Unfortunately, vfs_open does not seem to accept an O_SYNC flag.

>> When I'm done writing my file, I do call fflush and close on the file
>> descriptor (the fflush should be redundant) but that does not seem
>> sufficient to convince the OS to flush the file's data and metadata to
>> the disk. (Perhaps the OS keeps a copy of the FAT in memory, and only
>> commits that on sync?)
>
> Your documentation should state which calls affect data only and which
> also affect metadata.

I would assume that most writes require an update of the FAT, which one
might consider metadata?

>> What happens if two threads call sync "at the same time"?
>> Is sync supposed to be reentrant? or thread-safe?
>> (I suppose it will depend on the OS?)
>
> POSIX calls should be reentrant/thread-safe unless specified otherwise.

I don't think OS21/OS+ claims POSIX conformance. I will have to check.

Regards.
From: Boudewijn Dijkstra on
On Mon, 11 Jan 2010 14:36:43 +0100, Noob <root(a)127.0.0.1> wrote:
> Boudewijn Dijkstra wrote:
>> Noob wrote:
>>
> [...]
>
>>> When I'm done writing my file, I do call fflush and close on the file
>>> descriptor (the fflush should be redundant) but that does not seem
>>> sufficient to convince the OS to flush the file's data and metadata to
>>> the disk. (Perhaps the OS keeps a copy of the FAT in memory, and only
>>> commits that on sync?)
>>
>> Your documentation should state which calls affect data only and which
>> also affect metadata.
>
> I would assume that most writes require an update of the FAT, which one
> might consider metadata?

Yes sorry, most writes require a metadata update at some point, but there
are only a few ways of forcing the system to do it immediately.
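
On a POSIX system, one of those ways is fsync() on the underlying
descriptor. A sketch, assuming stdio sits on top of POSIX file
descriptors (which may not hold on OS21/OS+); the helper name
flush_and_close is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: push a stdio stream through both buffering
 * layers, then close it. Returns 0 on success, -1 if any step failed. */
int flush_and_close(FILE *fp)
{
    int rc = 0;

    if (fflush(fp) != 0)           /* stdio buffer -> OS cache */
        rc = -1;
    if (fsync(fileno(fp)) != 0)    /* OS cache -> storage device */
        rc = -1;
    if (fclose(fp) != 0)           /* release the stream */
        rc = -1;
    return rc;
}
```

fflush() alone only empties the C library's buffer; it is the fsync()
step that asks the OS to commit the file to the device.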

>>> What happens if two threads call sync "at the same time"?
>>> Is sync supposed to be reentrant? or thread-safe?
>>> (I suppose it will depend on the OS?)
>>
>> POSIX calls should be reentrant/thread-safe unless specified otherwise.
>
> I don't think OS21/OS+ claims POSIX conformance. I will have to check.

In fact, all OS calls should be reentrant/thread-safe unless specified
otherwise.

