USB drives, caching and sync [Embedded]

From: Boudewijn Dijkstra on 15 Jan 2010 05:12

On Fri, 15 Jan 2010 10:42:06 +0100, Noob <root(a)127.0.0.1> wrote:
> Boudewijn Dijkstra wrote:
>> Noob wrote:
>>
>>> If the second thread is "in the middle" of a write, the OS would queue
>>> the flush, so that everything happens as expected, right?
>>
>> Either the OS has received the data to be written, or it hasn't. When it
>> hasn't, a sync cannot affect it.
>
> Consider a scenario where thread A needs to write 100 MB to the hard
> disk drive i.e. thread A calls write(fd, buf, 100*1000*1000);
>
> When the OS has "processed" approximately half the data, another thread
> calls sync.
>
> Would the sync operation be queued and carried out only after the write
> completes?
>
> Or would the OS perform the sync operation "immediately", flushing data
> and metadata cached up to this point? In that case, when the write
> completes, the first half of the data would be guaranteed to be written
> to disk, while parts of the second half may still be cached.
>
> (I think the second scenario is more plausible.)

The answer depends on your system. Often the only guarantee is that the
flush operation has been scheduled by the time sync() returns, not that
it has completed.
http://opengroup.org/onlinepubs/007908799/xsh/sync.html

> In other words, I think I'm asking: is there some form of atomicity of
> file system calls?

Often there is, but the granularity of the 'atoms' depends on the FS
implementation.

--
Made with Opera's revolutionary e-mail program:
http://www.opera.com/mail/
(remove the obvious prefix to reply by mail)

From: Paul Keinanen on 15 Jan 2010 14:27

On Fri, 15 Jan 2010 08:44:33 +0000, Nobody <nobody(a)nowhere.com> wrote:

>On Thu, 14 Jan 2010 17:57:22 +0200, Paul Keinanen wrote:
>
>> On a single processor system, how could two threads call sync
>> simultaneously ?
>>
>> Unless the system call is explicitly made preemptable, assuming
>> user/kernel mode switching during the call, this could not happen.
>
>It would be a pretty lousy OS if sync() wasn't pre-emptible, given
>that it has to be one of the slowest system calls there is. Think about
>it: sync() spends most of its time blocked waiting for the disk to
>indicate either that it's ready for more data or has completed outstanding
>writes.

The only use for sync() I have seen is to write out any buffers
prior to shutting down the entire system or disconnecting removable
media.

In such a situation, it is really preferable to have a blocking call.
Recent Linux versions use blocking calls, while on at least some older
HP-UX versions sync() returned before everything had been written out
(hence the instruction to enter the sync command twice manually before
powering down, so that the first sync completed while the operator
was typing the second :-).

Does OS21 have fsync() to write out only a single file? Of
course, writing to the file and fsyncing should be done from the same
thread.
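
[Editor's sketch] The per-file approach suggested above might look like
this on a POSIX system. It is a minimal sketch under that assumption, not
OS21 code: save_file_durably is a hypothetical helper name, and whether
OS21 offers an equivalent of fsync() is exactly the open question here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path, then ask the OS to push
 * this one file to the device before returning.
 * Returns 0 on success, -1 on error (errno comes from the failing call). */
int save_file_durably(const char *path, const void *buf, size_t len)
{
    int saved;
    assert(path != NULL && (buf != NULL || len == 0));

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);  /* may accept fewer bytes than asked */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted before writing: retry */
            goto fail;
        }
        p += n;
        len -= (size_t)n;
    }

    if (fsync(fd) < 0)                  /* flushes this file, not the whole system */
        goto fail;
    return close(fd);

fail:
    saved = errno;                      /* keep the original error across close() */
    close(fd);
    errno = saved;
    return -1;
}
```

Writing and fsync()ing from the same thread, as recommended above, means
the data has already been handed to the OS by the time the flush is
requested.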

From: Nobody on 16 Jan 2010 05:04

On Fri, 15 Jan 2010 10:42:06 +0100, Noob wrote:

>>> If the second thread is "in the middle" of a write, the OS would queue
>>> the flush, so that everything happens as expected, right?
>>
>> Either the OS has received the data to be written, or it hasn't. When it
>> hasn't, a sync cannot affect it.
>
> Consider a scenario where thread A needs to write 100 MB to the hard
> disk drive i.e. thread A calls write(fd, buf, 100*1000*1000);
>
> When the OS has "processed" approximately half the data, another thread
> calls sync.
>
> Would the sync operation be queued and carried out only after the write
> completes?

Maybe, maybe not. The OS can switch between threads as it sees fit. If a
write() takes a long time, it may suspend the thread and schedule another
thread, or it may not. If the other thread calls sync(), it may start
it immediately, or it may decide that's a good point to suspend the thread.

> Or would the OS perform the sync operation "immediately", flushing data
> and metadata cached up to this point? In that case, when the write
> completes, the first half of the data would be guaranteed to be written
> to disk, while parts of the second half may still be cached.
>
> (I think the second scenario is more plausible.)
>
> In other words, I think I'm asking: is there some form of atomicity of
> file system calls?

I don't know about OS21/OS+, but POSIX provides very few atomicity
guarantees (and those which it does provide may not always be honoured;
it used to be a standing joke that "NFS" stood for "Not a File System"
because of its failure to provide even minimal atomicity).
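
[Editor's sketch] One of the few guarantees POSIX does give is that
rename() replaces an existing name atomically: other threads see either
the old file or the new one under that name, never a missing or
half-written one. A minimal sketch of using that, assuming a POSIX
system; replace_file_atomically and both path parameters are hypothetical
names, and as noted above a network filesystem may not honour the
guarantee.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: build the new contents in tmp_path, flush them,
 * then swap the file into place with rename(). tmp_path must be on the
 * same filesystem as path. Returns 0 on success, -1 on error. */
int replace_file_atomically(const char *path, const char *tmp_path,
                            const void *buf, size_t len)
{
    assert(path != NULL && tmp_path != NULL);

    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* For brevity a short write is treated as a failure here; a fuller
     * version would loop until all bytes are accepted. */
    if (write(fd, buf, len) != (ssize_t)len || fsync(fd) < 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) < 0) {
        unlink(tmp_path);
        return -1;
    }

    /* The atomic step: path now names either the old or the new file. */
    if (rename(tmp_path, path) < 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

This covers visibility of the name only. Making the rename itself survive
a power cut usually also requires flushing the containing directory, which
is left out of the sketch.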

From: Nobody on 16 Jan 2010 05:47

On Fri, 15 Jan 2010 21:27:56 +0200, Paul Keinanen wrote:

> The only usage for sync() I have seen is to write out any buffers
> prior to shutting down the entire system or disconnect a removable
> media.

Nowadays, that's normally handled by umount(). With a sync(), nothing
prevents the device from being modified the moment the sync() completes.
With umount(), the call will fail if the device is still in use; as soon
as the umount() commences, further operations are excluded.

[With the root filesystem, it's more complex, as it's always "in use".
Typically, it's remounted read-only once everything except init has been
killed.]

> In such a situation, it is really preferable to have a blocking call.
> Recent Linux versions use blocking calls, while at least some older
> HP-UX versions the sync() returned before writing out everything (and
> hence the instruction of entering twice the sync command manually
> before powering down, so that the first sync completed, while the
> operator entered the second sync :-).

In general, a "sync" command (as opposed to the sync() system call) isn't
guaranteed to fully synchronise the filesystem, as its execution may
result in the last-access time of the executable (or a shared library on
which it depends) being updated after the sync() call completes.

From: Noob on 18 Jan 2010 07:38

Paul Keinanen wrote:

> On Fri, 15 Jan 2010 08:44:33 +0000, Nobody wrote:
>
>> On Thu, 14 Jan 2010 17:57:22 +0200, Paul Keinanen wrote:
>>
>>> On a single processor system, how could two threads call sync
>>> simultaneously ?
>>>
>>> Unless the system call is explicitly made preemptable, assuming
>>> user/kernel mode switching during the call, this could not happen.
>>
>> It would be a pretty lousy OS if sync() wasn't pre-emptible, given
>> that it has to be one of the slowest system calls there is. Think about
>> it: sync() spends most of its time blocked waiting for the disk to
>> indicate either that it's ready for more data or has completed outstanding
>> writes.
>
> The only usage for sync() I have seen is to write out any buffers
> prior to shutting down the entire system or disconnect a removable
> media.

In the case of a USB HDD, I must call sync every time I close a file I've
written to, because I don't know when the user will unplug the disk.

> In such a situation, it is really preferable to have a blocking call.
> Recent Linux versions use blocking calls, while at least some older
> HP-UX versions the sync() returned before writing out everything (and
> hence the instruction of entering twice the sync command manually
> before powering down, so that the first sync completed, while the
> operator entered the second sync :-).
>
> Does the OS21 have fsync() to write out only a single file ?

There is a vfs_fflush, but I think vfs_close already calls it.

My problem is that, if I don't call vfs_sync after closing a file,
the disk is still considered "dirty", and the OS forces an fsck when
it sees the disk again.

> Of course, writing to the file and fsyncing should be done from the same
> thread.

OK, but many threads may be writing to the disk "at the same time"
i.e. concurrently, whereas vfs_sync is supposed to flush everything
to disk (even data and metadata from other threads).

Regards.
