From: Tarjei T. Jensen on 25 Sep 2006 16:36

Niels Jørgen Kruse wrote:
> File formats are usually compressed already, and you need to know the
> kind of content to get the best compression.

Sorry, they are not yet compressed. That does not mean we should not
prepare the file system to handle it. We'll have to see whether future
word processing and spreadsheet formats are compressed well enough
natively.

greetings,
From: Terje Mathisen on 25 Sep 2006 16:43

Niels Jørgen Kruse wrote:
> Terje Mathisen <terje.mathisen(a)hda.hydro.com> wrote:
>> No, because you need _huge_ lengths to avoid the problem where an area
>> of the disk is going bad. This really interferes with random access
>> benchmark numbers. :-(
>
> Whether you do it in the file system or in the drive firmware, benchmark
> numbers will suck equally, don't you think?

User-level benchmarks: Sure!

Manufacturer's 'Sustained IO rates' numbers: Big difference.

I'm cynical enough to believe that this is enough to not implement it. :-(

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
From: Terje Mathisen on 25 Sep 2006 16:48

Tarjei T. Jensen wrote:
> Niels Jørgen Kruse wrote:
>> File formats are usually compressed already, and you need to know the
>> kind of content to get the best compression.
>
> Sorry, they are not yet compressed. That does not mean we should not
> prepare the file system to handle it.
>
> We'll have to see whether future word processing and spreadsheet formats
> are compressed well enough natively.

I'd be quite happy if just one single app would stop storing all its
data pessimally:

Microsoft PowerPoint.

Any JPEG image you include in a PPT presentation will be decompressed
into a 32-bit RGB bitmap and stored that way in the file.

This holds even if you resize the source image to a small thumbnail in
your presentation.

This one app is responsible for 10-20% of _all_ file space on most of
our file server volumes. :-(

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
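[Editor's note: to put rough numbers on that blow-up, here is a tiny C
sketch. The 3000x2000 image dimensions and the ~1.5 MB JPEG size are
made-up but plausible figures for illustration, not taken from the post.]

    #include <stdio.h>

    int main(void) {
        /* Hypothetical example: a 3000x2000 photo stored as a ~1.5 MB JPEG,
           versus the same photo expanded to a 32-bit RGB bitmap. */
        long w = 3000, h = 2000;
        long jpeg_bytes = 1500L * 1024;   /* assumed compressed size */
        long bmp_bytes  = w * h * 4;      /* 4 bytes per pixel, uncompressed */

        printf("JPEG:   %ld KB\n", jpeg_bytes / 1024);
        printf("Bitmap: %ld KB (about %.0fx larger)\n",
               bmp_bytes / 1024, (double)bmp_bytes / jpeg_bytes);
        return 0;
    }

With these assumed numbers the bitmap is roughly 23 MB, about 15x the
JPEG, and resizing to a thumbnail in the presentation saves nothing.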
From: Bill Todd on 25 Sep 2006 17:39

Eric P. wrote:
> Bill Todd wrote:
>> That's a start, anyway: I'd be interested to hear what others think
>
> - DeleteOnClose file attribute. Also gets deleted during crash recovery.
>
> - Anonymous temporary files: Temporary files do not require a
>   file name. Anonymous files are automatically marked DeleteOnClose.

Ah - you can always spot a VMS user. As I noted elsewhere, I was trying
to focus at a somewhat higher level, but specific features like these
are also of interest to me.

> - EraseOnDelete to overwrite file contents. This is best done in the
>   file system, particularly wrt compressed files whose actual physical
>   footprint is data dependent and therefore hard to overwrite [*].
>   A simple overwrite won't keep out a national security service,
>   but could help keep average prying eyes out of financial,
>   corporate and personal data.

I've lost my enthusiasm for trying to erase data: it's just too
expensive to do properly - especially if the file system may move data
around a lot (see WAFL, ZFS, or any log-structured file system for
examples of why this might be useful), in which case it would have to
perform multi-pass secure erases every time data moved, not just on
delete.

Given how inexpensive processing power is these days, using secure
encryption to protect data seems much more sensible to me - and it
protects the data while it's still live, as well. Anyone *really*
concerned about security after data is no longer in use tends to do
nasty, physical things to the drives it was stored on anyway.

> - [*] For disk file systems at least, I think the time for compressed
>   files has passed and they are a waste of development time these days.

The larger the file, the less I'm inclined to agree with you.

> - Multi-disk volumes are also a feature whose time I think has passed
>   and would waste development time, but I mention them anyway.

All contemporary forms of disk arrays certainly qualify as 'multi-disk
volumes', but perhaps you're referring to VMS's rather antiquated
concept of a 'volume set' - disks concatenated much like tapes to form
a single (e.g., backup) volume - in which case I agree.

> - I haven't thought of any real use for sparse files yet, since
>   databases do their own internal management anyway, so I might
>   consider also classifying this as a waste of development time.

I would not: while they may seldom provide an optimal solution for a
problem, they often provide a simple one that performs well enough to
be useful.

> - Automatic extend/contract of MFT/inode space, without limits and
>   without need to preallocate space or otherwise manually manage it.

That's an implementation detail. I think what you're really saying is
that you don't want the user to have to worry about managing the system.

> - Built-in safe defrag ability, including for directory and MFT files.

Users shouldn't have to worry about that, either.

> - SoftLinks (VMS logical names) and HardLinks.

VMS logical names offer far more comprehensive mechanisms than soft
links, but really work above the file system level (though very few
people deal directly with the VMS file system but instead go through
RMS, which has an intimate and in some ways squalid relationship with
logical names).
One way to view the issue is that VMS logical names aren't at all
limited to file system use, hence an equivalent facility cannot by
definition be implemented in a multi-platform file system but must be
implemented in each individual platform as an operating system feature
(else any application dependent upon them could run only on a platform
that included that file system). Another way to understand the problem
is that VMS logical names depend upon the client context in which they
are resolved, whereas file system objects do not (save for access
controls on them).

Incorporating VMS-style logical names into a file system intended to
support heterogeneous operating system environments does not strike me
as desirable: any client system that finds them useful can easily
implement them as a pre-parsing step (as indeed RMS does) before
submitting the resulting path to the file system (and if symbolic links
are handled as 'reparse points' which allow the client to massage the
returned path, then it can apply logical translations there if it
wants to).

> - Async operations for Open, Close and directory lookups as well as
>   Read and Write. Open and Close can require many internal IOs to
>   accomplish and can be very lengthy operations, especially over
>   networks, and that stalls servers and GUIs.

Asynchrony used to be the only mechanism to address such issues on VMS,
but IIRC for well over a decade it has supported kernel threads, which
can do so in a less arcane manner. So while I have a general
inclination to allow asynchrony wherever possible, I don't see your
examples as proving any pressing need for it.

> - Separate the FileSize from the EndOfFile attribute.
>   I always liked VMS's InitialSize, ExtendSize and MaximumSize
>   file attributes for cutting down fragmentation.

Better not to have to worry about such things at all (though initial
allocation may be marginally useful for things like copy operations).

> - File placement attributes (outer edge, inner edge, center, etc.)

See previous comment.

> - I have been pondering the idea of FIFO files that have a
>   FrontOfFile marker as well as an EndOfFile marker.
>   Efficient for store-and-forward interprocess messaging, but
>   I'm not sure it would be useful enough to warrant support.

'Circular' log files are the obvious file-related application, though
the facility may fall out from generic sparse file support (e.g., if
you can deallocate regions of the file).

> - To copy files between two remote systems, send a message from
>   A to B telling it to send to C, or to C telling it to pull from B.
>   Don't read the blocks from B to A and write them out to C.

That's a can of worms that I have no interest in opening, given how
easy it is to submit a remote command-level request to accomplish it.

> - KISS: it would be nice to use the same file system for lots of
>   devices, from hard drives to DVDs to memory sticks to holocrystals.
>   So please don't put a full-blown friggen RDB inside the file system.
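[Editor's note: a minimal, Linux-specific sketch of the 'deallocate
regions of the file' idea - emulating a FrontOfFile marker by punching
holes with fallocate(2). The file name, sizes and the elided producer
side are hypothetical; hole punching also requires file system support
(ext4, XFS, btrfs, etc.).]

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("queue.log", O_RDWR | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* ... producer appends records at EndOfFile via write() ... */

        /* Consumer has processed the first 1 MiB: punch it out.
           Logical file size and record offsets are unchanged, but the
           underlying blocks are freed - a sparse-file FrontOfFile. */
        off_t front = 1 << 20;
        if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                      0, front) < 0)
            perror("fallocate");   /* EOPNOTSUPP if fs lacks hole support */

        close(fd);
        return 0;
    }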
From: Eric P. on 25 Sep 2006 17:53
Terje Mathisen wrote:
>
> Tarjei T. Jensen wrote:
> > Niels Jørgen Kruse wrote:
> >> File formats are usually compressed already, and you need to know the
> >> kind of content to get the best compression.
> >
> > Sorry, they are not yet compressed. That does not mean we should not
> > prepare the file system to handle it.
> >
> > We'll have to see whether future word processing and spreadsheet formats
> > are compressed well enough natively.
>
> I'd be quite happy if just one single app would stop storing all its
> data pessimally:
>
> Microsoft PowerPoint.
>
> Any JPEG image you include in a PPT presentation will be decompressed
> into a 32-bit RGB bitmap and stored that way in the file.
>
> This holds even if you resize the source image to a small thumbnail in
> your presentation.
>
> This one app is responsible for 10-20% of _all_ file space on most of
> our file server volumes. :-(

Ok, well, maybe compression still has a place. Of course you realize
that some people's ability to do dumbass things far exceeds others'
ability to compensate by adding compression. Would the compression
algorithm the file system uses even work well on 32-bit RGB bitmaps?

I have never built a file system, but it seems to me that the problem
with file compression is that a write in the middle of the file will be
recompressed and can cause changes to the file's physical block
mappings and metadata structures. This in turn updates file system
block allocation tables and metadata transaction logs. With normal,
non-compressed files this only happens when the file is extended. With
compressed files every write operation can do this and could bog down
the whole system by hammering these critical common data structures.

It also serializes all file operations while the metadata is being
diddled. Until you read the current data, decompress it, apply the new
data and recompress, you cannot tell whether the compressed buffer will
expand or contract, and what mapping changes are needed. If the file
meta structure is affected, it forces all operations to serialize,
unless you want to go for a concurrent B+tree update mechanism, which
is probably an order of magnitude more complicated.

So if compressed file write operations were limited to append only,
this feature would not add much complexity or concurrency problems.
Otherwise it starts to look nasty.

Eric
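[Editor's note: a small C sketch of the expand/contract problem Eric
describes, using zlib's compress() on a hypothetical 4 KiB chunk; build
with -lz. The chunk size and contents are illustrative assumptions: the
point is that the same logical chunk needs a different amount of
physical space depending on what was written into it.]

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <zlib.h>

    /* Compress one 4 KiB "file chunk" and report its packed size,
       i.e. the physical footprint a compressing file system must map. */
    static unsigned long packed_size(const unsigned char *chunk, uLong n) {
        unsigned char out[8192];        /* > compressBound(4096) */
        uLongf outlen = sizeof out;
        if (compress(out, &outlen, chunk, n) != Z_OK) return 0;
        return outlen;
    }

    int main(void) {
        unsigned char chunk[4096];

        /* "Before": a zero-filled chunk compresses to almost nothing. */
        memset(chunk, 0, sizeof chunk);
        printf("zero chunk:   %lu bytes packed\n",
               packed_size(chunk, sizeof chunk));

        /* "After" a mid-file overwrite with incompressible data: the
           same logical chunk now needs far more physical space, so the
           block mapping and allocation metadata must change. */
        for (size_t i = 0; i < sizeof chunk; i++)
            chunk[i] = (unsigned char)rand();
        printf("random chunk: %lu bytes packed\n",
               packed_size(chunk, sizeof chunk));
        return 0;
    }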