From: Jan Vorbrüggen on 25 Sep 2006 10:41

> Case-blind file systems are a pox, if you've ever had to share code across
> filesystems, and your colleagues insist on saving headers in files with one
> case, but some different case appears in the source of the include
> statement...

Say what? That is just the situation a case-blind system is designed to
handle gracefully!

Jan
From: Jan Vorbrüggen on 25 Sep 2006 10:42

> Yes. There's also annoying things like ligatures and diacritics. And
> perhaps many different codepoints that (more or less) share a glyph.

How are those in any way relevant?

Jan
From: Anne & Lynn Wheeler on 25 Sep 2006 11:52

Bill Todd <billtodd(a)metrocast.net> writes:
> But it is indeed a gray area as soon as one introduces the idea of a
> CopyFile() operation (that clearly needs to include network copying
> to be of general use). The recent introduction of 'bundles'
> ('files' that are actually more like directories in terms of
> containing a hierarchical multitude of parts - considerably richer
> IIRC than IBM's old 'partitioned data sets') as a means of handling
> multi-'fork' and/or attribute-enriched files in a manner that simple
> file systems can at least store (though applications then need to
> understand that form of storage to handle it effectively) may be
> applicable here.

we had somewhat stumbled across file bundles (based on use, not
necessarily file structure organization) in the work that started out
doing traces of all record accesses for i/o cache simulation (circa
1980).

the strict cache simulation work showed that partitioned caches (aka
"local LRU") always had lower performance than a global cache (aka
"global LRU"). for a fixed amount of electronic storage, a single
global system i/o cache always had better thruput than partitioning
the same amount of electronic storage between i/o channels, disk
controllers, and/or individual disks (modulo a track cache for
rotational delay compensation).

further work on the full record access traces started to show up some
amount of repeated patterns that tended to access the same collection
of files. for this collection of data access patterns, rather than
disk arm motion with various kinds of distribution ... there was very
strong bursty locality. this led down the path of maintaining more
detailed information about files and their usage for optimizing
thruput (and layout).

earlier at the science center
http://www.garlic.com/~lynn/subtopic.html#545

we had done detailed page reference traces and cluster analysis in
support of semi-automated program reorganization ...
which was eventually released as the VS/REPACK product. the disk record
i/o traces started down the path of doing something similar for
filesystem organization/optimization.

i had done a backup/archive system that was used internally at a
number of locations. this eventually morphed into a product called
workstation datasave facility and then adsm. it was later renamed tsm
(tivoli storage manager). this now supports bundles/containers for
file storage management (i.e. collections of files that tend to have
bursty locality of reference patterns)
http://www.garlic.com/~lynn/subtopic.html#backup

some number of other backup/archive and/or (hierarchical) storage
management systems now also have similar constructs.

some recent posts that mention that i/o cache simulation work
http://www.garlic.com/~lynn/2006e.html#45 using 3390 mod-9s
http://www.garlic.com/~lynn/2006f.html#0 using 3390 mod-9s
http://www.garlic.com/~lynn/2006f.html#18 how much swap size did you take?
http://www.garlic.com/~lynn/2006i.html#36 virtual memory
http://www.garlic.com/~lynn/2006i.html#41 virtual memory
http://www.garlic.com/~lynn/2006j.html#7 virtual memory
http://www.garlic.com/~lynn/2006j.html#14 virtual memory
http://www.garlic.com/~lynn/2006j.html#27 virtual memory
http://www.garlic.com/~lynn/2006l.html#43 One or two CPUs - the pros & cons
http://www.garlic.com/~lynn/2006o.html#27 oops
http://www.garlic.com/~lynn/2006o.html#68 DASD Response Time (on antique 3390?)
http://www.garlic.com/~lynn/2006p.html#0 DASD Response Time (on antique 3390?)

some recent posts mentioning vs/repack activity
http://www.garlic.com/~lynn/2006b.html#15 {SPAM?} Re: Expanded Storage
http://www.garlic.com/~lynn/2006b.html#23 Seeking Info on XDS Sigma 7 APL
http://www.garlic.com/~lynn/2006e.html#20 About TLB in lower-level caches
http://www.garlic.com/~lynn/2006e.html#46 using 3390 mod-9s
http://www.garlic.com/~lynn/2006i.html#37 virtual memory
http://www.garlic.com/~lynn/2006j.html#18 virtual memory
http://www.garlic.com/~lynn/2006j.html#22 virtual memory
http://www.garlic.com/~lynn/2006j.html#24 virtual memory
http://www.garlic.com/~lynn/2006l.html#11 virtual memory
http://www.garlic.com/~lynn/2006o.html#23 Strobe equivalents
http://www.garlic.com/~lynn/2006o.html#26 Cache-Size vs Performance
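
The global-versus-partitioned cache comparison described in the post above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the trace is invented for illustration (it is not data from the 1980 study), the function names `lru_hits` and `partitioned_hits` are hypothetical, and a single synthetic trace shows one case where a global LRU wins, not that it always does.

```python
from collections import OrderedDict

def lru_hits(trace, capacity):
    """Count cache hits for an LRU cache of the given capacity."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits

def partitioned_hits(trace, capacity_per_device):
    """Give every device its own private LRU cache and sum the hits."""
    per_device = {}
    for device, record in trace:
        per_device.setdefault(device, []).append((device, record))
    return sum(lru_hits(sub, capacity_per_device) for sub in per_device.values())

# Hypothetical trace: disk "A" cycles over 6 records, disk "B" over 2.
one_pass = [("A", r) for r in range(6)] + [("B", r) for r in range(2)]
trace = one_pass * 10

# Same total storage in both cases: 8 slots globally, or 4 slots per disk.
print("global LRU hits:     ", lru_hits(trace, 8))
print("partitioned LRU hits:", partitioned_hits(trace, 4))
```

In this made-up workload the busy disk's working set does not fit in its fixed share, while the idle disk leaves part of its share unused; a global cache lets the busy disk borrow that slack.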
From: Terje Mathisen on 25 Sep 2006 12:06

Jan Vorbrüggen wrote:
>> BT> Out of curiosity, does anyone know of a good reason why file names
>> BT> should *ever* be case-sensitive (aside from the fact that Unix
>> BT> users and applications have become used to this)?
>>
>> Which language do you want to be case-insensitive in? What if two
>> users of the same file system disagree on the choice?
>
> That is not a matter of language. Or is there a character encoding that
> says for language A, "X" and "x" are a pair while for language B, "X" and
> "y" are a pair?

Yes, afaik: The German 'double-s' is two letters in uppercase and a
single letter in lowercase.

> Case-blind case-preserving is the only variant which is acceptable from the
> point of view of ergonomics, IMNSHO.

There I agree. This obeys the principle of least surprise, but as noted
above, it does still have drawbacks.

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
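
The 'double-s' point in the post above can be seen with Python's built-in string methods. This is a minimal sketch: the helper names `same_name_lower` and `same_name_casefold` and the file names are made up for illustration, and real file systems use their own folding tables rather than anything shown here.

```python
def same_name_lower(a, b):
    """Naive case-blind comparison: lowercase both names."""
    return a.lower() == b.lower()

def same_name_casefold(a, b):
    """Case-blind comparison using Unicode case folding."""
    return a.casefold() == b.casefold()

# One lowercase letter becomes two uppercase letters.
print("ß".upper())

# Hypothetical header names differing only in case.
print(same_name_lower("straße.h", "STRASSE.H"))
print(same_name_casefold("straße.h", "STRASSE.H"))
```

Lowercasing alone treats the two names as different because "ß" has no single-letter uppercase partner to map back from, while full case folding maps both to the same string. Which answer is "right" is exactly the policy question the thread is arguing about.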
From: Terje Mathisen on 25 Sep 2006 12:14

Andrew Reilly wrote:
> On Mon, 25 Sep 2006 12:52:48 +0200, Terje Mathisen wrote:
>
>> For every N MB of contiguous disk space, use an extra MB to store ECC
>> info for the current block. The block size needs to be large enough that
>> a local soft spot which straddles two sectors cannot overwhelm the ECC
>> coding.
>
> Isn't that just the same as having the drive manufacturer use longer
> reed-solomon (forward error correcting) codes? Errors at that level are

No, because you need _huge_ lengths to avoid the problem where an area
of the disk is going bad.

This really interferes with random access benchmark numbers. :-(

> something that can be dialed-in or out by the manufacturer. If it's too
> high for comfort, they'll start to lose sales, won't they?
>
> Alternative approach to ECC sectors: store files in a fountain code
> pattern?

Pointers?

OK, I found some papers, but they really didn't tell me how/why they
would be suited to disk sector recovery. :-(

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
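
The idea quoted above, spending one extra block of redundancy per N data blocks so that a bad region can be rebuilt, can be sketched with plain XOR parity. This is an assumption-laden illustration: XOR parity is a much simpler stand-in for whatever ECC the posters have in mind, it only recovers a single block whose position is already known to be bad, and the names `xor_blocks`, `make_parity` and `recover` are hypothetical.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def make_parity(data_blocks):
    """Compute one parity block covering N data blocks."""
    return xor_blocks(data_blocks)

def recover(data_blocks, parity, bad_index):
    """Rebuild the block at bad_index from the survivors plus parity."""
    survivors = [b for i, b in enumerate(data_blocks) if i != bad_index]
    return xor_blocks(survivors + [parity])

# Three tiny "blocks" standing in for N MB of contiguous disk space.
blocks = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(blocks)

# Pretend block 1 sits on the soft spot and cannot be read.
print(recover(blocks, parity, 1))
```

The sketch also hints at the cost raised in the reply: rebuilding one block means reading every other block in the group, which is why large groups hurt random access.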