From: Nick Maclaren on 29 Sep 2006 07:43

In article <4o4ed5Fd0ghrU1(a)individual.net>,
Jan Vorbrüggen <jvorbrueggen(a)not-mediasec.de> writes:
|> > Since the source code of the language in question is case sensitive,
|>
|> ...which is already a massive design failure, of course, ...
|>
|> > two of those three are just wrong,
|>
|> Really? Is there anything in the C ISO standard that says that the
|> statement "include stdio.h" must lead to a file named "stdio.h", and not
|> "STDIO.H", or even an entry "STDIO" in a text-library file?

Er, do you mean '#include <stdio.h>' or '#include "stdio.h"'?

In the first case, no. In the second, yes, though the specification
is a typical piece of ISO C ambiguity. What it says is:

    [#3] A preprocessing directive of the form

        # include "q-char-sequence" new-line

    causes the replacement of that directive by the entire
    contents of the source file identified by the specified
    sequence between the " delimiters. The named source file is
    searched for in an implementation-defined manner. ...

The C standard does not actually SAY that it is undefined behaviour
if '#include "stdio.h"' matches a user file, but it assuredly is.
What is unclear is whether it is ALSO undefined behaviour if it
matches the same file matched by '#include <stdio.h>' - I have used
systems where it was.

But this arcana is off-group ....

Regards,
Nick Maclaren.
From: Jan Vorbrüggen on 29 Sep 2006 08:19

> |> Really? Is there anything in the C ISO standard that says that the
> |> statement "include stdio.h" must lead to a file named "stdio.h", and not
> |> "STDIO.H", or even an entry "STDIO" in a text-library file?
>
> Er, do you mean '#include <stdio.h>' or '#include "stdio.h"'?
>
> In the first case, no. In the second, yes, though the specification
> is a typical piece of ISO C ambiguity. What it says is:
>
> [#3] A preprocessing directive of the form
>
> # include "q-char-sequence" new-line
>
> causes the replacement of that directive by the entire
> contents of the source file identified by the specified
> sequence between the " delimiters. The named source file is
> searched for in an implementation-defined manner. ...

So the answer is "no" even in the second case. Notice the careful wording,
"the..._contents_ of the source file _identified_ by the specified sequence".
Is there anything that says, "if you write '#include STDIO.H', you will get
something different from 'include stdio.h'"?

Jan
From: Nick Maclaren on 29 Sep 2006 08:22

In article <4o4h89Fd1dnkU1(a)individual.net>,
Jan Vorbrüggen <jvorbrueggen(a)not-mediasec.de> writes:
|>
|> So the answer is "no" even in the second case. Notice the careful wording,
|> "the..._contents_ of the source file _identified_ by the specified sequence".

Well, yes, the contents could be anything.

|> Is there anything that says, "if you write '#include STDIO.H', you will get
|> something different from 'include stdio.h'"?

No. You MUST get the same - a syntax error. If you mean
'#include "STDIO.H"' and '#include "stdio.h"', then no.

Regards,
Nick Maclaren.
From: Bill Todd on 29 Sep 2006 12:02

Benny Amorsen wrote:
>>>>>> "BT" == Bill Todd <billtodd(a)metrocast.net> writes:
>
> BT> Does this suggest that the file system should collate
> BT> case-insensitive even while it addresses case-sensitive, so that
> BT> such potential collisions can be easily found?
>
> What do you mean by collate here?

Precisely what I said, as I usually do.

> If you say that the results of
> readdir() or equivalent should be returned in proper alphabetical
> order, you really have to make that order per-user. My girlfriend and
> I expect different collations.

I am not interested in your collating preferences, or in your
girlfriend's. I was asking whether a file system should collate
insensitively to case in order to facilitate detecting unintended
logical collisions created by case-sensitive names (with the implicit
assumption that these might be sufficiently frequent - as contrasted
with other conceivable forms of character-related collisions - to be
worth addressing).

- bill
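
[Editorial sketch, not part of the original post.] The idea asked about above, collating without regard to case while still addressing names case-sensitively, can be illustrated in a few lines of Python. This is only a minimal sketch under stated assumptions: the names `case_collisions`, `listing`, `collisions`, and `collated` are hypothetical, the directory listing is an in-memory list rather than the output of readdir(), and `str.casefold()` stands in for whatever case-insensitive rule a real file system might choose.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that are distinct as written but equal once case is folded."""
    groups = defaultdict(list)
    for name in names:
        # casefold() is an assumed stand-in for the file system's case rule.
        groups[name.casefold()].append(name)
    # Keep only groups that contain at least two different spellings.
    return {key: sorted(group) for key, group in groups.items()
            if len(set(group)) > 1}


# Hypothetical listing from a case-sensitive directory.
listing = ["stdio.h", "STDIO.H", "Makefile", "makefile", "README"]

# Names that would collide under case-insensitive lookup.
collisions = case_collisions(listing)

# Case-insensitive collation with the exact spelling as a tie-breaker:
# colliding names end up next to each other, which makes them easy to spot.
collated = sorted(listing, key=lambda n: (n.casefold(), n))
```

With a collation like this, potential collisions are adjacent in the listing, so they can be found in a single pass over neighbouring entries instead of comparing every pair of names.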
From: Anne & Lynn Wheeler on 2 Oct 2006 13:53
dgay(a)barnowl.research.intel-research.net writes:
> I think you missed the point that the output of readdir is (and should
> be) unrelated to the order presented to the user. Why is the file system
> collating anyway? Now I can see the value of a library that collates
> file names according to some system-wide convention...

one of the results of changes originally made by (i think ?) perkin/elmer
to cms mfd in the early 70s was to sort the filenames. then when
application was looking for specific filename .... lookup could do
much better than linear search (and searches better than linear were
dependent on being matched to collating/sort sequences). it really was
significant change for directories that happened to have a couple
thousand filenames (some number of high use systems).

i recently ran into something similar using sort on filenames and
doing something other than linear search ... where sort command
default collating sequence changed and it moved how period was handled
(showed up between capital H and capital I). i had to explicitly set
"LC_ALL=C" to get sort back working the way i was used to.

a similar, but different problem we did long ago and far away ... when
we did online telephone book for several thousand corporate
employees. for lots of reasons ... the names/numbers were kept in linear
flat file .... but sorted. the search was radix ... based on measured
first letter frequency by taking the size of the file and probing part
way into the file based on first letters of the search argument and
the related letter frequencies for names (originally compiled into the
search program). it could frequently get within appropriate physical
record within a probe or two (w/o requiring separate index or other
infrastructure). we had special collating/sort order assuming that
names (and search arguments) had no blanks (even tho any names with
embedded blanks were carried in the actual data; the ignore blanks was
a special sort characteristic/option).

in the name scenario .. name collisions/duplicates were allowed ... so
search result might present multiple matches.
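
[Editorial sketch, not part of the original post.] The telephone-book lookup described above (sorted flat file, first probe chosen from first-letter frequencies, blanks ignored when collating, duplicates allowed) might be sketched as follows in Python. Everything here is an assumption for illustration: the records live in a list instead of a file of physical records, the frequency table is measured from the sample data itself rather than compiled into the program, case folding is added on top of the blank-ignoring rule, and the names `sort_key`, `cumulative_first_letter_share`, `probe_search`, `phone_book`, `records`, `keys`, and `share` are all hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import Counter
import string


def sort_key(name):
    # Blanks are ignored when collating (per the post); upper-casing is an
    # extra assumption made for this sketch.
    return name.replace(" ", "").upper()


def cumulative_first_letter_share(keys):
    """For each letter, the fraction of keys whose first letter sorts before it."""
    counts = Counter(k[0] for k in keys)
    share, running = {}, 0
    for letter in string.ascii_uppercase:
        share[letter] = running / len(keys)
        running += counts.get(letter, 0)
    return share


def probe_search(records, keys, share, name):
    """Return every record matching name, starting from a frequency-based guess."""
    target = sort_key(name)
    n = len(keys)
    if not target or n == 0:
        return []
    # First probe: jump part way into the data based on the first letter.
    guess = min(n - 1, int(share.get(target[0], 0.0) * n))
    step = 1
    if keys[guess] < target:
        # Target lies to the right: widen the bracket until it is covered.
        lo, hi = guess + 1, min(n, guess + 2)
        while hi < n and keys[hi - 1] < target:
            lo, step = hi, step * 2
            hi = min(n, lo + step)
    else:
        # Target lies at or to the left of the guess.
        hi, lo = guess, max(0, guess - 1)
        while lo > 0 and keys[lo] >= target:
            hi, step = lo, step * 2
            lo = max(0, hi - step)
    start = bisect_left(keys, target, lo, hi)
    end = bisect_right(keys, target, start, n)
    return records[start:end]  # duplicates are allowed, so this may hold several


# Hypothetical data; the list must be sorted with the same key the search uses.
phone_book = [("Van Dyke, A", "1234"), ("Adams, J", "5551"),
              ("Smith, P", "5552"), ("Smith, P", "5559"), ("Vandyke, B", "7777")]
records = sorted(phone_book, key=lambda r: sort_key(r[0]))
keys = [sort_key(r[0]) for r in records]
share = cumulative_first_letter_share(keys)

smiths = probe_search(records, keys, share, "smith, p")
```

The important constraint, as with the sorted directory earlier in the post, is that the search and the sort must agree on the collating sequence: `records` is sorted with `sort_key`, and `probe_search` compares with the same key, so a name typed with or without embedded blanks finds the same entries.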