From: Mihai N. on 22 May 2010 02:51

> German Telephone Book sorting (raw sorting, such
> as you might get with strcmp, is not correct, because it is based on the
> old ASCII-7
> translation, where ÄÖÜ followed Z because in German ASCII-7

It kind of shows that the system is old and was adapted for primitive
electronic processing.
Another hint of that is the fact that the umlaut vowels are equivalent
to the ?e doubles (ä == ae, ö == oe, ü == ue)

--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
From: Joseph M. Newcomer on 22 May 2010 05:57

See below...
On Fri, 21 May 2010 14:37:19 -0500, Peter Olcott <NoSpam(a)OCR4Screen.com> wrote:

>On 5/21/2010 2:11 PM, Joseph M. Newcomer wrote:
>> See below...
>> On Thu, 20 May 2010 20:20:24 -0500, Peter Olcott<NoSpam(a)OCR4Screen.com> wrote:
>>
>>> A more accurate statement might be something like unmeasured performance
>>> estimates are most often very inaccurate. It is also probably true that
>>> faster methods can often be discerned from much slower (at least an
>>> order of magnitude) methods without measurement.
>> ****
>> You are still confusing design and implementation.
>> ****
>
>The way that I do design I start with broad goals that I want to achieve
>and end up with nearly correct code as my most detailed level of design.
>I progress from the broad goals through very many levels of increasing
>specificity using a hierarchy of increasing specificity.
****
But I can achieve correct code for a given design that is substantially slower, so I fail
to see how a design specifies the performance.
****
>
>So I am not confusing design with implementation, implementation is the
>most detailed level of design within a continuum of increasing
>specificity from broad goals to working code.
>
>Only about 3% of my time is spent on debugging, with another 2% on
>testing. The quickest way to complete any very complex system is to slow
>down and carefully plan every single step.
****
What does that have to do with the issue of non-executable design specifications being
"faster" than alternative design specifications.

The specification of a lexer is: looking at the current state and the current character,
determine a next state. Place the state machine in that state, advance to the next input
character, and repeat until an error state or final state is achieved.

I fail to see how performance can be specified in this abstract specification. I could
use a CMap, virtual methods of classes, or linear table lookup to implement code that met
the specification exactly, and there would be a wide range of performance variation in
these different realizations. Yet every one of them would meet the specification of a
DFA. If your specification starts talking about table layouts in memory, it is no longer
a specification, but an implementation discussion.
joe
****
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
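
The point about one specification admitting several realizations can be sketched in code. The example below is only an illustration under stated assumptions: the three-state digit-recognizing DFA, the names accepts, nextViaMap and nextViaArray, and the use of std::map as a stand-in for MFC's CMap are all hypothetical and not taken from the thread. Both transition functions satisfy the same abstract loop (look at state and character, pick the next state, advance, repeat), yet one does a tree lookup per character and the other a single array index.

```cpp
#include <array>
#include <map>
#include <string>
#include <utility>

// Hypothetical DFA: accepts non-empty runs of decimal digits.
enum State { Start = 0, InNumber = 1, Error = 2 };

// The abstract specification: from (state, character) determine the next
// state, advance to the next character, stop on error or end of input.
template <typename NextFn>
bool accepts(const std::string& input, NextFn next) {
    State s = Start;
    for (char c : input) {
        s = next(s, c);
        if (s == Error) return false;
    }
    return s == InNumber;  // InNumber is the only final state
}

// Realization 1: ordered-map lookup (std::map standing in for a CMap).
State nextViaMap(State s, char c) {
    static const std::map<std::pair<State, char>, State> table = [] {
        std::map<std::pair<State, char>, State> m;
        for (char d = '0'; d <= '9'; ++d) {
            m[std::make_pair(Start, d)] = InNumber;
            m[std::make_pair(InNumber, d)] = InNumber;
        }
        return m;
    }();
    auto it = table.find(std::make_pair(s, c));
    return it == table.end() ? Error : it->second;
}

// Realization 2: dense array indexed by state and byte value.
State nextViaArray(State s, char c) {
    static const auto table = [] {
        std::array<std::array<State, 256>, 3> t{};
        for (auto& row : t) row.fill(Error);
        for (int d = '0'; d <= '9'; ++d) {
            t[Start][d] = InNumber;
            t[InNumber][d] = InNumber;
        }
        return t;
    }();
    return table[s][static_cast<unsigned char>(c)];
}
```

Calling accepts("2010", nextViaMap) and accepts("2010", nextViaArray) gives the same answer for every input; only the cost per character differs, which is a property of the realization rather than of the DFA.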
From: Joseph M. Newcomer on 22 May 2010 06:02

The ae, oe, and ue variants are transliterations into alphabets that do not support
accented characters. They were common when all we had was American ASCII-7 and needed to
encode German names (not to mention Dutch, Norwegian, Swedish, Russian, etc. names). Did
you know that Tchaikovsky and Chebychev have names that sort nearby in Russian? The Tchai
and the Che represent the contemporary English transliterations of their names at the
times they were working.

My maternal great-grandfather was Müller. But we have documents that spell his name
(typewritten) as Mueller, and my grandmother's name was "Miller".
joe

On Fri, 21 May 2010 23:51:22 -0700, "Mihai N." <nmihai_year_2000(a)yahoo.com> wrote:

>
>> German Telephone Book sorting (raw sorting, such
>> as you might get with strcmp, is not correct, because it is based on the
>> old ASCII-7
>> translation, where ÄÖÜ followed Z because in German ASCII-7
>
>It kind of shows that the system is old and was adapted for primitive
>electronic processing.
>Another hint of that is the fact that the umlaut vowels are equivalent
>to the ?e doubles (ä == ae, ö == oe, ü == ue)

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
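
The umlaut equivalence discussed in this subthread (ä == ae, ö == oe, ü == ue) can be turned into a comparison key. What follows is a minimal sketch, not a full collation implementation: the function names phonebookKey and phonebookLess are hypothetical, input is assumed to be already decoded to UTF-32, only the three umlaut pairs named above are expanded, and only ASCII letters are case-folded. Production code would more likely use a collation library.

```cpp
#include <string>

// Build a comparison key: expand each umlaut vowel to its "?e" double and
// fold ASCII capitals to lower case. Everything else is copied unchanged.
std::u32string phonebookKey(const std::u32string& name) {
    std::u32string key;
    for (char32_t c : name) {
        switch (c) {
            case U'\u00E4': case U'\u00C4': key += U"ae"; break;  // ä, Ä
            case U'\u00F6': case U'\u00D6': key += U"oe"; break;  // ö, Ö
            case U'\u00FC': case U'\u00DC': key += U"ue"; break;  // ü, Ü
            default:
                if (c >= U'A' && c <= U'Z')
                    c = static_cast<char32_t>(c + (U'a' - U'A'));
                key += c;
        }
    }
    return key;
}

// Order two names by their expanded keys rather than by raw code units.
bool phonebookLess(const std::u32string& a, const std::u32string& b) {
    return phonebookKey(a) < phonebookKey(b);
}
```

With this key, U"M\u00FCller" and U"Mueller" compare equal and both sort before U"Muffin", whereas a raw code-unit comparison (the strcmp-style ordering criticized above) places U"M\u00FCller" after every name beginning "Mu" followed by an ASCII letter.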
From: Peter Olcott on 22 May 2010 09:50

On 5/22/2010 4:57 AM, Joseph M. Newcomer wrote:
> See below...
> On Fri, 21 May 2010 14:37:19 -0500, Peter Olcott<NoSpam(a)OCR4Screen.com> wrote:
>
>> The way that I do design I start with broad goals that I want to achieve
>> and end up with nearly correct code as my most detailed level of design.
>> I progress from the broad goals through very many levels of increasing
>> specificity using a hierarchy of increasing specificity.
> ****
> But I can achieve correct code for a given design that is substantially slower, so I fail
> to see how a design specifies the performance.

You can't achieve correct code that implements one of my designs that is
substantially slower because my final design is code. The design that I
posted did not leave enough leeway to really screw it up unless very
creative thought was put into screwing it up intentionally.

> ****
>>
>> So I am not confusing design with implementation, implementation is the
>> most detailed level of design within a continuum of increasing
>> specificity from broad goals to working code.
>>
>> Only about 3% of my time is spent on debugging, with another 2% on
>> testing. The quickest way to complete any very complex system is to slow
>> down and carefully plan every single step.
> ****
> What does that have to do with the issue of non-executable design specifications being
> "faster" than alternative design specifications.
>
> The specification of a lexer is: looking at the current state and the current character,
> determine a next state. Place the state machine in that state, advance to the next input
> character, and repeat until an error state or final state is achieved.
>
> I fail to see how performance can be specified in this abstract specification. I could
> use a CMap, virtual methods of classes, or linear table lookup to implement code that met
> the specification exactly, and there would be a wide range of performance variation in

I specified a switch statement and a state transition table in the design.

I specified using twelve ActionCodes in a switch statement and provided
the ActionCodes. I also provided eight states to be used in a state
transition matrix and provided these states and the input values within
these states and the corresponding actions for each input value.

State 0
  00-7F ASCII
  C2-DF goto State 1 // Two Byte
  E0-EF goto State 2 // Three Byte
  F0-F4 goto State 4 // Four Byte
  else Error
State 1
  80-BF
  else Error
State 2
  80-BF goto State 3
  else Error
State 3
  80-BF
  else Error
State 4
  80-BF goto State 5
  else Error
State 5
  80-BF goto State 6
  else Error
State 6
  80-BF goto State 7
  else Error
State 7
  80-BF
  else Error

// Holds ActionCodes Indexed by NextState and Data[N]
uint8 States[256][8];

// This is the input data to be transformed
std::vector<uint8> Data; // LastByte holds sentinel value 11

Twelve ActionCodes
00 InvalidByteError
01 FirstByteOfOneByte
02 FirstByteOfTwoBytes
03 FirstByteOfThreeBytes
04 FirstByteOfFourBytes
05 SecondByteOfTwoBytes
06 SecondByteOfThreeBytes
07 SecondByteOfFourBytes
08 ThirdByteOfThreeBytes
09 ThirdByteOfFourBytes
10 FourthByteOfFourBytes
11 OutOfData (Sentinel)

Within the context of this design there are few correct implementations.

> these different realizations.

These different realizations would not form correct examples of the
design that I specified.

> Yet every one of them would meet the specification of a
> DFA. If your specification starts talking about table layouts in memory, it is no longer
> a specification, but an implementation discussion.
> joe
> ****
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm
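
One possible reading of the table-plus-switch design above is sketched below. It is an interpretation, not the poster's code, and several choices are assumptions: the names buildTable and isWellFormed are hypothetical, the loop runs over the vector's length instead of relying on the OutOfData sentinel, and the four-byte path returns to State 0 after State 6. That last choice is made because, as posted, States 4 through 7 would consume four continuation bytes after an F0-F4 lead byte, while a four-byte UTF-8 sequence has three and the ActionCodes stop at FourthByteOfFourBytes; column 7 is therefore left unused here.

```cpp
#include <array>
#include <cstdint>
#include <vector>

using uint8 = std::uint8_t;

enum ActionCode : uint8 {
    InvalidByteError = 0, FirstByteOfOneByte, FirstByteOfTwoBytes,
    FirstByteOfThreeBytes, FirstByteOfFourBytes, SecondByteOfTwoBytes,
    SecondByteOfThreeBytes, SecondByteOfFourBytes, ThirdByteOfThreeBytes,
    ThirdByteOfFourBytes, FourthByteOfFourBytes, OutOfData
};

// table[byte][state] holds an ActionCode, mirroring uint8 States[256][8].
using Table = std::array<std::array<uint8, 8>, 256>;

Table buildTable() {
    Table t{};  // zero-filled, so every entry starts as InvalidByteError
    for (int b = 0x00; b <= 0x7F; ++b) t[b][0] = FirstByteOfOneByte;
    for (int b = 0xC2; b <= 0xDF; ++b) t[b][0] = FirstByteOfTwoBytes;
    for (int b = 0xE0; b <= 0xEF; ++b) t[b][0] = FirstByteOfThreeBytes;
    for (int b = 0xF0; b <= 0xF4; ++b) t[b][0] = FirstByteOfFourBytes;
    for (int b = 0x80; b <= 0xBF; ++b) {  // continuation bytes
        t[b][1] = SecondByteOfTwoBytes;
        t[b][2] = SecondByteOfThreeBytes;
        t[b][3] = ThirdByteOfThreeBytes;
        t[b][4] = SecondByteOfFourBytes;
        t[b][5] = ThirdByteOfFourBytes;
        t[b][6] = FourthByteOfFourBytes;
    }
    return t;
}

// Returns true when the bytes follow the posted table from start to end.
bool isWellFormed(const std::vector<uint8>& data) {
    static const Table table = buildTable();
    int state = 0;
    for (uint8 byte : data) {
        switch (table[byte][state]) {
            case FirstByteOfOneByte:     state = 0; break;
            case FirstByteOfTwoBytes:    state = 1; break;
            case FirstByteOfThreeBytes:  state = 2; break;
            case FirstByteOfFourBytes:   state = 4; break;
            case SecondByteOfThreeBytes: state = 3; break;
            case SecondByteOfFourBytes:  state = 5; break;
            case ThirdByteOfFourBytes:   state = 6; break;
            case SecondByteOfTwoBytes:
            case ThirdByteOfThreeBytes:
            case FourthByteOfFourBytes:  state = 0; break;
            default:                     return false;  // InvalidByteError
        }
    }
    return state == 0;  // ending mid-sequence means truncated input
}
```

Because the sketch follows the posted byte ranges literally, it accepts some sequences that strict UTF-8 rejects, for example the overlong form E0 80 80 and the surrogate encoding ED A0 80; a stricter validator would need separate states for the E0, ED, F0 and F4 lead bytes.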
From: Joseph M. Newcomer on 22 May 2010 12:39

See below...
On Sat, 22 May 2010 08:50:17 -0500, Peter Olcott <NoSpam(a)OCR4Screen.com> wrote:

>On 5/22/2010 4:57 AM, Joseph M. Newcomer wrote:
>> See below...
>> On Fri, 21 May 2010 14:37:19 -0500, Peter Olcott<NoSpam(a)OCR4Screen.com> wrote:
>>
>
>>> The way that I do design I start with broad goals that I want to achieve
>>> and end up with nearly correct code as my most detailed level of design.
>>> I progress from the broad goals through very many levels of increasing
>>> specificity using a hierarchy of increasing specificity.
>> ****
>> But I can achieve correct code for a given design that is substantially slower, so I fail
>> to see how a design specifies the performance.
>
>You can't achieve correct code that implements one of my designs that is
>substantially slower because my final design is code. The design that I
>posted did not leave enough leeway to really screw it up unless very
>creative thought was put into screwing it up intentionally.
***
If it is code, it is not design, it is implementation.
joe
***
>
>> ****
>>>
>>> So I am not confusing design with implementation, implementation is the
>>> most detailed level of design within a continuum of increasing
>>> specificity from broad goals to working code.
>>>
>>> Only about 3% of my time is spent on debugging, with another 2% on
>>> testing. The quickest way to complete any very complex system is to slow
>>> down and carefully plan every single step.
>> ****
>> What does that have to do with the issue of non-executable design specifications being
>> "faster" than alternative design specifications.
>>
>> The specification of a lexer is: looking at the current state and the current character,
>> determine a next state. Place the state machine in that state, advance to the next input
>> character, and repeat until an error state or final state is achieved.
>>
>> I fail to see how performance can be specified in this abstract specification. I could
>> use a CMap, virtual methods of classes, or linear table lookup to implement code that met
>> the specification exactly, and there would be a wide range of performance variation in
>
>I specified a switch statement and a state transition table in the design.
>
>I specified using twelve ActionCodes in a switch statement and provided
>the ActionCodes. I also provided eight states to be used in a state
>transition matrix and provided these states and the input values within
>these states and the corresponding actions for each input value.
>
>State 0
>  00-7F ASCII
>  C2-DF goto State 1 // Two Byte
>  E0-EF goto State 2 // Three Byte
>  F0-F4 goto State 4 // Four Byte
>  else Error
>State 1
>  80-BF
>  else Error
>State 2
>  80-BF goto State 3
>  else Error
>State 3
>  80-BF
>  else Error
>State 4
>  80-BF goto State 5
>  else Error
>State 5
>  80-BF goto State 6
>  else Error
>State 6
>  80-BF goto State 7
>  else Error
>State 7
>  80-BF
>  else Error
>
>// Holds ActionCodes Indexed by NextState and Data[N]
>uint8 States[256][8];
>
>// This is the input data to be transformed
>std::vector<uint8> Data; // LastByte holds sentinel value 11
>
>Twelve ActionCodes
>00 InvalidByteError
>01 FirstByteOfOneByte
>02 FirstByteOfTwoBytes
>03 FirstByteOfThreeBytes
>04 FirstByteOfFourBytes
>05 SecondByteOfTwoBytes
>06 SecondByteOfThreeBytes
>07 SecondByteOfFourBytes
>08 ThirdByteOfThreeBytes
>09 ThirdByteOfFourBytes
>10 FourthByteOfFourBytes
>11 OutOfData (Sentinel)
>
>Within the context of this design there are few correct implementations.
>
>> these different realizations.
>
>These different realizations would not form correct examples of the
>design that I specified.
>
>> Yet every one of them would meet the specification of a
>> DFA. If your specification starts talking about table layouts in memory, it is no longer
>> a specification, but an implementation discussion.
>> joe
>> ****
>> Joseph M. Newcomer [MVP]
>> email: newcomer(a)flounder.com
>> Web: http://www.flounder.com
>> MVP Tips: http://www.flounder.com/mvp_tips.htm

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm