Prev: Does anyone copyright or patent their applications?
Next: Designing a Finite State Machine DFA Recognizer for UTF-8
From: Peter Olcott on 19 May 2010 15:48
On 5/19/2010 2:24 PM, Leigh Johnston wrote:
>
>
> "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote in message
> news:69ydnUm-AOC_pWnWnZ2dnUVZ_u2dnZ2d(a)giganews.com...
>> On 5/19/2010 1:42 PM, Leigh Johnston wrote:
>>>
>>>
>>> "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote in message
>>> news:P6idnX4azPfvs2nWnZ2dnUVZ_oudnZ2d(a)giganews.com...
>>>>>
>>>>> Whilst what you say is technically correct I try to avoid writing code
>>>>> which does not check against an end iterator when iterating over a
>>>>> sequence, just personal preference (due to a slight concern re
>>>>> safety).
>>>>> We are probably only talking about an extra CPU instruction or two to
>>>>> check for end of sequence in the main loop along with the O(1)
>>>>> check of the final state when the main loop is exited. Your solution
>>>>> would also require making a copy of the input sequence to allow
>>>>> appending of the sentinel unless you consider mutating input
>>>>> parameters to be OK. My
>>>>
>>>> The main purpose of this is to read in a file of UTF-8 to be converted
>>>> to UTF-32. I don't have to mutate the input at all, the user must know
>>>> to append the 0xFF byte.
>>>
>>> Are you for real? That sounds like a really stupid idea.
>>
>> The goal is to make the fastest possible validation of UTF-8 and
>> translation to UTF-32. Within this binding constraint there are few
>> options. Copying the input data is not one of them. What else does
>> that leave? Mutating the input and then changing it back?
>>
>
> Either you are holding the entire file in memory or performing a
> buffered read, either way you can append the sentinel to the data in
> memory unless you are using memory mapped I/O. The only use-case that
> benefits from having a sentinel is if the input is in memory and you
> have indicated this is not a primary use-case so why bother with a
> sentinel at all? When performing file I/O your algorithm is unlikely to
> be the bottleneck sentinel or no sentinel. As I would not use a sentinel
> for this I would not have the dilemma of mutating the input that you
> face and it would work for any use-case (input in a file, network or
> memory).
>
> /Leigh

Again you forget the primary purpose of this whole line-of-reasoning.
The goal is to show that it is not possible to construct a faster lexer
than the one based on a state transition matrix.

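Editorial aside, not code from any poster in this thread: the two loop styles being argued over can be sketched roughly as below. This is a minimal illustration under stated assumptions. The names `classify`, `kNext`, `valid_checked`, and `valid_sentinel` are hypothetical, and the DFA is deliberately simplified: it checks only lead-byte and continuation-byte structure and does not reject overlong 3- and 4-byte forms, surrogates, or code points above U+10FFFF. The first function compares against an end pointer on every iteration. The second relies on the caller having placed a 0xFF byte at `buf[len]`; since 0xFF can never appear in valid UTF-8, the DFA is guaranteed to stop there at the latest.

```cpp
#include <cassert>
#include <cstddef>

// DFA states: how many continuation bytes are still expected, plus a dead state.
enum State { kStart = 0, kNeed1 = 1, kNeed2 = 2, kNeed3 = 3, kReject = 4 };

// Map a byte to one of six classes (a real implementation would likely
// use a 256-entry lookup table instead of comparisons).
inline int classify(unsigned char b) {
    if (b <= 0x7F) return 0;               // ASCII
    if (b <= 0xBF) return 1;               // continuation byte 80..BF
    if (b >= 0xC2 && b <= 0xDF) return 2;  // lead byte of a 2-byte sequence
    if (b >= 0xE0 && b <= 0xEF) return 3;  // lead byte of a 3-byte sequence
    if (b >= 0xF0 && b <= 0xF4) return 4;  // lead byte of a 4-byte sequence
    return 5;                              // C0, C1, F5..FF: never valid
}

// State transition matrix: kNext[state][byte class].
static const int kNext[5][6] = {
    /* kStart  */ {kStart,  kReject, kNeed1,  kNeed2,  kNeed3,  kReject},
    /* kNeed1  */ {kReject, kStart,  kReject, kReject, kReject, kReject},
    /* kNeed2  */ {kReject, kNeed1,  kReject, kReject, kReject, kReject},
    /* kNeed3  */ {kReject, kNeed2,  kReject, kReject, kReject, kReject},
    /* kReject */ {kReject, kReject, kReject, kReject, kReject, kReject},
};

// Style 1: compare against an end pointer on every iteration.
bool valid_checked(const unsigned char* first, const unsigned char* last) {
    int s = kStart;
    for (; first != last; ++first) {
        s = kNext[s][classify(*first)];
        if (s == kReject) return false;
    }
    return s == kStart;  // O(1) check of the final state
}

// Style 2: no end check in the loop. Precondition: buf[len] == 0xFF.
bool valid_sentinel(const unsigned char* buf, std::size_t len) {
    assert(buf[len] == 0xFF);
    const unsigned char* p = buf;
    int s = kStart;
    for (;;) {
        int n = kNext[s][classify(*p)];
        if (n == kReject) break;  // the sentinel always lands here at the latest
        s = n;
        ++p;
    }
    // Valid only if the stop was caused by the sentinel itself, between characters.
    return s == kStart && p == buf + len;
}
```

In this sketch the sentinel version still needs `len` afterwards, to tell the appended 0xFF apart from a stray 0xFF inside the data. Whether dropping the per-byte end comparison gives a measurable speedup is exactly what the thread disputes; the sketch takes no position on that.
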
From: Leigh Johnston on 19 May 2010 15:51
"Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote in message
news:W_ydnU32Ptzn3WnWnZ2dnUVZ_qgAAAAA(a)giganews.com...
>
> Again you forget the primary purpose of this whole line-of-reasoning. The
> goal is to show that it is not possible to construct a faster lexer than
> the one based on a state transition matrix.

This contradicts what you said earlier, i.e.:

>>>> The main purpose of this is to read in a file of UTF-8 to be converted
>>>> to UTF-32. I don't have to mutate the input at all, the user must know
>>>> to append the 0xFF byte.

What is the difference between "primary purpose" and "main purpose"?

I give up, your replies are too troll-like whether intentionally or not.

/Leigh

From: Paul Bibbings on 19 May 2010 15:54
Peter Olcott <NoSpam(a)OCR4Screen.com> writes:
> Again you forget the primary purpose of this whole
> line-of-reasoning. The goal is to show that it is not possible to
> construct a faster lexer than the one based on a state transition
> matrix.

Will *this* get a response too? *Any* response? Really?

From: Peter Olcott on 19 May 2010 16:01
On 5/19/2010 2:51 PM, Leigh Johnston wrote:
>
>
> "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote in message
> news:W_ydnU32Ptzn3WnWnZ2dnUVZ_qgAAAAA(a)giganews.com...
>>
>> Again you forget the primary purpose of this whole line-of-reasoning.
>> The goal is to show that it is not possible to construct a faster
>> lexer than the one based on a state transition matrix.
>
> This contradicts what you said earlier, i.e.:
>
>>>>> The main purpose of this is to read in a file of UTF-8 to be converted
>>>>> to UTF-32. I don't have to mutate the input at all, the user must know
>>>>> to append the 0xFF byte.
>
> What is the difference between "primary purpose" and "main purpose"?
>
> I give up, your replies are too troll-like whether intentionally or not.
>
> /Leigh

Me too.

From: Hector Santos on 19 May 2010 16:39
Paul Bibbings wrote:
> Peter Olcott <NoSpam(a)OCR4Screen.com> writes:
>
>> Again you forget the primary purpose of this whole
>> line-of-reasoning. The goal is to show that it is not possible to
>> construct a faster lexer than the one based on a state transition
>> matrix.
>
> Will *this* get a response too? *Any* response? Really?

Hope not.