Prev: Atomic variable Vs. Atomic operation
Next: How to program critical section for reader-writer systems?
From: Peter Olcott on 20 May 2010 00:16 I was wondering if there is a generally faster way to implement a DFA recognizer than by using a state transition matrix. I already asked this question on several other groups and never received any definitive answer.
From: Tim Little on 20 May 2010 05:05 On 2010-05-20, Peter Olcott <NoSpam(a)OCR4Screen.com> wrote: > I was wondering if there is a generally faster way to implement a > DFA recognizer than by using a state transition matrix. I wouldn't expect so, as O(1) per symbol is about the fastest you could reasonably expect. - Tim
From: Torben �gidius Mogensen on 20 May 2010 05:47 "Peter Olcott" <NoSpam(a)OCR4Screen.com> writes: > I was wondering if there is a generally faster way to implement a DFA > recognizer than by using a state transition matrix. Not asymptotically faster, since it is linear (assuming matrix lookup is constant time). You can sometimes get faster code by translating the matrix into code (essentially a switch statement per state) instead of using an array, but it depends on the compiler which is faster. Flex does this. You can also halve the number of table lookups if you match two characters at a time. This makes the table enormously larger, though. If you use 8-bit characters, the table size is 256 x states. If you match two characters at a time, it is 65536 x states, i.e., 256 times larger. Since matrix lookup is not constant time on modern machines, this might actually make your code slower. So you can do the converse: Match half a character (4 bits) at a time, which makes your table roughly 32 x states (since you need to insert extra states). This reduces your table by 8, which will more likely make it fit your cache. Obviously, if there are only a few states, it will anyway. Torben
From: Peter Olcott on 20 May 2010 10:14 "Torben �gidius Mogensen" <torbenm(a)diku.dk> wrote in message news:7z7hmygb05.fsf(a)ask.diku.dk... > "Peter Olcott" <NoSpam(a)OCR4Screen.com> writes: > >> I was wondering if there is a generally faster way to >> implement a DFA >> recognizer than by using a state transition matrix. > > Not asymptotically faster, since it is linear (assuming > matrix lookup is > constant time). You can sometimes get faster code by > translating the > matrix into code (essentially a switch statement per > state) instead of > using an array, but it depends on the compiler which is > faster. Flex > does this. I use a state transition matrix to look up the switch case value. I can not imagine how to (for example) match 50 different keywords without the table and only use just the switch statement. How is this done? > > You can also halve the number of table lookups if you > match two > characters at a time. This makes the table enormously > larger, though. > If you use 8-bit characters, the table size is 256 x > states. If you > match two characters at a time, it is 65536 x states, > i.e., 256 times > larger. > > Since matrix lookup is not constant time on modern > machines, this might > actually make your code slower. So you can do the > converse: Match half > a character (4 bits) at a time, which makes your table > roughly 32 x > states (since you need to insert extra states). This > reduces your table > by 8, which will more likely make it fit your cache. > Obviously, if > there are only a few states, it will anyway. > > Torben
From: Ben Bacarisse on 20 May 2010 11:38 "Peter Olcott" <NoSpam(a)OCR4Screen.com> writes: > "Torben Ægidius Mogensen" <torbenm(a)diku.dk> wrote in message > news:7z7hmygb05.fsf(a)ask.diku.dk... >> "Peter Olcott" <NoSpam(a)OCR4Screen.com> writes: >> >>> I was wondering if there is a generally faster way to implement a >>> DFA recognizer than by using a state transition matrix. >> >> Not asymptotically faster, since it is linear (assuming matrix lookup >> is constant time). You can sometimes get faster code by translating >> the matrix into code (essentially a switch statement per state) >> instead of using an array, but it depends on the compiler which is >> faster. Flex does this. > > I use a state transition matrix to look up the switch case > value. I can not imagine how to (for example) match 50 > different keywords without the table and only use just the > switch statement. How is this done? I suspect you may just be using a mixed version that is half table and half switch driven. The fully matrix version looks like this: state := initial_state; while !is_final(state) do state := state_table[state, get_input()]; while the full case version looks like this: state := initial_state; while !is_final(state) do c := get_input() switch state in case 0: switch c in case 'a': state := 1; case 'b': state := 2; end case 1: switch c in case 'a': state := 2; end /* and so on... */ end corresponding to a table in which state_table[0, 'a'] is 1 and state_table[0, 'b'] and state_table[1, 'a'] are 2. (state_table[x, y] is x for all other x and y.) <snip> -- Ben.
|
Next
|
Last
Pages: 1 2 3 Prev: Atomic variable Vs. Atomic operation Next: How to program critical section for reader-writer systems? |