From: Tech Id on 13 May 2010 04:10
Hi,

I need to read a configuration file and create an Object Model in memory. The configuration file uses around 7-8 keywords, symbols like '(', '[', ']' and ')', and quoted strings.

My question is: should I use a simple tokenizer to parse this file, or some kind of parser? If a parser is recommended, which one would be best? ANTLR? Flex/Bison?

There is not much need for error recovery, since the file is produced by another tool and rarely edited by hand.

Thanks in advance for any help.
From: Martin Gregorie on 13 May 2010 08:13
On Thu, 13 May 2010 01:10:28 -0700, Tech Id wrote:

> Hi,
>
> I need to read a configuration file and create an Object Model in
> memory.
> The configuration file uses around 7-8 keywords and symbols like '(',
> '[', ']' and ')' and quoted strings.
>
> My question is:
> Should I use a simple tokenizer to parse the above file or use some kind
> of parser?
> If parser is recommended, which one should be best? ANTLR?
> Flex/Bison?
>
Coco/R is another Java parser generator. It is written in Java and generates Java, though there are flavours for other languages. I found it pretty easy to use: unlike Flex/Bison, all its input is in one file, and it's also easy to edit the framework files, as I discovered when I needed to make it parse source held in a string.

--
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |
From: Tom Anderson on 13 May 2010 08:40
On Thu, 13 May 2010, Tech Id wrote:

> I need to read a configuration file and create an Object Model in
> memory. The configuration file uses around 7-8 keywords and symbols like
> '(', '[', ']' and ')' and quoted strings.
>
> My question is: Should I use a simple tokenizer to parse the above file
> or use some kind of parser?

Impossible to say without knowing more about the language. Can you give us an example?

> If parser is recommended, which one should be best? ANTLR? Flex/Bison?

I like JavaCC:

https://javacc.dev.java.net/

Partly because i prefer LL(k) to LALR, but that's a matter of taste. JavaCC certainly makes it very easy to write grammars, and very easy to hang your custom code off those grammars. It has a facility for building parse trees (generating code for building them, i think), but i never used it - i found it easier in the long run to put my code right in the action blocks. I used something that looks a bit like a Visitor or Builder pattern as a facade that the parser could talk to.

tom

--
The art of medicine consists in amusing the patient while nature cures the disease. -- Voltaire
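The facade idea mentioned above can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the `ConfigHandler` callbacks, the `Node` model and the keyword names are hypothetical, and the calls in `main` stand in for what a grammar's action blocks might invoke. No JavaCC API is used, and this is not the code from the post.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class BuilderFacadeSketch {

    /** Hypothetical callbacks that a grammar's action blocks could invoke. */
    public interface ConfigHandler {
        void startGroup(String keyword);
        void string(String value);
        void endGroup();
    }

    /** Hypothetical object-model node: a keyword with string values and nested groups. */
    public static final class Node {
        public final String keyword;
        public final List<String> values = new ArrayList<>();
        public final List<Node> children = new ArrayList<>();

        Node(String keyword) {
            this.keyword = keyword;
        }
    }

    /** Builds the tree from parser events, so the grammar never touches the model directly. */
    public static final class ModelBuilder implements ConfigHandler {
        private final Node root = new Node("<root>");
        private final Deque<Node> stack = new ArrayDeque<>();

        public ModelBuilder() {
            stack.push(root);
        }

        @Override
        public void startGroup(String keyword) {
            Node node = new Node(keyword);
            stack.peek().children.add(node); // attach to the currently open group
            stack.push(node);                // the new group becomes the open one
        }

        @Override
        public void string(String value) {
            stack.peek().values.add(value);
        }

        @Override
        public void endGroup() {
            if (stack.size() == 1) {
                throw new IllegalStateException("endGroup without matching startGroup");
            }
            stack.pop();
        }

        public Node result() {
            return root;
        }
    }

    public static void main(String[] args) {
        // These calls simulate what the parser would emit for a made-up input.
        ModelBuilder builder = new ModelBuilder();
        builder.startGroup("server");
        builder.string("localhost");
        builder.startGroup("ports");
        builder.string("8080");
        builder.endGroup();
        builder.endGroup();

        Node server = builder.result().children.get(0);
        System.out.println(server.keyword + " " + server.values + " " + server.children.size());
    }
}
```

One reason to route events through an interface like this is that the grammar file stays small, and the model-building code can be tested without running the parser at all.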
From: Roedy Green on 13 May 2010 14:15
On Thu, 13 May 2010 01:10:28 -0700 (PDT), Tech Id <tech.login.id2(a)gmail.com> wrote, quoted or indirectly quoted someone who said :

>
>There is not much need of error recovery since the file is produced by
>another tool and rarely edited by hand.

See http://mindprod.com/jgloss/parser.html

To make it programmer friendly, so others can use the file, consider XML, even though it is ugly and fluffy.

http://mindprod.com/jgloss/xml.html

--
Roedy Green Canadian Mind Products
http://mindprod.com

Beauty is our business.
~ Edsger Wybe Dijkstra (born: 1930-05-11 died: 2002-08-06 at age: 72)
Referring to computer science.
From: Arne Vajhøj on 13 May 2010 20:09
On 13-05-2010 04:10, Tech Id wrote:

> I need to read a configuration file and create an Object Model in
> memory.
> The configuration file uses around 7-8 keywords and symbols like '(',
> '[', ']' and ')' and quoted strings.
>
> My question is:
> Should I use a simple tokenizer to parse the above file or use some
> kind of parser?
> If parser is recommended, which one should be best?
> ANTLR?
> Flex/Bison?
>
> There is not much need of error recovery since the file is produced by
> another tool and rarely edited by hand.

    if not too complex then
        use regex
    else
        use parser generator
    end if

It is not possible to evaluate the complexity based on the information given.

In general I would say that if you could get the format changed to something better supported, like a Java properties file or an XML format that can be mapped via JAXB, then you would be much better off.

Arne
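For the "not too complex" branch, a regex-based tokenizer might look roughly like the sketch below. The actual file format was never posted, so the token rules are assumptions: bare words stand in for the keywords, the symbols are the four brackets from the question, and quoted strings use double quotes with backslash escapes. The class name `ConfigTokenizer` and the `WORD:`/`SYM:`/`STR:` prefixes are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConfigTokenizer {

    // Optional leading whitespace, then exactly one of:
    //   group 1: contents of a double-quoted string (backslash escapes allowed)
    //   group 2: one of the symbols ( ) [ ]
    //   group 3: a bare word, assumed here to be how keywords are spelled
    private static final Pattern TOKEN = Pattern.compile(
            "\\s*(?:\"((?:[^\"\\\\]|\\\\.)*)\"|([()\\[\\]])|([A-Za-z_]\\w*))");

    public static List<String> tokenize(String input) {
        String s = input.trim(); // so that trailing whitespace is not reported as an error
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        int pos = 0;
        while (pos < s.length()) {
            m.region(pos, s.length());
            if (!m.lookingAt()) {
                // No error recovery: the file is assumed to be machine-generated.
                String near = s.substring(pos, Math.min(s.length(), pos + 10));
                throw new IllegalArgumentException("Unexpected input near: " + near);
            }
            if (m.group(1) != null) {
                tokens.add("STR:" + m.group(1)); // escapes are kept as written, not decoded
            } else if (m.group(2) != null) {
                tokens.add("SYM:" + m.group(2));
            } else {
                tokens.add("WORD:" + m.group(3));
            }
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Made-up sample input; the real keywords are unknown.
        System.out.println(tokenize("server ( \"my host\" [ a b ] )"));
    }
}
```

If the brackets can nest, the token list still has to be turned into a tree, for example with a small recursive-descent routine. Once the nesting rules get complicated, the parser generator branch is probably the better fit.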