From: Tech Id on
Hi,

I need to read a configuration file and create an Object Model in
memory.
The configuration file uses around 7-8 keywords and symbols like '(',
'[', ']' and ')' and quoted strings.

My question is:
Should I use a simple tokenizer to parse the above file or use some
kind of parser?
If parser is recommended, which one should be best?
ANTLR?
Flex/Bison?

There is not much need of error recovery since the file is produced by
another tool and rarely edited by hand.

Thanks in advance for help.
From: Martin Gregorie on
On Thu, 13 May 2010 01:10:28 -0700, Tech Id wrote:

> Hi,
>
> I need to read a configuration file and create an Object Model in
> memory.
> The configuration file uses around 7-8 keywords and symbols like '(',
> '[', ']' and ')' and quoted strings.
>
> My question is:
> Should I use a simple tokenizer to parse the above file or use some kind
> of parser?
> If parser is recommended, which one should be best? ANTLR?
> Flex/Bison?
>
Coco/R is another Java parser generator. It is written in Java and
generates Java, though there are flavours for other languages.

I found it pretty easy to use: unlike Flex/Bison, all its input is in one
file, and it's also easy to edit the framework files, as I discovered when
I needed to make it parse source held in a string.


--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |
From: Tom Anderson on
On Thu, 13 May 2010, Tech Id wrote:

> I need to read a configuration file and create an Object Model in
> memory. The configuration file uses around 7-8 keywords and symbols like
> '(', '[', ']' and ')' and quoted strings.
>
> My question is: Should I use a simple tokenizer to parse the above file
> or use some kind of parser?

Impossible to say without knowing more about the language. Can you give us
an example?

> If parser is recommended, which one should be best? ANTLR? Flex/Bison?

I like JavaCC:

https://javacc.dev.java.net/

Partly because I prefer LL(k) to LALR, but that's a matter of taste.
JavaCC certainly makes it very easy to write grammars, and very easy to
hang your custom code off those grammars. It has a facility for building
parse trees (generating code for building them, I think), but I never used
it - I found it easier in the long run to put my code right in the action
blocks. I used something that looks a bit like a Visitor or Builder
pattern as a facade that the parser could talk to.
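
A rough sketch of what such a facade might look like follows. It is not
JavaCC output and not the code described above: the interface, class and
method names (ConfigBuilder, TreeBuilder, beginSection and so on) are
hypothetical, and the section/value shape of the model is only an
assumption about the configuration format. The idea is that the grammar's
action blocks call the facade and never touch the object model directly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical facade: the only thing the parser's action blocks see.
interface ConfigBuilder {
    void beginSection(String name);
    void value(String text);
    void endSection();
}

// One possible implementation that builds a simple tree in memory.
class TreeBuilder implements ConfigBuilder {
    static final class Node {
        final String name;
        final List<String> values = new ArrayList<>();
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    private final Node root = new Node("<root>");
    // Stack of currently open sections; the root is always at the bottom.
    private final Deque<Node> open = new ArrayDeque<>();

    TreeBuilder() { open.push(root); }

    @Override public void beginSection(String name) {
        Node node = new Node(name);
        open.peek().children.add(node);
        open.push(node);
    }

    @Override public void value(String text) {
        open.peek().values.add(text);
    }

    @Override public void endSection() {
        if (open.size() == 1) {
            throw new IllegalStateException("no open section to close");
        }
        open.pop();
    }

    Node root() { return root; }
}
```

An action block would then contain calls such as
builder.beginSection(name) when it sees an opening bracket and
builder.endSection() at the matching close. Swapping in a different
ConfigBuilder changes what gets built without touching the grammar.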

tom

--
The art of medicine consists in amusing the patient while nature cures
the disease. -- Voltaire
From: Roedy Green on
On Thu, 13 May 2010 01:10:28 -0700 (PDT), Tech Id
<tech.login.id2(a)gmail.com> wrote, quoted or indirectly quoted someone
who said :

>
>There is not much need of error recovery since the file is produced by
>another tool and rarely edited by hand.

See http://mindprod.com/jgloss/parser.html

To make it programmer friendly, so others can use the file, consider
XML, even though it is ugly and fluffy.

http://mindprod.com/jgloss/xml.html
--
Roedy Green Canadian Mind Products
http://mindprod.com

Beauty is our business.
~ Edsger Wybe Dijkstra (born: 1930-05-11 died: 2002-08-06 at age: 72)

Referring to computer science.
From: Arne Vajhøj on
On 13-05-2010 04:10, Tech Id wrote:
> I need to read a configuration file and create an Object Model in
> memory.
> The configuration file uses around 7-8 keywords and symbols like '(',
> '[', ']' and ')' and quoted strings.
>
> My question is:
> Should I use a simple tokenizer to parse the above file or use some
> kind of parser?
> If parser is recommended, which one should be best?
> ANTLR?
> Flex/Bison?
>
> There is not much need of error recovery since the file is produced by
> another tool and rarely edited by hand.

if not too complex then
use regex
else
use parser generator
end if

It is not possible to evaluate the complexity based on the
information given.
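
As a minimal sketch of the regex route, assuming (hypothetically, since
the real format has not been posted) that the tokens are bare keywords,
the four bracket symbols, and double-quoted strings with backslash
escapes, a tokenizer could look like this. The class name ConfigTokenizer
and the exact token rules are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical tokenizer; the real keyword set and quoting rules are unknown.
class ConfigTokenizer {
    // Optional leading whitespace, then one of:
    //   group 1: a double-quoted string (backslash escapes allowed)
    //   group 2: one of ( ) [ ]
    //   group 3: a bare word such as a keyword
    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(?:(\"(?:[^\"\\\\]|\\\\.)*\")|([()\\[\\]])|([A-Za-z_][A-Za-z0-9_]*))");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                // Only trailing whitespace left: we are done.
                if (input.substring(pos).trim().isEmpty()) {
                    break;
                }
                throw new IllegalArgumentException(
                    "Unexpected input at offset " + pos);
            }
            if (m.group(1) != null) {
                tokens.add(m.group(1));   // quoted string, quotes kept
            } else if (m.group(2) != null) {
                tokens.add(m.group(2));   // bracket symbol
            } else {
                tokens.add(m.group(3));   // bare word
            }
            pos = m.end();
        }
        return tokens;
    }
}
```

This only splits the input into tokens. If the brackets can nest, the
code that consumes the token list still needs a stack or recursion to
match them up, and that is roughly the point where a parser generator
starts to pay off.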

In general I would say that if you could get the format
changed to something better supported, like a Java
properties file or an XML format that can be mapped
via JAXB, then you would be much better off.
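
To illustrate the properties-file option: java.util.Properties reads the
format with no parsing code of your own. In the sketch below the class
name PropertiesConfigDemo and the keys are made up for illustration, and
the text is read from a String only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper; key names such as "server.port" are examples only.
class PropertiesConfigDemo {
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            // For a real file, pass a FileReader or an InputStream instead.
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

For example, parse("server.name=web1\nserver.port=8080\n") gives a
Properties object where getProperty("server.port") returns "8080". A
properties file is a flat set of key/value pairs, so any nesting in the
current format would have to be encoded in the key names.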

Arne