From: Richard Smith on 10 Nov 2009 04:26

I would be interested to know the opinion of this newsgroup on whether it is sensible to use Boost.Spirit in production code.

In a project I'm working on, I'm likely to need to produce three non-trivial C++ parsers: one for a network protocol quite similar in structure to HTTP, one for an expression language (not unlike the language used by the Unix tool bc, but for domain-specific data types), and one for a particular SGML language (considerably simpler than XML, assuming there are no unanticipated complications).

All of these components are things for which I would be comfortable implementing hand-crafted parsers, but equally, if there are better ways of generating moderately efficient and, critically, easily maintainable parsers, I would be keen to use them. Boost.Spirit seems to be one of the more obvious possibilities.

However, having experimented with Boost.Spirit a bit, I have a number of concerns about its appropriateness for use in production code and I would be interested in others' opinions.

* Documentation. Given the size of the library, its documentation is really fairly lightweight and I've invariably found myself reading the code to find out how things work. Just to take two examples, where is it documented what characters alpha_p matches, and where is it documented which headers I should include to use it?

* Compile times. Perhaps there are implementation techniques that I'm missing, but most of the non-trivial examples I've experimented with take seriously long times to compile. In one case, over an hour for a single translation unit. I prefer to work with a rapid modify-recompile-test development cycle, but I don't see that being feasible if I use significant Boost.Spirit components.

* Error messages. Introduce an error into the code and, frankly, the resulting verbiage emitted by the compiler is utterly impenetrable. This is, of course, true of many complex template libraries in C++, and maybe when C++ (eventually) gains concepts, it will improve. But it doesn't help with today's language.

* Poor IOStream interoperability. There are two aspects here. First, it would be nice if, when I produced an LALR(1) parser, it would work with InputIterators without my needing to adapt them with multi_pass. (Admittedly, I'm not sure exactly how that could work, as I cannot see how the compiler can work out at compile time whether the grammar is LALR(1).) Careless buffering by multi_pass could easily kill one of the applications I have in mind. Secondly, it would be nice if there were some easy way to keep input and output in sync, if not by having a single function that does both (in simple cases, I've seen the % operator overloaded to reasonable effect to implement both << and >>), then by having similar-looking input and output functions that leave it easy to verify by eye their compatibility. Maybe that's something I can still build on top of Boost.Spirit, but that sounds a daunting task.

However, in other ways, I like the look of Spirit. The BNF form of the code is much closer to the specification I'm working to -- this sounds like a good way of making sure the two stay in sync as the underlying specifications evolve (which I expect them to do). To my pleasant surprise, the object code produced by Spirit is concise and efficient. And I'm sure that as I get more familiar with it, I'll get better at writing correct code faster. Looking around, I also see many quite positive comments about it.

So, what is the opinion here? Is it worth pursuing Boost.Spirit?

--
Richard Smith

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
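
For readers weighing the hand-crafted option mentioned in the post above, the following is a minimal sketch of a recursive-descent parser in which each grammar rule becomes one member function. It is not Boost.Spirit code and not the poster's design: the grammar, the class name `ExprParser` and the helper `evaluate` are assumptions made purely for illustration, and it handles only non-negative integers with the four basic operators.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical grammar, chosen only for illustration:
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := integer | '(' expr ')'
class ExprParser {
public:
    explicit ExprParser(const std::string& text) : text_(text), pos_(0) {}

    // Parses the whole input and returns its value; throws on a syntax error.
    long parse() {
        long value = expr();
        skip_spaces();
        if (pos_ != text_.size())
            throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    void skip_spaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_])))
            ++pos_;
    }

    // Consumes `c` if it is the next non-space character.
    bool accept(char c) {
        skip_spaces();
        if (pos_ < text_.size() && text_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    // expr := term (('+' | '-') term)*
    long expr() {
        long value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    long term() {
        long value = factor();
        for (;;) {
            if (accept('*')) {
                value *= factor();
            } else if (accept('/')) {
                long divisor = factor();
                if (divisor == 0) throw std::runtime_error("division by zero");
                value /= divisor;
            } else {
                return value;
            }
        }
    }

    // factor := integer | '(' expr ')'
    long factor() {
        if (accept('(')) {
            long value = expr();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        if (pos_ >= text_.size() ||
            !std::isdigit(static_cast<unsigned char>(text_[pos_])))
            throw std::runtime_error("expected a number");
        long value = 0;
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_])))
            value = value * 10 + (text_[pos_++] - '0');
        return value;
    }

    std::string text_;
    std::string::size_type pos_;
};

// Convenience wrapper (hypothetical name).
long evaluate(const std::string& text) { return ExprParser(text).parse(); }
```

A quick check such as `assert(evaluate("1 + 2 * 3") == 7);` passes. The correspondence between rules and functions is what keeps such a parser reviewable against a specification, although the grammar itself lives only in comments, which is the gap a BNF-style library or a parser generator aims to close.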
From: Chris Morley on 11 Nov 2009 07:59

> So, what is the opinion here? Is it worth pursuing Boost.Spirit?

I would definitely recommend you use machine-written parsers and not hand-written ones, even if the language is pretty trivial. The grammar files are much easier to review at a later date than C++ when you look at the system in 6, 12 or 24 months' time. You are quite right that the grammar is essentially documentation.

I've not used Boost.Spirit; in the intro they say it is simpler than Bison or ANTLR. I wonder what functions you might need down the line which the likes of Bison/ANTLR have...

I use Bison and it is actually very easy to use with either a machine-written scanner (e.g. Flex) or a hand-written one. Bison doesn't do C++ very well (its roots are C) but the skeleton is OK and easy enough to handle. Building a parse tree or running semantic actions directly in the grammar is equally simple. Now that I know it, I don't think Bison is overkill for even the most trivial parsers - in fact I wish I'd used Flex too for some trivial scanners, despite the fact that it doesn't play brilliantly with C++.

To answer your concerns about Spirit with my Bison experience... (I'm sure most would apply to ANTLR too)

> * Documentation

There are stacks on the internet and in paper books. Anything related to YACC is applicable too. comp.compilers users are very helpful if you truly get stuck with your grammar. In fact you can often find ANTLR/YACC/Bison grammars on the internet for common languages/formats.

> * Compile times

Fast. Set up the dependency in the project; Bison turns the .y/.yy grammar into C/C++ quickly, and you compile/link as normal. It is a matter of seconds for Bison to process a grammar of C's complexity on a modern machine.

> * Error messages

Basic errors are easy enough, but bugs in your grammar can take a bit of learning. The debug output is there, though, to find out what you did! Shift/reduce and reduce/reduce conflicts etc. are not complicated to learn about but can be confusing if you're new to e.g. LALR(1) parsing. (Sounds like you aren't, though.) As with most things there are newbie pitfalls.

> * Poor IOStream interoperability

Roll your own scanner and interface it how you like. Bison eats tokens and will reduce what it can based on the input up to that point - so it is in sync. You can even use different scanners at runtime (wide char support? no problem!). Parsing human input (e.g. a calculator) line by line, no problem.

So my advice boils down to: yes, definitely use a machine-generated parser. Don't worry if it seems overkill: if your "specifications evolve", as you suggest they might, you may save yourself a lot of effort later on. Bison/ANTLR are well established and well used, fast and reliable. I'd suggest you use one of those two (or similar) - it sounds like you already have too many question marks about Spirit for your application.

Chris
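
The "roll your own scanner" advice above can be pictured with a small pull-style scanner that hands out one token per call. This is only a sketch under stated assumptions: the token kinds, the `Token` struct and the `Scanner` class are hypothetical names, and the glue a real Bison parser would need (its own token codes and `yylex` interface) is deliberately not shown.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical token set for a small calculator-style language.
enum TokenKind { TOK_NUMBER, TOK_IDENT, TOK_PUNCT, TOK_END };

struct Token {
    TokenKind kind;
    std::string text;
};

// Pull-style scanner: each call extracts just enough characters for one
// token, relying only on the stream's one-character lookahead (peek).
class Scanner {
public:
    explicit Scanner(std::istream& in) : in_(in) {}

    Token next() {
        const int eof = std::char_traits<char>::eof();

        // Skip whitespace between tokens.
        while (in_.peek() != eof && std::isspace(in_.peek())) in_.get();

        Token tok;
        const int c = in_.peek();
        if (c == eof) {
            tok.kind = TOK_END;
        } else if (std::isdigit(c)) {
            tok.kind = TOK_NUMBER;
            while (in_.peek() != eof && std::isdigit(in_.peek()))
                tok.text += static_cast<char>(in_.get());
        } else if (std::isalpha(c) || c == '_') {
            tok.kind = TOK_IDENT;
            while (in_.peek() != eof &&
                   (std::isalnum(in_.peek()) || in_.peek() == '_'))
                tok.text += static_cast<char>(in_.get());
        } else {
            // Any other character is returned as a one-character token.
            tok.kind = TOK_PUNCT;
            tok.text += static_cast<char>(in_.get());
        }
        return tok;
    }

private:
    std::istream& in_;
};
```

Because the scanner never extracts a character it does not put into a token (or skip as whitespace), whatever follows the last token is still available to other code reading the same stream, which is one way to address the buffering worry raised in the original post. Swapping in a different scanner at runtime then amounts to giving the parser a different object with the same `next()` interface.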
From: Joe on 12 Nov 2009 13:59

On Nov 10, 3:26 pm, Richard Smith <rich...(a)ex-parrot.com> wrote:
> I would be interested to know the opinion of this newsgroup on whether
> it is sensible using Boost.Spirit in production code.

I have had the same questions but, unlike you, have not tested the library yet. I am very interested in the results of this thread.

BTW, was your review of Spirit based on version 2.x of the library, which I think is about to be released in the next version of Boost? It is supposed to be much better.

Also, have you looked at the Boost library Xpressive?

Joe
From: SeanW on 13 Nov 2009 03:42

On Nov 10, 4:26 pm, Richard Smith <rich...(a)ex-parrot.com> wrote:
> So, what is the opinion here? Is it worth pursuing Boost.Spirit?

I think Spirit is very slick, but decided against using it when I considered what would happen if I got into trouble. It's one thing to see a 100KB error message and try to sort it into one of a few categories as Jeff Flinn says above, but what if you've got to actually stick your hand in that toilet with a debugger when you have some problem in the field? I couldn't bear the thought, so went with one of the old-school parser generators.

Sean
From: CornedBee on 13 Nov 2009 03:41

On Nov 10, 10:26 pm, Richard Smith <rich...(a)ex-parrot.com> wrote:
>
> However, having experimented with Boost.Spirit a bit, I have a number
> of concerns about its appropriateness for use in production code and I
> would be interested in others' opinions.
>
> * Documentation. Given the size of the library, its documentation is
> really fairly lightweight and I've invariably found myself reading the
> code to find out how things work. Just to take two examples, where is
> it documented what characters alpha_p matches, and where is it
> documented which headers I should include to use it?

I agree about the headers, but other than that I found the docs to be quite good.

>
> * Compile times. Perhaps there are implementation techniques that I'm
> missing, but most of the non-trivial examples I've experimented with
> take serious long times to compile. In one case, over an hour for a
> single translation unit. I prefer to work with a rapid modify-
> recompile-test development cycle, but I don't see that being feasible
> if I use significant Boost.Spirit components.

If you're using GCC, upgrade to 4.4. It should have greatly increased the speed here. But yes, Spirit is a very metaprogramming-heavy library and takes a long time to compile. You should make sure that you separate Spirit parsers into their own source files, so that only those files pay the compile-time cost when the rest of the code changes.

>
> * Error messages. Introduce an error into the code and, frankly, the
> resulting verbiage emitted by the compiler is utterly impenetrable.
> This is, of course, true of many complex template libraries in C++,
> and maybe when C++ (eventually) gains concepts, it will improve. But
> it doesn't help with today's language.

Yes. It's the fate of any template library in C++.

> Secondly, it would be nice if there
> were some easy way to keep input and output in sync, if not by having
> a single function that does both (in simple cases, I've seen the %
> operator overloaded to reasonable effect to implement both << and >>),
> then by having similar-looking input and output functions that leave
> it easy to verify by eye their compatibility. Maybe that's something
> I can still build on top of Boost.Spirit, but that sounds a daunting
> dask.

Spirit 2 contains Karma and Qi, one for producing output, the other for parsing. They use extremely similar syntax specifications.

Sebastian
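
The "single function that does both" idea quoted above can be sketched without any parsing library. The following is an illustration only and is not how Spirit's Qi or Karma work: `Point`, `Writer`, `Reader` and `describe` are hypothetical names, the format is plain whitespace-separated text, and string fields are assumed to contain no whitespace.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record type used only for this illustration.
struct Point {
    int x;
    int y;
    std::string label;  // assumed to contain no whitespace
};

// Writer: `ar % field` writes the field followed by a single space.
class Writer {
public:
    explicit Writer(std::ostream& out) : out_(out) {}

    template <typename T>
    Writer& operator%(const T& value) {
        out_ << value << ' ';
        return *this;
    }

private:
    std::ostream& out_;
};

// Reader: `ar % field` extracts the field with operator>>.
class Reader {
public:
    explicit Reader(std::istream& in) : in_(in) {}

    template <typename T>
    Reader& operator%(T& value) {
        in_ >> value;
        return *this;
    }

private:
    std::istream& in_;
};

// The one description of the field order, shared by both directions.
template <typename Archive>
void describe(Archive& ar, Point& p) {
    ar % p.x % p.y % p.label;
}
```

Calling `describe` with a `Writer` produces text that `describe` with a `Reader` consumes in the same field order, so the two directions cannot drift apart when a field is added. The price is that the format must be simple enough for one operator to express both reading and writing.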