From: Jeffrey Carter on 12 Aug 2010 14:55

On 08/12/2010 12:53 AM, Ludovic Brenta wrote:
> * the parser also supports lists of the form (a b c) (more than 2
> elements) and properly translates them to (a (b c)). The Append()
> procedure that does this is also public and available to clients.

The example was something like

(TCP-something (host abc) (port 42) )

So I think the grammar is

S-expression = '(' Element {' ' Element} ')'

Element = Atom | S-expression

and you've implemented something slightly different.

-- 
Jeff Carter
"What I wouldn't give for a large sock with horse manure in it."
Annie Hall
42

From: Ludovic Brenta on 12 Aug 2010 15:59

Jeffrey Carter <spam.jrcarter.not(a)spam.not.acm.org> writes:
> On 08/12/2010 12:53 AM, Ludovic Brenta wrote:
>> * the parser also supports lists of the form (a b c) (more than 2
>> elements) and properly translates them to (a (b c)). The Append()
>> procedure that does this is also public and available to clients.
>
> The example was something like
>
> (TCP-something (host abc) (port 42) )
>
> So I think the grammar is
>
> S-expression = '(' Element {' ' Element} ')'
>
> Element = Atom | S-expression
>
> and you've implemented something slightly different.

If I understand your quasi-BNF correctly, an S-Expression in your
grammar can have only one or two, but not three elements. This
contradicts the "(TCP-something (host abc) (port 42) )" example, which
has three elements.

In reality, an S-Expression (a "cons pair" in Lisp parlance) can indeed
not have three elements; the notation (a b c) is, really, shorthand for
(a (b c)); I was keen to implement that in my parser. See

http://en.wikipedia.org/wiki/S-expression#Definition

-- 
Ludovic Brenta.

From: Natacha Kerensikova on 12 Aug 2010 16:11

On Aug 12, 8:55 pm, Jeffrey Carter
<spam.jrcarter....(a)spam.not.acm.org> wrote:
> The example was something like
>
> (TCP-something (host abc) (port 42) )
>
> So I think the grammar is
>
> S-expression = '(' Element {' ' Element} ')'
>
> Element = Atom | S-expression

Actually an S-expression list can be empty, and an S-expression can be
a single atom, or a list of atoms, without parentheses. The grammar
would be something more like:

    SP = TAB | SPACE | CR | LF -- whitespace
    S-expression = SP* (Atom | List)* SP*
    List = '(' S-expression* ')'
    Atom = Token-Atom | Quoted-String-Atom | Hexadecimal-Atom |
           Base64-Atom | Verbatim-Atom

I'm not sure the rest can be described by a grammar (especially
verbatim-encoded atoms), and that's without going into details like
brace encoding or display hints (which sucks anyway). The whitespace is
optional because atoms generally don't need separators, token-encoded
atoms being the exception (unfortunately all the atoms in my example
are token-encoded).

Hoping it helps,
Natacha

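The List and S-expression rules of the grammar above can be exercised with a small recursive reader. This is only a sketch in Python, not the Ada parser discussed in this thread, and it makes one simplifying assumption: every atom is a token atom (quoted-string, hexadecimal, base64 and verbatim atoms are not handled). The name `parse_sexp` is hypothetical.

```python
WHITESPACE = " \t\r\n"  # the SP rule: TAB | SPACE | CR | LF


def parse_sexp(text):
    """Parse token atoms and lists into nested Python lists.

    Atoms become str, lists become list. Only token atoms are
    supported; that is an assumption of this sketch.
    """
    pos = 0

    def parse_items(inside_list):
        nonlocal pos
        items = []
        while pos < len(text):
            ch = text[pos]
            if ch in WHITESPACE:
                pos += 1                      # whitespace is optional filler
            elif ch == "(":
                pos += 1
                items.append(parse_items(True))
            elif ch == ")":
                if not inside_list:
                    raise ValueError(f"unexpected ')' at offset {pos}")
                pos += 1
                return items                  # end of the current list
            else:
                start = pos                   # token atom: run of other chars
                while pos < len(text) and text[pos] not in WHITESPACE + "()":
                    pos += 1
                items.append(text[start:pos])
        if inside_list:
            raise ValueError("unexpected end of input inside a list")
        return items

    return parse_items(False)


print(parse_sexp("(tcp-connect (host foo.example) (port 80))"))
```

As the grammar allows, an empty list `()` and a bare sequence of atoms without parentheses are both accepted; a stray `)` or a missing closing parenthesis raises `ValueError`.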
From: Ludovic Brenta on 12 Aug 2010 16:22

I wrote on comp.lang.ada:
> I made some fixes and I now consider my S-Expression parser[1]
> feature-complete as of revision
> b13ccabbaf227bad264bde323138910751aa2c2b.
> There may still be some bugs though, and the error reporting (to
> diagnose syntax errors in the input) is very primitive.
>
> Highlights:
> * the procedure S_Expression.Read is a quasi-recursive descent
> parser. "Quasi" because it only recurses when encountering an opening
> parenthesis, but processes atoms without recursion, in the same finite
> state machine.
> * the parser reads each character exactly once; there is no push_back
> or backtracking involved. This makes the parser suitable to process
> standard input on the fly.
> * to achieve this, I had to resort to using exceptions instead of
> backtracking; this happens when the parser encounters a ')'
> immediately after an S-Expression (atom or list).
> * the parser also supports lists of the form (a b c) (more than 2
> elements) and properly translates them to (a (b c)). The Append()
> procedure that does this is also public and available to clients.
> * the parser does not handle incomplete input well. If it gets
> Ada.IO_Exceptions.End_Error in the middle of an S-Expression, it will
> return an incomplete, possibly empty, S-Expression rather than report
> the error. I'll try to improve that.
> * the test.adb program demonstrates how to construct an S-Expression
> tree in memory (using cons()) and then send it to a stream (using
> 'Write).
> * the test.adb program also demonstrates how to read an S-Expression
> from a stream (using 'Read) and then traverse the in-memory tree
> (using car(), cdr()).
>
> [1] http://green.ada-france.org:8081/branch/changes/org.ludovic-brenta.s_expressions
>
> I have not yet tested the parser on your proposed input (IIUC,
> consisting of two S-Expressions with a missing closing parenthesis).
> I think this will trigger the bug where End_Error in the middle of an
> S-Expression is not diagnosed.
>
> I also still need to add the proper GPLv3 license text on each file.
>
> I'll probably add support for Lisp-style comments (starting with ';'
> and ending at end of line) in the future.

As of revision b60f80fba074431aeeffd95aa273a1d4fc81bf41, I now handle
end-of-stream in all situations and (I believe) react appropriately. I
have now tested the parser against this sample input file:

$ cat test_input
(tcp-connect (host foo.bar) (port 80))

(tcp-connect ((host foo.bar) (port 80))
(tcp-connect (host foo.bar) (port 80)))
$ ./test < test_input
(tcp-connect ((host foo.example) (port 80)))
Parsing the S-Expression: (tcp-connect ((host foo.bar) (port 80)))
Writing the S-Expression: (tcp-connect ((host foo.bar) (port 80)))
Parsing the S-Expression: Exception name: TEST.SYNTAX_ERROR
Message: Expected atom with value 'host'
Writing the S-Expression: (tcp-connect ((((host foo.bar) (port 80)) (tcp-connect (host foo.bar))) (port 80)))

raised S_EXPRESSION.SYNTAX_ERROR : Found ')' at the start of an expression

The very first line of output is the result of 'Write of an
S-Expression constructed from a hardcoded TCP_Connect record.

"Parsing" refers to the high-level part of the parsing that traverses
the in-memory S-Expression tree and converts it to a TCP_Connect_T
record.

"Writing" refers to both halves of the low-level parsing: reading the
character stream, producing the in-memory S-Expression tree, and
converting it back to a character stream.

The TEST.SYNTAX_ERROR is because the high-level parser found a list
instead of the expected atom "host"; this is because of the extra '('
before "host" at line 3 in the input.

The S_EXPRESSION.SYNTAX_ERROR is because the low-level parser found an
extra ')' at the very end of line 4 in the input; it coalesced lines 3
and 4 into a single, valid, S-Expression, and was expecting a new
S-Expression starting with '('.

-- 
Ludovic Brenta.

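The coalescing of the last two input lines into one list can be checked by counting parentheses. The following is a minimal Python sketch, not the Ada parser from this post; `top_level_spans` is a hypothetical helper, and `SAMPLE` holds only the three non-empty lines of the sample input above. It makes no attempt to model how the Ada parser groups list elements, so it does not reproduce the S_EXPRESSION.SYNTAX_ERROR shown above.

```python
def top_level_spans(text):
    """Split text into balanced top-level lists by counting parentheses.

    Returns (spans, error): spans is a list of whitespace-normalized
    strings, error is None or a short message.
    """
    spans, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i                 # a new top-level list begins
            depth += 1
        elif ch == ")":
            if depth == 0:
                return spans, f"found ')' at offset {i} outside any list"
            depth -= 1
            if depth == 0:                # the top-level list is complete
                spans.append(" ".join(text[start:i + 1].split()))
    if depth > 0:
        return spans, "unexpected end of input inside a list"
    return spans, None


SAMPLE = (
    "(tcp-connect (host foo.bar) (port 80))\n"
    "(tcp-connect ((host foo.bar) (port 80))\n"
    "(tcp-connect (host foo.bar) (port 80)))\n"
)

spans, error = top_level_spans(SAMPLE)
for span in spans:
    print(span)
print("error:", error)
```

By plain counting, the extra '(' on the second of these lines is closed by the extra ')' at the end of the third, so the sketch reports two top-level lists, the second one spanning both lines.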
From: Natacha Kerensikova on 12 Aug 2010 16:23

On Aug 12, 9:59 pm, Ludovic Brenta <ludo...(a)ludovic-brenta.org> wrote:
> In reality, an S-Expression (a "cons pair" in Lisp parlance) can indeed
> not have three elements; the notation (a b c) is, really, shorthand for
> (a (b c)); I was keen to implement that in my parser. See
>
> http://en.wikipedia.org/wiki/S-expression#Definition

According to that page, cons pairs are noted with an extra dot, like
(x . y), right?

So (a b c) would be (a . (b . (c . nil))), right?

Then my example would be:

(tcp-connect . ((host . (foo.example . nil)) . ((port . (80 . nil)) . nil)))

For the record, I'm just discovering the concept of cons pairs, I based
my previous S-expression work only on Rivest's proposed standard
http://people.csail.mit.edu/rivest/Sexp.txt

Hoping this helps,
Natacha
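
The expansion of list notation into dotted cons pairs described in this post can be written as a short recursive function. This is a minimal Python sketch under the assumption that a list is represented as a nested Python list of strings; `to_dotted` is a hypothetical name and is not part of any parser mentioned in the thread.

```python
def to_dotted(node):
    """Render a nested list as fully dotted cons-pair notation.

    An atom (str) is returned unchanged. A list (x1 x2 ... xn) becomes
    (x1 . (x2 . ( ... (xn . nil)))), built from the tail backwards.
    """
    if isinstance(node, str):
        return node
    result = "nil"                        # the empty list terminates the chain
    for item in reversed(node):
        result = f"({to_dotted(item)} . {result})"
    return result


example = ["tcp-connect", ["host", "foo.example"], ["port", "80"]]
print(to_dotted(["a", "b", "c"]))
# prints: (a . (b . (c . nil)))
print(to_dotted(example))
# prints: (tcp-connect . ((host . (foo.example . nil)) . ((port . (80 . nil)) . nil)))
```

Both outputs match the dotted forms written out by hand in the post.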