From: Ben Bacarisse on
Ben Pfaff <blp(a)cs.stanford.edu> writes:

> "Dmitry A. Kazakov" <mailbox(a)dmitry-kazakov.de> writes:
>
>> For example, regular expressions are quite a weak class of
>> languages, incapable of recognizing simple bracket constructs,
>> and very difficult and uncomfortable to use. Nevertheless they
>> are basically the only pattern language known, except for
>> wildcard patterns. The far better and simpler SNOBOL patterns
>> are forgotten.
>
> That's very interesting. I have never heard of SNOBOL patterns
> before.

They are very powerful but a little "operational" in that some of the
time you feel you are telling the matcher what to _do_ rather than
what to match. None the less, SNOBOL left a deep impression on me. I
recall an implementation of Russell's paradox in SNOBOL: a pattern, R,
that matches only those patterns that don't match themselves. You
then ask if R matches R!

Have you come across Icon[1]? Designed, at least in part, by Griswold
of SNOBOL fame. It has some very elegant ideas. It is a shame it is
not more widely known.

[1] http://www.cs.arizona.edu/icon/

<snip>
--
Ben.
From: Ben Pfaff on
Ben Bacarisse <ben.usenet(a)bsb.me.uk> writes:

> Ben Pfaff <blp(a)cs.stanford.edu> writes:
>
>> "Dmitry A. Kazakov" <mailbox(a)dmitry-kazakov.de> writes:
>>
>>> For example, regular expressions are quite a weak class of
>>> languages, incapable of recognizing simple bracket constructs,
>>> and very difficult and uncomfortable to use. Nevertheless they
>>> are basically the only pattern language known, except for
>>> wildcard patterns. The far better and simpler SNOBOL patterns
>>> are forgotten.
>>
>> That's very interesting. I have never heard of SNOBOL patterns
>> before.
>
> They are very powerful but a little "operational" in that some of the
> time you feel you are telling the matcher what to _do_ rather than
> what to match. None the less, SNOBOL left a deep impression on me. I
> recall an implementation of Russell's paradox in SNOBOL: a pattern, R,
> that matches only those patterns that don't match themselves. You
> then ask if R matches R!

I looked around the web for a while and found a few superficial
descriptions of SNOBOL patterns. I also found a few SNOBOL
reference manuals, but again their descriptions of patterns were
very brief and I found it difficult to figure out how they were
used and why exactly they were so powerful.

Can you give a few examples?

> Have you come across Icon[1]? Designed, at least in part, by Griswold
> of SNOBOL fame. It has some very elegant ideas. It is a shame it is
> not more widely known.
>
> [1] http://www.cs.arizona.edu/icon/

I'm scanning through the book now:
http://www.cs.arizona.edu/icon/ftp/doc/lb1up.pdf
--
"...dans ce pays-ci il est bon de tuer de temps en temps un amiral
pour encourager les autres."
--Voltaire, _Candide_
From: Rod Pemberton on
"Ben Bacarisse" <ben.usenet(a)bsb.me.uk> wrote in message
news:0.0d2ddf5bbc876993d58c.20091105182024GMT.877hu43kbb.fsf(a)bsb.me.uk...
> Have you come across Icon[1]? Designed, at least in part, by Griswold
> of SNOBOL fame. It has some very elegant ideas. It is a shame it is
> not more widely known.
>
> [1] http://www.cs.arizona.edu/icon/
>

Icon was used to write the GBURG parser, described by Chris Fraser and
Todd Proebsting in "Finite-State Code Generation". One or both of them,
and/or David Hanson, worked on the similarly named parsers BURG and IBURG.
Chris Fraser and David Hanson are the ones who created the LCC C compiler.


Rod Pemberton


From: Ben Bacarisse on
Ben Pfaff <blp(a)cs.stanford.edu> writes:

> Ben Bacarisse <ben.usenet(a)bsb.me.uk> writes:
>
>> Ben Pfaff <blp(a)cs.stanford.edu> writes:
>>
>>> "Dmitry A. Kazakov" <mailbox(a)dmitry-kazakov.de> writes:
>>>
>>>> For example, regular expressions are quite a weak class of
>>>> languages, incapable of recognizing simple bracket constructs,
>>>> and very difficult and uncomfortable to use. Nevertheless they
>>>> are basically the only pattern language known, except for
>>>> wildcard patterns. The far better and simpler SNOBOL patterns
>>>> are forgotten.
>>>
>>> That's very interesting. I have never heard of SNOBOL patterns
>>> before.
>>
>> They are very powerful but a little "operational" in that some of the
>> time you feel you are telling the matcher what to _do_ rather than
>> what to match. None the less, SNOBOL left a deep impression on me. I
>> recall an implementation of Russell's paradox in SNOBOL: a pattern, R,
>> that matches only those patterns that don't match themselves. You
>> then ask if R matches R!
>
> I looked around the web for a while and found a few superficial
> descriptions of SNOBOL patterns. I also found a few SNOBOL
> reference manuals, but again their descriptions of patterns were
> very brief and I found it difficult to figure out how they were
> used and why exactly they were so powerful.
>
> Can you give a few examples?

The main feature is that patterns are built from pattern primitives
using operators. Patterns are named so you can write:

DIGITS = SPAN('0123456789')
NUM = DIGITS | ( '+' | '-' ) DIGITS

The name is substituted with its value immediately unless you use what
is called an unevaluated expression:

P = *P 'a' *P | 'z'

matches the strings 'z', 'zaz', 'zazaz' and so on. The classic
example is that you can match strings with balanced parentheses:

PAIRED = NOTANY('()') | '(' ARBNO(*PAIRED) ')'
BALANCED = PAIRED ARBNO(PAIRED)

NOTANY matches any single character that is not one of the characters
in its string argument ([^()] as a regular expression) and ARBNO
matches an arbitrary number of repetitions of a pattern (* in the more
common RE notation).
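
For readers without a SNOBOL system to hand, here is a rough Python
sketch of what PAIRED and BALANCED recognize. It is not how SNOBOL's
scanner works: the function names match_paired and is_balanced are made
up for illustration, the match is anchored to the whole string (SNOBOL
matches are unanchored by default), and the loops are greedy rather
than backtracking, which happens to be enough for this grammar.

```python
def match_paired(s, i):
    """Return the index just past one PAIRED item starting at s[i], or -1."""
    if i >= len(s):
        return -1
    if s[i] not in "()":
        # NOTANY('()'): a single character that is not a parenthesis
        return i + 1
    if s[i] == "(":
        # '(' ARBNO(*PAIRED) ')': consume nested items until none is left
        j = i + 1
        while True:
            k = match_paired(s, j)
            if k < 0:
                break
            j = k
        if j < len(s) and s[j] == ")":
            return j + 1
    return -1


def is_balanced(s):
    """BALANCED = PAIRED ARBNO(PAIRED), here required to span all of s."""
    i = match_paired(s, 0)
    if i < 0:
        return False
    while i < len(s):
        i = match_paired(s, i)
        if i < 0:
            return False
    return True
```

With this sketch, is_balanced("a(b(c)d)e") is true, while
is_balanced("(()") and is_balanced(")(") are false.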

My comment about it being rather operational comes from the fact that
there are primitives that interact directly with the scanning
algorithm. For example, FAIL always fails to match, causing the
scanner to backtrack if it can. A lot of the power comes from
assignments done mid-match, using the $ operator.

For example,

LEN(1) $ IT *IT

matches any repeated character. LEN(1) matches any single character
and $ IT assigns the matched character to the variable IT at that
point in the scan. *IT refers to the character stored, thereby
matching only double letters.

Output is done by assigning to the variable OUTPUT, so to illustrate
FAIL one could write:

'MISSISSIPPI' (LEN(1) $ X *X) $ OUTPUT FAIL

to print:

SS
SS
PP

Without the FAIL, only the first pair would be printed since the
pattern would have succeeded.
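
The effect of that statement can be imitated in Python, as a sketch
only. The helper name doubled_pairs is invented here, and the loop
merely mimics what the unanchored scan plus FAIL ends up doing for this
particular pattern; it is not a general model of SNOBOL backtracking.

```python
def doubled_pairs(subject):
    """Collect every two-character match where one character repeats."""
    found = []
    # The unanchored scan tries each starting position in turn.
    for start in range(len(subject) - 1):
        x = subject[start]                 # LEN(1) $ X
        if subject[start + 1] == x:        # *X must match the same character
            found.append(subject[start:start + 2])   # ... $ OUTPUT
        # FAIL: the match is never accepted, so the scan keeps going
    return found


print("\n".join(doubled_pairs("MISSISSIPPI")))
```

Dropping the FAIL corresponds to returning from the loop at the first
append, which is why only the first SS would be printed.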

--
Ben.
From: James Dow Allen on
On Nov 6, 1:36 am, Ben Pfaff <b...(a)cs.stanford.edu> wrote:
> I looked around the web for a while and found a few superficial
> descriptions of SNOBOL patterns....
>
> Can you give a few examples?

Another nice feature of the SNOBOL programming language
is that the string matching a subpattern can be saved in
a variable. IIRC,
A B . X C
is the same as
A B C
but the string matching the subpattern B is saved in X.

I've often wanted to use this feature when sed'ing, e.g.
sed "sqhas [0-9]* wivesqwill have N happy widowsq"
where "N" is the matching [0-9]* remembered.
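
For comparison, regular-expression tools that support capture groups
can express this kind of substitution. A small Python sketch follows;
the function name widow_forecast is invented for illustration, and the
pattern is the one from the sed line above with parentheses added
around [0-9]* so the digits can be recalled as \1.

```python
import re


def widow_forecast(text):
    # ([0-9]*) captures the digits; \1 in the replacement recalls them.
    return re.sub(r"has ([0-9]*) wives", r"will have \1 happy widows", text)


print(widow_forecast("He has 3 wives."))
```

POSIX sed offers the same idea through \( ... \) groups in the pattern
and \1 in the replacement.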

James Dow Allen