From: proton on
On Jun 29, 4:20 pm, p...(a)informatimago.com (Pascal J. Bourguignon)
wrote:
> proton <leosara...(a)gmail.com> writes:
> > On Jun 29, 1:11 pm, Tamas K Papp <tkp...(a)gmail.com> wrote:
> >> On Tue, 29 Jun 2010 02:54:13 -0700, proton wrote:
> >> > Hi all,
>
> >> > I am trying to read a text file using read-line within a loop. In the
> >> > file there is a line with one single word followed by a colon,
> >> > "follows:".
> >> > When the program reaches that line, instead of processing it as the
> >> > rest, it takes it as the definition of a package. Then, the next line it
> >> > returns an error "Unknown follows: package".
>
> >> If you are using just read-line in a loop, then I find that behavior
> >> surprising, as read-line does no parsing (other than detecting newlines).
>
> >> > Is there a way to prevent this and treat the line with the colon as a
> >> > normal string?
>
> >> Maybe you should post the code so we could see what's going on.
>
> > My apologies, I forgot to mention that after a read-line, I do a
> > read-from-string. That's where the error occurs.
>
> What would you expect then?
>
> Why would you want to use READ-FROM-STRING, if your string doesn't
> contain a lisp form?

I see the problem. The reason why I am doing read-from-string is
because I want to get the text word by word, but I understand now that
it parses each word as Lisp input. Any ideas on how to read word by
word without too much fuss, and without interpreting the words?
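
To make that concrete, here is a minimal sketch of what READ-FROM-STRING does with plain words. TRY-READ is a name made up for this illustration. How a token with a trailing colon such as "follows:" is treated is not pinned down by the standard, as far as I can tell; implementations commonly signal an error, so the helper just traps any error.

```lisp
;; Illustration only: TRY-READ is a hypothetical helper, not part of any library.
(defun try-read (text)
  "Return the object the Lisp reader builds from TEXT,
or :READ-FAILED if reading signals an error."
  (handler-case (values (read-from-string text))
    (error () :read-failed)))

(try-read "hello")     ; the reader builds a symbol, not a string
(try-read "42")        ; the reader builds an integer
(try-read "follows:")  ; token with a package marker; commonly an error
```

So the words come back as Lisp objects (symbols, numbers, ...) rather than as the text that was in the file.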

Thanks a lot for your help.

This newbie.
From: vanekl on
proton wrote:
> I see the problem. The reason why I am doing read-from-string is
> because I want to get the text word by word, but I understand now that
> it parses each word as Lisp input. Any ideas on how to read word by
> word without too much fuss, and without interpreting the words?

There are about ten gazillion different methods. This is one of them,
using the SPLIT-SEQUENCE library:

(split-sequence:split-sequence #\Space string :remove-empty-subseqs t)

>
> Thanks a lot for your help.
>
> This newbie.


From: Joshua Taylor on
On 2010.06.29 10:36 AM, proton wrote:
> I see the problem. The reason why I am doing read-from-string is
> because I want to get the text word by word, but I understand now that
> it parses each word as Lisp input. Any ideas on how to read word by
> word without too much fuss, and without interpreting the words?
>
> Thanks a lot for your help.
>
> This newbie.

You'll need to specify what constitutes a word. For instance, in your
case, when "follows:" appears in a line, is "follows" the word? Is
"follows:" the word? Are words sequences of alphanumeric characters?
Sequences of non-whitespace characters? And so on. But if you've got
a specification of what a word is (and it's not too complicated) it
sounds like you want to split the line into words separated by non-word
characters. You might take a look at SPLIT-SEQUENCE [1], which does just
that. Some people might use CL-PPCRE which provides a SPLIT [2]
function (though you probably don't need the full power of regular
expressions here). Some implementations also provide similar or
equivalent functionality. For instance, LispWorks provides
LISPWORKS:SPLIT-SEQUENCE.
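
If you would rather not pull in a library, the same idea can be sketched in standard Common Lisp: pass in a predicate that says which characters are not part of a word. SPLIT-IF and WHITESPACE-CHAR-P are names invented for this sketch, not functions from any of the libraries above.

```lisp
;; SPLIT-IF and WHITESPACE-CHAR-P are hypothetical names chosen for this sketch.
(defun whitespace-char-p (char)
  "True if CHAR is a space, tab, or newline."
  (member char '(#\Space #\Tab #\Newline)))

(defun split-if (delimiterp string)
  "Return the maximal runs of characters in STRING that do not satisfy
DELIMITERP, as a list of fresh strings. Empty runs are dropped."
  (let ((words '())
        (start (position-if-not delimiterp string)))
    (loop while start
          do (let ((stop (or (position-if delimiterp string :start start)
                             (length string))))
               (push (subseq string start stop) words)
               ;; Skip the delimiters that follow the word just collected.
               (setf start (position-if-not delimiterp string :start stop))))
    (nreverse words)))

;; Words as runs of non-whitespace characters: the colon stays attached.
(split-if #'whitespace-char-p "as follows: one two")
;; => ("as" "follows:" "one" "two")

;; Words as runs of alphanumeric characters: the colon is dropped.
(split-if (complement #'alphanumericp) "as follows: one two")
;; => ("as" "follows" "one" "two")
```

Which predicate you pass is exactly the "what is a word" decision.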

//JT

[1] http://www.cliki.net/SPLIT-SEQUENCE
[2] http://weitz.de/cl-ppcre/#split
From: Teemu Likonen on
* 2010-06-29 07:36 (-0700), proton wrote:

> I see the problem. The reason why I am doing read-from-string is
> because I want to get the text word by word, but I understand now that
> it parses each word as Lisp input. Any ideas on how to read word by
> word without too much fuss, and without interpreting the words?

If your word splitting means just tokens separated by spaces, tabs or
newlines then this function may do:

(defun split-string-ws (string)
  (loop
     for pos upfrom 0
     with len = (length string)
     with separators = '(#\Space #\Tab #\Newline)
     until (>= pos len)
     unless (member (elt string pos) separators)
     collect (loop
                with start = pos
                if (or (>= (incf pos) len)
                       (member (elt string pos) separators))
                return (subseq string start pos))))
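
Another option is to skip the line-splitting step and pull words straight off the stream with READ-CHAR, so the Lisp reader never sees the text. This is only a sketch using the same three separators as above; READ-WORD, READ-WORDS and WHITESPACE-CHAR-P are names made up for it, and "input.txt" in the comment is a placeholder file name.

```lisp
;; READ-WORD, READ-WORDS and WHITESPACE-CHAR-P are hypothetical names.
(defun whitespace-char-p (char)
  "True if CHAR is a space, tab, or newline."
  (member char '(#\Space #\Tab #\Newline)))

(defun read-word (stream)
  "Return the next whitespace-delimited word from STREAM as a string,
or NIL when no word remains."
  ;; Skip leading whitespace; CHAR ends up as the first word character or NIL.
  (let ((char (loop for c = (read-char stream nil)
                    while (and c (whitespace-char-p c))
                    finally (return c))))
    (when char
      (with-output-to-string (out)
        ;; Copy characters until whitespace or end of input.
        (loop while (and char (not (whitespace-char-p char)))
              do (write-char char out)
                 (setf char (read-char stream nil)))))))

(defun read-words (stream)
  "Return a list of all remaining words in STREAM."
  (loop for word = (read-word stream)
        while word
        collect word))

;; With a file it would be used like this (placeholder file name):
;;   (with-open-file (in "input.txt") (read-words in))
(with-input-from-string (in (format nil "The list is as~%follows:~%  one two~%"))
  (read-words in))
;; => ("The" "list" "is" "as" "follows:" "one" "two")
```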