From: Tan on
I have a quick question. I am hoping someone has an answer to help me,
since I can't seem to find an answer in any textbooks or websites.

So basically, I have a dataset with a variable like this:

Andover town, Tolland County
North Canaan town, Litchfield County,

So what I had in mind was:

Town1= scan(Town,1);

But this would only grab the first word, and some towns have two words
(like North Canaan above). But if I set it to 2, then towns with one
word would be "Andover town". Is there a way to tell the scan function
to grab whatever comes before the word "town"? Then I could use the
trim function to delete the space between the town name and "town".

Thanks
From: Reeza on
On Apr 13, 1:09 pm, Tan <tan.p.p...(a)gmail.com> wrote:
> Is there a way to specify the scan
> function to grab whatever is before the word town?

It looks like it's comma-delimited, so try scan(Town,1,",").

If it's not comma-delimited, search for the first instance of " town"
and then take the substring that ends just before it:

substr(town, 1, find(town, " town", "i") - 1). Keep an eye on the case;
the "i" modifier makes the search case-insensitive.
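To make the two suggestions concrete, here is a minimal sketch of both in one DATA step. The dataset name towns, the variable lengths, and the helper variables Town1, Town2 and pos are assumptions for illustration only. Note that the plain comma scan still leaves the word "town" attached, and that searching for " town" would also hit longer words such as "township" if the data contained them.

```sas
data towns;
  length Town $60 Town1 Town2 $40;   /* lengths are assumed, adjust to the real data */
  infile datalines truncover;
  input Town $60.;                   /* read the whole line, commas and blanks included */

  /* Option 1: everything before the first comma, e.g. "Andover town" */
  Town1 = scan(Town, 1, ',');

  /* Option 2: everything before the first " town", ignoring case */
  pos = find(Town, ' town', 'i');
  if pos > 1 then Town2 = substr(Town, 1, pos - 1);

  drop pos;
datalines;
Andover town, Tolland County
North Canaan town, Litchfield County,
;
run;
```

Searching for " town" with the leading blank, rather than "town" alone, keeps a name that itself ends in "town" from matching too early.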

I'm sure you can do it in one step using regular expressions, but I
don't know how to do that :)

HTH,
Reeza
From: Barry Schwarz on
Use INDEXW to find the first occurrence of the word "town" in the
variable town. The substring you are interested in runs from position 1
to the return value - 2. If the return value is 0, use INDEX to find
the comma and then extract the substring up to it.

On Tue, 13 Apr 2010 13:09:22 -0700 (PDT), Tan <tan.p.pham(a)gmail.com>
wrote:

>Is there a way to specify the scan
>function to grab whatever is before the word town?

--
Remove del for email
From: Patrick on
scan() is fine. Just pass ',' as the third argument (the delimiter), as
Reeza suggests.

Here is the documentation link:

http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000214639.htm
From: Richard A. DeVenezia on
On Apr 13, 4:09 pm, Tan <tan.p.p...(a)gmail.com> wrote:
> Is there a way to specify the scan
> function to grab whatever is before the word town?

Regular expressions are a powerful tool for processing text, and they
are available in SAS.

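data foo;
  input;                        /* read each line into the _infile_ buffer */
  myVariable = _infile_;
datalines;
Andover town, Tolland County
North Canaan town, Litchfield County,
;
run;

data foofoo;
  set foo;
  retain prxid;

  /* Compile the pattern once; capture group 1 is the text before "town," */
  if _n_ = 1 then
    prxid = prxParse('/\s*(.*)\s*town\s*,/i');

  /* On a match, copy capture group 1 into townname */
  if prxMatch(prxid, myVariable) then
    townname = prxPosN(prxid, 1, myVariable);

  drop prxid;
run;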
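For a one-step variant, the same idea can probably be written as a single PRXCHANGE call. This is only a sketch: the dataset name foofoo2, the length of townname, and the slightly stricter pattern (a lazy capture plus at least one blank before "town") are assumptions, not part of the code above.

```sas
data foofoo2;                      /* hypothetical output dataset name */
  set foo;
  length townname $40;             /* assumed length */

  /* Replace the whole line with capture group 1, the text before " town,".
     If the pattern does not match, the line is returned unchanged. */
  townname = prxchange('s/^\s*(.*?)\s+town\s*,.*$/$1/i', 1, myVariable);
run;
```

Because an unmatched line comes back unchanged, it may be worth checking with prxMatch first if some rows lack the word "town".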


--
Richard A. DeVenezia
http://www.devenezia.com