From: Tan on 13 Apr 2010 16:09

I have a quick question. I am hoping someone has an answer, since I can't seem to find one in any textbook or website.

I have a dataset with a variable like this:

Andover town, Tolland County
North Canaan town, Litchfield County,

What I had in mind was:

Town1 = scan(Town, 1);

But this grabs only the first word, and some towns have two words (like North Canaan above). If I set it to 2, then towns with one word would come out as "Andover town". Is there a way to tell the scan function to grab whatever comes before the word "town"? Then I can use the trim function to delete the space between the town name and "town".

Thanks
From: Reeza on 13 Apr 2010 22:59

It looks like it's comma delimited, so try scan(Town, 1, ","). If it's not comma delimited, search for the first instance of " town" and take the substring that ends just before it:

substr(Town, 1, find(Town, " town", "i") - 1)

Keep an eye on the case; the "i" modifier makes the search case-insensitive. I'm sure you can do it in one step using regular expressions, but I don't know how to do that :)

HTH,
Reeza
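For reference, the two suggestions in this reply can be tried side by side in one DATA step. This is only a minimal sketch: the variable name Town comes from the question, while the dataset name towns and the output names TownComma and TownFind are made up for illustration.

```sas
data towns;
  length Town $60 TownComma TownFind $40;
  infile datalines truncover;
  input Town $60.;

  /* Everything before the first comma; note this still ends in " town". */
  TownComma = scan(Town, 1, ',');

  /* Position of " town" (leading blank included); 'i' ignores case. */
  pos = find(Town, ' town', 'i');
  if pos > 1 then TownFind = substr(Town, 1, pos - 1);
  else TownFind = TownComma;  /* fallback when " town" is not found */

  drop pos;
  datalines;
Andover town, Tolland County
North Canaan town, Litchfield County,
;
run;
```

The comma-delimited scan keeps the trailing word "town", so the find/substr variant is probably closer to what was asked for. The pos > 1 guard is there because substr complains when its length argument is zero or negative.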
From: Barry Schwarz on 14 Apr 2010 00:08

Use INDEXW to find the first occurrence of "town" in the variable Town. The substring you are interested in runs from position 1 to the return value - 2. If the return value is 0, use INDEX to find the comma and then extract the substring.
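The INDEXW approach might look roughly like the sketch below. The names towns, Town1, pos and comma are illustrative. One assumption worth flagging: as far as I know, INDEXW treats only blanks as word delimiters by default, so the comma right after "town" would keep a plain indexw(Town, 'town') from matching. The sketch therefore passes ' ,' as the optional third argument, which assumes a SAS release that supports it.

```sas
data towns;
  length Town $60 Town1 $40;
  infile datalines truncover;
  input Town $60.;

  /* Blank and comma both count as word delimiters, so "town," matches. */
  pos = indexw(Town, 'town', ' ,');

  if pos > 2 then
    Town1 = substr(Town, 1, pos - 2);   /* stop before the blank preceding "town" */
  else do;
    /* No standalone word "town": fall back to the text before the comma. */
    comma = index(Town, ',');
    if comma > 1 then Town1 = substr(Town, 1, comma - 1);
    else Town1 = Town;
  end;

  drop pos comma;
  datalines;
Andover town, Tolland County
North Canaan town, Litchfield County,
;
run;
```

INDEXW is case-sensitive, so values such as "Town" would take the comma fallback unless the source is lowercased first.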
From: Patrick on 14 Apr 2010 05:56

scan() is fine. Just pass ',' as the third argument (the delimiter), as Reeza suggests. Here is the link to the documentation: http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000214639.htm
From: Richard A. DeVenezia on 14 Apr 2010 15:56

Regular expressions are powerful tools for processing text, and they are available in SAS.

  data foo;
    input;
    myVariable = _infile_;
  datalines;
Andover town, Tolland County
North Canaan town, Litchfield County,
  run;

  data foofoo;
    set foo;
    retain prxid;
    /* compile the pattern once, on the first observation */
    if _n_ = 1 then prxid = prxParse('/\s*(.*)\s*town\s*,/i');
    /* capture buffer 1 holds whatever precedes "town," */
    if prxMatch(prxid, myVariable) then
      townname = prxPosN(prxid, 1, myVariable);
    drop prxid;
  run;

--
Richard A. DeVenezia
http://www.devenezia.com
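A one-step variant of the same idea is possible with PRXCHANGE, which applies a substitution and returns the result directly. This is a hedged sketch rather than the poster's method: the pattern is a slightly modified one (a lazy capture and a required blank before "town"), and the names towns and TownName are made up for illustration.

```sas
data towns;
  length Town $60 TownName $40;
  infile datalines truncover;
  input Town $60.;

  /* Replace the whole value with capture group 1, the text before " town,". */
  /* If the pattern does not match, PRXCHANGE returns the value unchanged.   */
  TownName = prxchange('s/^\s*(.*?)\s+town\s*,.*$/$1/i', 1, Town);

  datalines;
Andover town, Tolland County
North Canaan town, Litchfield County,
;
run;
```

The lazy (.*?) keeps the blank before "town" out of the captured name, and the trailing .*$ consumes the county so that only the town name remains.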