From: Lou on

"Jonathan Goldberg" <jgoldberg(a)BIOMEDSYS.COM> wrote in message
news:201001051736.o05Gm3oA007401(a)malibu.cc.uga.edu...
> Well, I can't complain that I didn't get responses to my query. True,
> most of them were a bit snippy...
>
> The messages mostly were about how big an investment a piledriver is when
> all I'm looking for is a hammer. Our situation is relatively simple; we
> don't have complicated normalization schemes or most other possible
> complications.
>
> The idea of a tool is to put data cleaning as much as possible *in the
> hands and under the control of the people who know the data*. Who are
> also the people who will deal with any problems found. The need to
> involve programmers slows down the projects and increases costs.

Pardon me for releasing a brief giggle. If involving a programmer slows
things down and increases costs, then don't involve a programmer.


From: Michael Raithel on
Dear SAS-L-ers,

Francois van der Walt posted the following:

> Dear Jonathan and SAS-L (ers)
>
> Data cleaning is certainly important and the value of clean data is often
> underestimated. Interestingly it is for us (GJI) often the easiest service
> to sell.
>
> The biggest bang for buck that we use in the data cleaning process and that
> I can recommend as an excellent starting point is Characterise under
> Enterprise Guide. I am sure it used to be developed as macros, and some
> SAS-Lers will be able to refer you to them. (If you do not have Enterprise
> Guide available let me know and I will provide you with an extract of the
> macros.)
>
> Characterise provides a frequency analysis for all alpha fields (top 30 by
> default) that we use to quickly identify problems like blank fields or lots
> of "N/A", "TBA", "TEST", "HJKL" etc. in fields. We ask business owners to
> identify the valid versus invalid values in an extracted spreadsheet. We
> also use it to generate a translation table that, for example, translates
> the Australian state "Victoria", "VIC.", "V.I.C." etc. to a consistent "VIC".
>
> For numeric fields Characterise provides the number of missing values,
> averages, maximums, minimums etc.
>
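
For anyone without Enterprise Guide, the kind of profiling Francois describes can be roughly approximated with Base SAS procedures. The sketch below is only an illustration under stated assumptions: it is not the code behind the Characterize Data task, the data set WORK.CUSTOMERS, its variables, and the format name $STATEFIX are all hypothetical, and the "top 30" limit is not reproduced.

```sas
/* Hypothetical sample data; names and values are illustrative only. */
data work.customers;
  infile datalines dsd truncover;
  input state :$20. status :$10. balance;
  datalines;
Victoria,ACTIVE,120.50
VIC.,N/A,.
V.I.C.,TEST,75
VIC,ACTIVE,310.25
,TBA,42
;
run;

/* Frequency analysis of every character variable, most common values first. */
/* MISSING counts blank values as a level so they show up in the table.      */
proc freq data=work.customers order=freq;
  tables _character_ / missing nocum;
run;

/* Count, number missing, mean, minimum and maximum for every numeric variable. */
proc means data=work.customers n nmiss mean min max;
  var _numeric_;
run;

/* A translation table held as a character format (assumed name $STATEFIX). */
/* Values not listed pass through unchanged; matching is case-sensitive.    */
proc format;
  value $statefix
    'Victoria', 'VIC.', 'V.I.C.' = 'VIC';
run;

data work.customers_clean;
  set work.customers;
  /* Width 20 is given explicitly so unlisted values are not cut to 3 characters. */
  state = put(state, $statefix20.);
run;
```

Keeping the translation table in a format means the people who know the data can review and extend the list of variants without touching the DATA step that applies it.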

Francois, G'day; great post!

I am totally intrigued by the "Characterize Data" task of SAS/Enterprise Guide that you mentioned, above. However, a 15-minute search of the SAS Institute web site only garnered me a paltry amount of information about it--mostly descriptions of only a few words. E.g.

Characterize Data - Enables the Characterize Data task.

Characterize Data - Enables the Characterize Data task.

Could you or anybody else point me to the mother lode of online information on the Characterize Data task for SAS/Enterprise Guide? (Please, please, please don't go looking if you don't already know where it is. I don't want more than one of us wasting his/her time on a snipe hunt)!

Oh, and I can assure everyone that this is not a homework assignment... yet:-)

Francois, best of luck in all of your SAS endeavors!


I hope that this suggestion proves helpful now, and in the future!

Of course, all of these opinions and insights are my own, and do not reflect those of my organization or my associates. All SAS code and/or methodologies specified in this posting are for illustrative purposes only and no warranty is stated or implied as to their accuracy or applicability. People deciding to use information in this posting do so at their own risk.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Michael A. Raithel
"The man who wrote the book on performance"
E-mail: MichaelRaithel(a)westat.com

Author: Tuning SAS Applications in the MVS Environment

Author: Tuning SAS Applications in the OS/390 and z/OS Environments, Second Edition
http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=58172

Author: The Complete Guide to SAS Indexes
http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=60409

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
SAS is the leader in business analytics software and services,
and the largest independent vendor in the business intelligence
market. - SAS Institute Web Site
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
From: Mike Zdeb on
hi ...

"CHARACTERIZE DATA" ...

data x;
  gender = 1;                   /* numeric variable */
  c_gender = put(gender, 1.);   /* PUT applies the 1. format, giving the character value '1' */
run;


"NUMERICIZE DATA" ...

data x;
  gender = '1';                   /* character variable */
  n_gender = input(gender, 1.);   /* INPUT applies the 1. informat, giving the number 1 */
run;


--
Mike Zdeb
U(a)Albany School of Public Health
One University Place
Rensselaer, New York 12144-3456
P/518-402-6479 F/630-604-1475

From: Nathaniel Wooding on
Miiikkkkkkkeee is guilty of using slang!!!!!!

For non-native English speakers, "snipe hunt" refers to a futile quest. Specifically, you take a bunch of unsuspecting kids who are on, say, a camping trip, and tell some wild stories about a very tame bird that lives in the area and which is easy to catch with a bucket or large bag. Then after dark, you scatter the kids out in a patch of woods and leave them for an hour or two. It is sort of a rite of passage.

And yes, I, too, was so suckered.

Nat Wooding (who just used slang, himself)

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Michael Raithel
Sent: Wednesday, January 06, 2010 1:42 PM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Re: Data Validation/Cleansing Tool Query
