From: Arthur Tabachneck on
Jonathan,

I have to both agree and disagree with my colleagues; however, for the most
part, I agree with everything they said.

Data validation is FAR from being trivial and, at the risk of offending some
of my colleagues, shouldn't be left solely to the responsibility of
programmers.

Sure, you can write or buy routines for doing many of the tasks, but a lot
of validity checks require business knowledge that programmers might not
have and often require the talents of staff whose salaries are even higher
(believe it or not, SAS programmers are not necessarily the highest-paid
employees in some organizations).

Are correct codes used? Are entries reasonable and consistent? Can
anomalies or unexpected patterns be identified over time?

Those questions are all components of data validity and can require anything
from running and reviewing the results of a simple algorithm to comparing
differences between statistical models based on samples from the data.

Who should do the work, I think, depends upon which specific task is being
done, the available staff, and the skills required.

Art
-------
On Mon, 4 Jan 2010 15:43:29 -0500, Jonathan Goldberg
<jgoldberg(a)BIOMEDSYS.COM> wrote:

>We are currently using SAS to do validation. We write programs to check
>things like ranges, all fields present, etc.
>Since this is a clinical trials environment, it is also necessary to check
>across records for visit sequence, missing visits, etc.
>
>While we have a lot of this packaged into macros, it seems to me that
>there should be tools available that allow non-programmers to do a lot
>(preferably, all) of this. It seems a waste to need programmers to do
>something so low-level.
>
>Anyone have suggestions for products that might fill the bill?
>
>TIA.
>
>Jonathan
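
For concreteness, the checks Jonathan lists (ranges, required fields, visit
sequence, missing visits) can be sketched in a few lines. The thread itself
uses SAS and shows no code; Python is used here purely for illustration, and
every field name, range, and visit schedule below is a made-up assumption,
not something taken from Jonathan's study.

```python
# Hypothetical rules for illustration only; real limits and visit
# schedules would come from the study protocol.
REQUIRED_FIELDS = ("subject_id", "visit", "systolic_bp")
RANGES = {"systolic_bp": (70, 250)}
EXPECTED_VISITS = (1, 2, 3, 4)


def check_record(record: dict) -> list[str]:
    """Single-record checks: required fields present, values within range."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            problems.append(f"missing field: {field}")
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            problems.append(f"{field}={value} outside [{low}, {high}]")
    return problems


def check_visits(records: list[dict]) -> dict[str, list[str]]:
    """Cross-record checks per subject: visit order and gaps in the schedule."""
    by_subject: dict[str, list[int]] = {}
    for rec in records:  # records are taken in the order they arrive
        subject, visit = rec.get("subject_id"), rec.get("visit")
        if subject is None or visit is None:
            continue  # check_record already reports these
        by_subject.setdefault(subject, []).append(visit)

    problems: dict[str, list[str]] = {}
    for subject, visits in by_subject.items():
        issues = []
        if visits != sorted(visits):
            issues.append(f"visits out of sequence: {visits}")
        # Assumption: only visits before the latest one seen can be "missing".
        latest = max(visits)
        missing = [v for v in EXPECTED_VISITS if v < latest and v not in visits]
        if missing:
            issues.append(f"missing visits: {missing}")
        if issues:
            problems[subject] = issues
    return problems


# Made-up sample data.
SAMPLE = [
    {"subject_id": "S01", "visit": 1, "systolic_bp": 120},
    {"subject_id": "S01", "visit": 3, "systolic_bp": 300},
    {"subject_id": "S02", "visit": 2, "systolic_bp": None},
    {"subject_id": "S02", "visit": 1, "systolic_bp": 118},
]

if __name__ == "__main__":
    for rec in SAMPLE:
        print(rec, check_record(rec))
    print(check_visits(SAMPLE))
```

Writing checks like these is the easy part. Choosing the ranges and the
visit schedule is where the business knowledge discussed above comes in.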
From: Joe Matise on
Art, I absolutely agree re: business knowledge. I suspect it has a lot to
do with the corporate culture of one's company [or university etc. etc.]; in
mine, the programmers are usually the ones with the business knowledge of
the data. Sometimes the business end of the group will have good data
skills, and we will involve them 100% in data validation, either directly
[by running reports out for them and/or giving them direct data access] or
indirectly [by asking questions]. However, that's certainly not always
true; in those cases, I'm the one who knows the data in and out, and I make
most of the decisions. I'd much prefer that not to be the case ever - it's
a lot better when you have more eyes on the data - but sometimes it is the
best case.

I do have the advantage though of being primarily tasked with particular
projects. In a corporate culture where programmers work on any given
project and projects don't have a specific programmer assigned to them, that
lack of continuous business knowledge certainly would affect whose
responsibility data cleaning/validation should ultimately be.

-Joe

On Mon, Jan 4, 2010 at 5:35 PM, Arthur Tabachneck <art297(a)netscape.net> wrote:

> Sure, you can write or buy routines for doing many of the tasks, but a lot
> of validity checks require business knowledge that programmers might not
> have and often require the talents of staff whose salaries are even higher
> (believe it or not SAS programmers are not necessarily the highest paid
> employees in some organizations).
From: Michael Raithel on
Dear SAS-L-ers,

Jonathan Goldberg posted the following:

> We are currently using SAS to do validation. We write programs to check
> things like ranges, all fields present, etc.
> Since this is a clinical trials environment, it is also necessary to check
> across records for visit sequence, missing visits, etc.
>
> While we have a lot of this packaged into macros, it seems to me that
> there should be tools available that allow non-programmers to do a lot
> (preferably, all) of this. It seems a waste to need programmers to do
> something so low-level.
>
> Anyone have suggestions for products that might fill the bill?
>
Jonathan, yes, I do have a suggestion. But, before I provide it, let me jump on the bandwagon and soundly chastise you for even hinting that anything under the sun is "low-level" for programmers. For shame! If I had my way, all SAS programmers would be responsible for thoroughly cleaning their offices/cubicles every day _IN_ADDITION_TO cleaning their data! Low-level indeed!

We are on the cusp of an evaluation of SAS Data Quality Solution software (http://www.sas.com/data-quality/index.html). We are hoping that it will be general, intuitive, and robust enough so that our talented Data Management staff can use it for data validation. That would free up more SAS programmers... so that they could clean their offices :-) You might want to check this software out, too.

Jonathan, best of luck in all of your SAS endeavors!


I hope that this suggestion proves helpful now, and in the future!

Of course, all of these opinions and insights are my own, and do not reflect those of my organization or my associates. All SAS code and/or methodologies specified in this posting are for illustrative purposes only and no warranty is stated or implied as to their accuracy or applicability. People deciding to use information in this posting do so at their own risk.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Michael A. Raithel
"The man who wrote the book on performance"
E-mail: MichaelRaithel(a)westat.com

Author: Tuning SAS Applications in the MVS Environment

Author: Tuning SAS Applications in the OS/390 and z/OS Environments, Second Edition
http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=58172

Author: The Complete Guide to SAS Indexes
http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=60409

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Beginnings are always messy. - John Galsworthy
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
From: "Keintz, H. Mark" on
Michael R said:

> Jonathan, yes, I do have a suggestion. But, before I provide it, let
> me jump on the bandwagon and soundly chastise you for even hinting that
> anything under the sun is "low-level" for programmers. For shame! If
> I had my way, all SAS programmers would be responsible for thoroughly
> cleaning their offices/cubicles every day _IN_ADDITION_TO cleaning
> their data! Low-level indeed!
>
> We are on the cusp of an evaluation of SAS Data Quality Solution
> software (http://www.sas.com/data-quality/index.html ). We are hoping
> that it will be general, intuitive, and robust enough so that our
> talented Data Management staff can use it for data validation. That
> would free up more SAS programmers... so that they could clean their
> offices:-) You might want to check this software out, too.
>

But Michael ...


Does SAS have no office cleaning software? Are there no apps? No procs? No macros?

Regards,
Mark
From: Michael Raithel on
Dear SAS-L-ers,

Mark H. Keintz posted the following:

> But Michael ...
>
> Does SAS have no office cleaning software? Are there no apps? No
> procs? No macros?
>
Mark, as usual, you ask the tough questions. Sadly, SAS is lacking in the PROC MANUAL LABOR areas. However, I did find another organization that offers software that will actually clean your computer screen from the inside! Check out:

http://www.raincitystory.com/flash/screenclean.swf

Mark, best of luck in all of your SAS endeavors!


I hope that this suggestion proves helpful now, and in the future!

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Michael A. Raithel
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A boy can learn a lot from a dog: obedience, loyalty, and the
importance of turning around three times before lying down. - Robert Benchley
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++