From: Lance Smith on
Hi

I have a dataset with 50 character variables (SNP1 - SNP50), each of
which has a certain amount of missing data. I want to create a table
that will give me the percentage of missing data per variable. Maybe
something like this:

VARIABLE N %MISSING
SNP1 2010 2.6%
.
.
.
.
.
SNP50 2010 5%

How do I do this in SAS?
Thank you for your help.
From: data _null_; on
On Feb 2, 11:10 pm, Lance Smith <medicaltr...(a)gmail.com> wrote:
> Hi
>
> I have a dataset with 50 character variables (SNP1 - SNP50), each of
> which have a certain amount of missing data. I want to create a table
> that will give me the percentage of missing data per variable. Maybe
> something like this:
>
> VARIABLE        N            %MISSING
> SNP1               2010          2.6%
> .
> .
> .
> .
> .
> SNP50             2010              5%
>
> How do I do this in SAS?
> Thank you for your help.

/* Sample data: 50 numeric variables, roughly 30% missing */
data test;
   array snp[50];
   do id = 1 to 87;
      do _n_ = 1 to dim(snp);
         if rantbl(4,.3) eq 2
            then snp[_n_] = rannor(4);
            else snp[_n_] = .;
      end;
      output;
   end;
run;

/* Capture the missing-value summary from PROC UNIVARIATE as a data set */
ods listing close;
proc univariate data=test;
   var snp:;
   ods output MissingValues=MissingValues;
run;
ods listing;

proc contents data=MissingValues varnum;
run;
proc print data=MissingValues;
run;
From: smolkatz on
Hi,
try this, but change snp2 to snp50 (and likewise mis2 to mis50 and p2 to p50)

data have;
   input snp1 $ snp2 $;
cards;
a .
a .
. a
. a
b .
. b
. .
c .
c .
. b
;
run;

data want;
   array mis mis1-mis2;          /* missing-value counters */
   do over mis;
      mis=0;
   end;
   do until (endOfFile);
      set have end=endOfFile;
      array snp snp1-snp2;
      do over snp;
         if missing(snp) then mis+1;
      end;
      n+1;                       /* rows read so far */
   end;                          /* end of the DO UNTIL loop */
   array p p1-p2;
   do over p;
      p=mis/n*100;               /* percent missing */
   end;
   keep p1-p2;
run;
proc transpose data=want;
run;

The best,
Alex
From: "Data _null_;" on
I missed the whole character variable part. Here is another attempt.

data test;
   array snp[5] $;
   do id = 1 to 87;
      do _n_ = 1 to dim(snp);
         if rantbl(4,.3) eq 2
            then snp[_n_] = put(rannor(4),best8.);
            else snp[_n_] = ' ';
      end;
      output;
   end;
run;
/* Collapse every non-blank value to '1' so each variable has at most two levels */
proc format;
   value $miss ' '=' ' other='1';
run;
ods listing close;
proc freq data=test;
   tables _char_ / missing nocum;
   format _char_ $miss1.;
   ods output onewayfreqs=freqs;
run;
ods listing;
/* Keep only the rows for the blank (missing) level */
data freqs;
   set freqs;
   if missing(cats(of f_:));
   keep Table Frequency Percent;
run;
proc print data=freqs;
run;
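
The FREQS data set above has the counts but not quite the VARIABLE / N / %MISSING layout from the original question. A minimal sketch of one way to reshape it follows. It assumes the Table column holds values of the form "Table snp1", which is my understanding of the OneWayFreqs ODS table, and the names REPORT, Variable, N and PctMissing are made up for illustration. A variable with no blank values has no row in FREQS, so it would not appear in the report.

```sas
/* Assumes FREQS is the data set created by the step above:          */
/* one row per variable with the count and percent of blank values.  */
data report;
   set freqs;
   length Variable $32;
   Variable   = scan(Table, 2, ' ');                 /* "Table snp1" -> "snp1" */
   N          = round(Frequency / (Percent / 100));  /* total rows, recovered  */
   PctMissing = Percent;
   keep Variable N PctMissing;
   format PctMissing 5.1;
run;

proc print data=report noobs;
run;
```

N can be recovered this way because the MISSING option puts the blank level in the denominator, so Percent is the blank count divided by the total row count.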

From: Muthia Kachirayan on
Lance,

Yet another way, using arrays:

Sample Data:

%let N = 2010;
data have;
   array s[*] $1 SNP1 - SNP50;
   do id = 1 to &N;
      do i = 1 to 50;
         if ranuni(123) > .3 then s[i] = '1';
         else s[i] = ' ';
      end;
      output;
   end;
   drop i;
run;

Array solution:

data need;
   array k[50] _temporary_ (50*0);     /* non-missing counts, start at 0 */
   do until(eof);
      set have end = eof;
      array s[*] $1 SNP1 - SNP50;
      do _n_ = 1 to dim(k);
         if s[_n_] ne ' ' then k[_n_] + 1;
      end;
   end;
   do i = 1 to dim(k);
      k[i] = (&N - k[i]) / &N * 100;   /* percent missing */
      Name = vname(s[i]);
      Missing = k[i];
      output;
   end;
   keep Name Missing;
run;

proc print data = need;
run;
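
The step above relies on the macro variable &N matching the number of rows in HAVE. If the row count is not known in advance, a sketch along the following lines counts the rows while reading and also writes N to the output, which is closer to the layout asked for. The names NEED2, Variable, N and PctMissing are hypothetical, and the 50 in the temporary array still has to match the number of SNP variables.

```sas
data need2;
   array k[50] _temporary_ (50*0);     /* missing counts, start at 0 */
   do until (eof);
      set have end = eof;
      array s[*] SNP1 - SNP50;         /* already character, from HAVE */
      do _n_ = 1 to dim(s);
         if missing(s[_n_]) then k[_n_] + 1;
      end;
      N + 1;                           /* rows read so far */
   end;
   length Variable $32;
   do i = 1 to dim(s);
      Variable   = vname(s[i]);
      PctMissing = 100 * k[i] / N;     /* percent missing */
      output;
   end;
   keep Variable N PctMissing;
   format PctMissing 5.1;
run;

proc print data = need2 noobs;
run;
```

Counting the missing values directly, rather than the non-missing ones, avoids the subtraction from &N at the end.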

Kind regards,
Muthia Kachirayan
On Wed, Feb 3, 2010 at 1:10 AM, Lance Smith <medicaltrial(a)gmail.com> wrote:

> Hi
>
> I have a dataset with 50 character variables (SNP1 - SNP50), each of
> which have a certain amount of missing data. I want to create a table
> that will give me the percentage of missing data per variable. Maybe
> something like this:
>
> VARIABLE N %MISSING
> SNP1 2010 2.6%
> .
> .
> .
> .
> .
> SNP50 2010 5%
>
> How do I do this in SAS?
> Thank you for your help.
>