From: muriel on 22 Jun 2010 12:15 Hi Richard, How about using a macro (see below)? *********** program ********** data have; input ID Q X T$ 15-43 U; cards; 1 2 6 Greece 4 1 8 6 United Kingdom 6 1 4 6 Spain 6 2 7 3 Norway 5 3 2 3 Finland 5 3 1 3 Greece 6 3 2 2 Germany 8 3 4 1 Democratic Republic of Congo 6 ; data have1; set have; by id; if first.id then n_=0; n_+1; run; %macro trans(indsn= , outdsn= , transvar= ); proc transpose data=&indsn. out=&outdsn. prefix=&transvar.; by id; var &transvar.; id n_; run; %mend; %trans(indsn=have1 , outdsn=set1, transvar=X ); %trans(indsn=have1 , outdsn=set2, transvar=T ); %trans(indsn=have1 , outdsn=set3, transvar=U ); data result; merge set1 (drop=_name_) set2 (drop =_name_) set3 (drop=_name_); by id ; run; proc print; run; ************* print result ************** Obs ID X1 X2 X3 X4 T1 T2 T3 T4 U1 U2 U3 U4 1 1 6 6 6 . Greece United Kingdom Spain 4 6 6 . 2 2 3 . . . Norway 5 . . . 3 3 3 3 2 1 Finland Greece Germany Democratic Republic of Congo 5 6 8 6 *********************************** Muriel On Jun 22, 2:40 am, Richard Stat <richardstatis...(a)gmail.com> wrote: > Dear Ya and others, > > the solution presented by Ya worked for most of my datasets, but then > I realized that I also have datasets with character variables. By > using Ya's code, the look like this: > > data have; > input ID Q X T U; > cards; > 1 2 6 Greece 4 > 1 8 6 United Kingdom 6 > 1 4 6 Spain 6 > 2 7 3 Norway 5 > 3 2 3 Finland 5 > 3 1 3 Greece 6 > 3 2 2 Germany 8 > 3 4 1 Democratic Republic of Congo 6 > ; > > The datasets can have both numeric and character variables, and they > can be one or many. The character variables can consist of several > words and be of different length. I still want the same end result > with one observation per ID, but with character variables. In the > example above, T1-T4 would be character variables and should be of > such length that they include the longest value in T. > > Any help would be greatly appreciated. > > All the best, > Richard > > On 21 Juni, 23:56, Richard Stat <richardstatis...(a)gmail.com> wrote: > > > That was a very nice solution. It solved by problem in a very good > > way. Thank you so much, Ya! > > > All the best, Richard > > > On 21 Juni, 22:49, Ya <huang8...(a)gmail.com> wrote: > > > > On Jun 21, 12:30 pm, Richard Stat <richardstatis...(a)gmail.com> wrote: > > > > > Dear all, > > > > > I have a dataset that I want to restructure. It has multiple > > > > observations for each ID, but I don't know how many duplicates the > > > > dataset will have for each ID and it differs from ID to ID. It looks > > > > something like this: > > > > > ID Q X Y Z > > > > 1 2 6 8 4 > > > > 1 8 6 9 6 > > > > 1 4 6 9 6 > > > > 2 7 3 8 5 > > > > 3 2 3 2 5 > > > > 3 1 3 2 6 > > > > 3 2 2 2 8 > > > > 3 4 1 4 6 > > > > [...] > > > > > And I want to restructure it so that I get one observation for each > > > > ID, i.e. > > > > > ID Q1 Q2 Q3 Q4 X1 X2 X3 X4 Y1 Y2 Y3 Y4 > > > > Z1 Z2 Z3 Z4 > > > > 1 2 8 4 . 6 6 6 . 8 > > > > 9 9 . 4 6 6 5 > > > > 2 7 . . . 3 . . . > > > > 8 . . . 5 . . . > > > > 3 2 1 2 4 3 3 2 1 2 > > > > 2 2 4 5 6 8 6 > > > > [...] > > > > > I have several datasets like this and they are big. Furthermore, I > > > > don't know how many duplicates I have per ID. I have tried to solve > > > > this by splitting the dataset into many datasets, i.e. one for the > > > > first observation for each ID, one for the second observation and so > > > > on, and then add a suffix to each variable, and then merger them > > > > together. But it ended up with many steps and not so automated. I > > > > assume I could use proc transpose in some way, but I am not on > > > > friendly terms with proc transpose. > > > > > To sum up, any help would be greatly appreciated. > > > > > All the best, > > > > Richard > > > > data have; > > > input ID Q X Y Z; > > > cards; > > > 1 2 6 8 4 > > > 1 8 6 9 6 > > > 1 4 6 9 6 > > > 2 7 3 8 5 > > > 3 2 3 2 5 > > > 3 1 3 2 6 > > > 3 2 2 2 8 > > > 3 4 1 4 6 > > > ; > > > > data have1; > > > set have; > > > array vv q x y z; > > > by id; > > > if first.id then n_=0; > > > n_+1; > > > do over vv; > > > colnm=compress(vname(vv)||put(n_,best.)); > > > val=vv; > > > output; > > > end; > > > keep id colnm val; > > > run; > > > > proc transpose data=have1 out=want (drop=_name_); > > > by id; > > > var val; > > > id colnm; > > > run; > > > > proc print; > > > run; > > > > ID Q1 X1 Y1 Z1 Q2 X2 Y2 Z2 Q3 X3 Y3 Z3 Q4 > > > X4 Y4 Z4 > > > > 1 2 6 8 4 8 6 9 6 4 6 9 > > > 6 . . . . > > > 2 7 3 8 > > > 5 . . . . . . . . . . . . > > > 3 2 3 2 5 1 3 2 6 2 2 2 8 4 > > > 1 4 6 > > > > HTH > > > > Ya
First
|
Prev
|
Pages: 1 2 Prev: fitting a regression line Next: Stock market competition: request for data |