From: Kevin Myers on 23 Jan 2010 14:11 I haven't really thought your FULLINFO option out thoroughly since I didn't really feel it was necessary. However, with regard to your last comments, you might want to be careful about having something like that apply to all of those options and statements at the same time. Keep in mind that it would still require TWO executions of PROC TRANSPOSE to solve the scenario that you originally posted about. One to convert the data set from short-wide to long-skinny form, and a second to convert back from long-skinny to short-wide(r). Although conceiveably possible, you wouldn't generally want to be using the ID IDLABEL IDTYPE IDLENGTH IDFORMAT and IDINFORMAT statements on the same execution of the TRANSPOSE in which you are using the NAME LABEL TYPE LENGTH FORMAT and INFORMAT options, because they wouldn't be referring to the same variables in a single execution of the PROC. Generally, for something similar to your scenario, you would run TRANPOSE the first time using the relevant OPTIONS to create the associated variables, then run TRANSPOSE a second time using the relevant IDxxx statements to turn the associated variable values back into attributes for the output variables. s/KAM ----- Original Message ----- From: "Ted Clay" <tclay(a)ashlandhome.net> To: "'Kevin Myers'" <KevinMyers(a)austin.rr.com>; <SAS-L(a)LISTSERV.UGA.EDU> Sent: Saturday, January 23, 2010 12:47 Subject: RE: SASware ballot #18 Proc Transpose preserving variable attributes > Kevin, > If there were a "FULLINFO" option in your proposal, my preference would be > to have it cover TYPE LENGTH FORMAT INFORMAT NOFORMAT NOINFORMAT on the > proc > statement, plus the statements IDTYPE IDLENGTH IDFORMAT and IDINFORMAT > with > default names. But yes one more option is one more option. > Ted > > > -----Original Message----- > From: Kevin Myers [mailto:KevinMyers(a)austin.rr.com] > Sent: Saturday, January 23, 2010 10:04 AM > To: Ted Clay; SAS-L(a)LISTSERV.UGA.EDU > Subject: Re: SASware ballot #18 Proc Transpose preserving variable > attributes > > P.S. Just in case it wasn't already obvious, with the proposed behavior, > specifying: > PROC TRANSPOSE TYPE LENGTH FORMAT INFORMAT; > would be equivalent to your proposed: > PROC TRANSPOSE FULLINFO; > Personally, I don't feel that amount of shortening (three words) deserves > yet another option, but that's just my opinion. > > ----- Original Message ----- > From: "Ted Clay" <tclay(a)ashlandhome.net> > To: "'Kevin Myers'" <KevinMyers(a)austin.rr.com>; <SAS-L(a)LISTSERV.UGA.EDU> > Sent: Saturday, January 23, 2010 11:34 > Subject: RE: SASware ballot #18 Proc Transpose preserving variable > attributes > > >> Yes, this would be necessary. In fact, this is what currently happens if >> a >> format is associated with a variable but is not available at the time of >> execution, with the NOFMTERR option turned on. >> >> In terms of default behavior, currently the output data set has _LABEL_ >> if >> and only if any VAR has a label. Would _FORMAT_, _INFORMAT_ be created >> under the same conditions, that is, if any VAR has one defined? What >> about >> _TYPE_? Every variable has a type. I guess I'm leading up to saying I >> don't think you want to be creating anything new unless the user >> specifies >> one of the added features. How about a single option to "turn on" all >> these >> features, with "NOFORMAT" and "NOINFORMAT" being the only behavior. >> Something like "FULLINFO". No other additional statements or options. >> If >> you say "FULLINFO" the proc would create or use the _FORMAT_, _INFORMAT_ >> and >> _TYPE_ variables as needed. (_TYPE_ only if not all numeric) >> >> Ted >> >> >> >> >> -----Original Message----- >> From: Kevin Myers [mailto:KevinMyers(a)austin.rr.com] >> Sent: Saturday, January 23, 2010 3:38 AM >> To: Ted Clay; SAS-L(a)LISTSERV.UGA.EDU >> Subject: Re: SASware ballot #18 Proc Transpose preserving variable >> attributes >> >> Right Ted, these are the types of problems that my proposed enhancement >> is >> intended to solve. There is no way to prevent conversion to character on >> the first transpose, because values for multiple columns with differing >> types are being forced into a single output column. But upon transposing >> the data a second time, the proposed enhancements would generally allow >> the >> original variables types and other attributes to be restored. >> >> However, your examples raise a couple of potential problems that I failed >> to >> >> account for previously: 1) What if there is no one-to-one correspondence >> between the formatted and informatted values, or 2) what it there is a >> format but no corresponding informat or vice-versa? In these cases, my >> proposal as it currently stands would not produce the desired result, >> because there would be no way to convert the formatted values back into >> the >> original raw data values. This problem suggests the need for two >> additional >> >> options on the PROC FORMAT statement in order to completely implement the >> desired behavior: NOFORMAT and NOINFORMAT. NOFORMAT would be used to >> force >> default formats when outputting values and NOINFORMAT would be used to >> force >> >> default informats when reading values. Note that these options would >> only >> prevent non-default formats and informats from being actively *used* for >> converting values during processing, they would NOT prevent the format >> and >> informat attribute values from being written to or read from variables >> identified by using the proposed FORMAT= and INFORMAT= options or the >> IDFORMAT and IDINFORMAT statements. >> >> I will add this discussion to my proposal. >> >> s/KAM >> >> >> ----- Original Message ----- >> From: "Ted Clay" <tclay(a)ashlandhome.net> >> To: "'Kevin Myers'" <KevinMyers(a)AUSTIN.RR.COM>; <SAS-L(a)LISTSERV.UGA.EDU> >> Sent: Friday, January 22, 2010 23:57 >> Subject: RE: SASware ballot #18 Proc Transpose preserving variable >> attributes >> >> >>> Hi Kevin, >>> >>> The default behavior of Proc Transpose is making it difficult to achieve >>> the >>> preservation of variable attributes. The problem is with the values >>> themselves. >>> >>> Take a look at this: >>> >>> proc format; >>> value nfmt 1='One%' 2='Two$two' 3='%three'; >>> value $cfmt 'A'='AOne%' 'B'='BTwo$two' 'C'='%three#C'; >>> run; >>> >>> Data testing; >>> Input Group $ N C $ D mmddyy10.; >>> label N='Numeric var' >>> C='Text' >>> D='Birthday'; >>> format n nfmt. c $cfmt. D mmddyy10.; >>> Datalines; >>> G1 1 A 01/15/2010 >>> ; >>> proc transpose data=testing out=Tran; >>> var N C D; >>> run; >>> >>> proc print data=tran noobs; >>> run; >>> >>> PRODUCES: >>> >>> _NAME_ _LABEL_ COL1 >>> >>> N Numeric var One% >>> C Text A >>> D Birthday 01/15/2010 >>> >>> The problem is that the numeric variable N has been converted to >>> character >>> using the format we assigned to it. The original numeric value is lost. >>> If >>> we prevent this by removing the format, we lose the format attribute. >>> Is >>> this Catch-22.0? >>> >>> Ted >>> >>> -----Original Message----- >>> From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of >>> Kevin >>> Myers >>> Sent: Thursday, January 21, 2010 1:30 PM >>> To: SAS-L(a)LISTSERV.UGA.EDU >>> Subject: Re: SASware ballot #18 Proc Transpose preserving variable >>> attributes >>> >>> Hi Ted, >>> >>> The proposed SUGI ballot item enhancements to PROC TRANSPOSE could >>> potentially help with your process to some degree, even though >>> fundamentally >>> my proposal only deals with variable attributes, whereas your needs >>> involve >>> an actual rearrangement of the data into different observations and >>> variables. >>> >>> If you first transposed your data into "long-skinny" form using a BY >>> statement, then provided a distinct _NAME_ variable for each individual >>> observation in each BY group of the long-skinny dataset using logic of >>> your >>> own, then transposed that result back into short-wide form once again >>> using >>> a BY statement, then you could achieve roughly what you are looking for, >>> and >>> perhaps this is what you have already tried. >>> >>> The contribution that my proposal could make to this process is that you >>> would be able to use the new PROC TRANSPOSE statements and options to >>> force >>> the resulting variables in the final data set (after transposing twice) >>> to >>> have all of the same types, lengths, formats, and informats as they did >>> in >>> the original data set, rather than all being changed to character >>> variables >>> with none of the other original attributes. If maintaining the original >>> variable types and attributes is important, then my proposal could >>> substantially simplify your process. >>> >>> s/KAM >>> >>> P.S. - Hope you don't mind that I copied this reply back to the list. I >>> thought others might also be interested. >>> >>> >>> ----- Original Message ----- >>> From: "Ted Clay" <tclay(a)ashlandhome.net> >>> To: "'Kevin Myers'" <KevinMyers(a)AUSTIN.RR.COM> >>> Sent: Thursday, January 21, 2010 13:50 >>> Subject: RE: SASware ballot #18 Proc Transpose preserving variable >>> attributes >>> >>> >>>> Hi Kevin, >>>> Thanks for filling the gaps. Yes, what I am describing creates a >>>> differently >>>> shaped output data set, so is not equivalent to your proposal. Let any >>>> confusion on that point end here. >>>> >>>> The ability to create and work with a "self-defining" skinny data set >>>> would >>>> be great. >>>> >>>> The situation I run into all the time (say, weekly) is where the data >>>> is >>>> in >>>> groups, and I have a mixture of variable types: numeric, character, >>>> numeric >>>> date. I want to get all the data in a group onto one observation. I >>>> sometimes have an ID variable, or at other times just want numeric >>>> suffixes >>>> (1,2,3..). >>>> Would your proposal make that easier to do than it is now? My hunch is >>>> yes, >>>> but it would not be a slam-dunk: >>>> Proc transpose, with the additional options you propose on the Proc >>>> statement. >>>> Data step to calculate new variable names, and possibly labels. >>>> (Calculate >>>> numeric suffixes?) >>>> Proc transpose, with the additional ID statements you propose. >>>> >>>> I had a typo on my example -- I meant to type "ONEOUT" not "OUTOUT". I >>>> guess my dog "Spot" was tugging on my pants-leg. >>>> Proc transpose data=testing out=trans ONEOUT; >>>> By group; >>>> Var N D; >>>> ID C; >>>> Run; >>>> >>>> Ted >>>> >>>> >>>> >>>> >>>> -----Original Message----- >>>> From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of >>>> Kevin >>>> Myers >>>> Sent: Thursday, January 21, 2010 9:27 AM >>>> To: SAS-L(a)LISTSERV.UGA.EDU >>>> Subject: Re: SASware ballot #18 Proc Transpose preserving variable >>>> attributes >>>> >>>> Ted and all - >>>> >>>> The current item on the SASware ballot was put on there largely based >>>> on >>>> my >>>> input. I provided a detailed description of the suggested enhancements >>>> via >>>> SAS Institute's online Tracks system: >>>> >>>> 7610240654 - Base SAS: Additional options and statements for PROC >>>> TRANSPOSE >>>> to support transposing additional variable attributes, including >>>> format, >>>> informat, type, and length. >>>> >>>> I don't know whether you will be able to view that, but am short on >>>> time >>>> at >>>> the moment, so below is an excerpt from my original submission. Please >>>> note >>>> that my proposal does NOT require any changes to existing PROC >>>> TRANSPOSE >>>> behavior, but rather adds support for additional options and statements >>>> to >>>> support handling additional variable attributes analogous to the >>>> existing >>>> support for variable lablels. >>>> >>>> I haven't had time to review your suggestion in detail, but after >>>> briefly >>>> scanning through it I believe that you are asking for something that is >>>> substantially different in concept. If so, then I would suggest that >>>> you >>>> submit a separate enhancement suggestion, because I wouldn't want your >>>> proposal to be confused with or complicate the implementation of mine, >>>> or >>>> vice-versa. >>>> >>>> Hope this helps. >>>> Kevin M. >>>> >>>> from: kevinmyers(a)austin.rr.com >>>> name: Kevin A. Myers >>>> company: Myers Engineering >>>> product: BASE >>>> os: none >>>> suggestion text: PROC TRANSPOSE should be enhanced to support >>>> transposing >>>> additional variable attributes including format, >>>> informat, type, and length. >>>> >>>> The following new STATEMENTS should be added. These are similar to the >>>> existing idlabel statement, but provide input variable >>>> names for the other variable attributes: >>>> >>>> idformat statement - variable name to provide formats >>>> idinformat statement - variable name to provide informats >>>> idtype statement - variable name to provide variable types >>>> idlength statement - variable name to provide variable lengths >>>> >>>> The PROC TRANSPOSE statement should be enhanced to add the following >>>> additional OPTIONS, which all perform similar functions to the >>>> existing label option, but for the other variable attributes: >>>> >>>> format - output variable name for formats (default _FORMAT_) >>>> informat - output variable name for informats (default _INFORMAT_) >>>> type - output variable name for types (default _TYPE_) >>>> length - output variable name for lengths (default _LENGTH_) >>>> >>>> Providing the new statements and options listed above would allow for >>>> simple, complete transformation of a data set from >>>> "long-skinny" to "short-wide" representations AND BACK AGAIN, or >>>> vice-versa, >>>> WITHOUT LOSS OF ANY FORMAT, INFORMAT, TYPE, and LENGTH >>>> information. It would also greatly simplify >>>> input and output of data from/to "self-defining" external file formats >>>> where >>>> the format, informat, type, and/or length attributes >>>> are included within the external data file. >>>> >>>> Availability of the proposed new statements and options would allow >>>> simple >>>> loops based on "long-skinny" data representation >>>> to be used for reading/writing data in self-defining formats, >>>> followed/preceded by proc transpose steps to convert the data to/from a >>>> more >>>> typical "short-wide" representation. >>>> >>>> Code to implement the proposed statements and options should be >>>> relatively >>>> easy to develop by simply duplicating the existing >>>> support for IDLABEL/LABEL= and making relatively minor modifications as >>>> needed for each of the other specific attributes (format, >>>> informat, type, length). >>>> >>>> >>>> ----- Original Message ----- >>>> From: "Ted Clay" <tclay(a)ASHLANDHOME.NET> >>>> To: <SAS-L(a)LISTSERV.UGA.EDU> >>>> Sent: Thursday, January 21, 2010 09:33 >>>> Subject: SASware ballot #18 Proc Transpose preserving variable >>>> attributes >>>> >>>> >>>>> SASware ballot #18 refers to Proc Transpose preserving variable >>>>> attributes. >>>>> >>>>> I'm familiar with the problem and frequently use a macro to solve it, >>>>> but >>>>> would be delighted to see this feature migrate into Base SAS. >>>>> >>>>> >>>>> >>>>> I am soliciting input from SAS-L in the hopes that this change will be >>>>> implemented in the most useful way. >>>>> >>>>> Posting your comments may have some influence on a future enhancement >>>>> of >>>>> Proc Transpose. >>>>> >>>>> >>>>> >>>>> First off, I think this requires limiting the output to one >>>>> observation >>>>> per >>>>> BY-group. I can't see any way around that. >>>>> >>>>> >>>>> >>>>> Assuming that is true, one idea for implementing the change to Proc >>>>> Transpose would be to add a "ONEOUT" option on the proc statement. >>>>> >>>>> >>>>> >>>>> Example: >>>>> >>>>> Data testing; >>>>> >>>>> Input Group N C $ D mmddyy10.; >>>>> >>>>> format d mmddyy10.; >>>>> >>>>> label N='Numeric var' >>>>> >>>>> C='Text' >>>>> >>>>> D='Birthday'; >>>>> >>>>> Datalines; >>>>> >>>>> 1 57 A 01/15/2010 >>>>> >>>>> 1 84 B 02/15/2010 >>>>> >>>>> 2 33 B 03/15/2010 >>>>> >>>>> ; >>>>> >>>>> Proc transpose data=testing out=trans OUTOUT; >>>>> >>>>> By group; >>>>> >>>>> Var N D; >>>>> >>>>> Run; >>>>> >>>>> >>>>> >>>>> Proc print data=trans; >>>>> >>>>> Would produce: >>>>> >>>>> Obs Group N1 D1 N2 D2 >>>>> >>>>> >>>>> >>>>> 1 1 57 01/15/2010 84 02/15/2010 >>>>> >>>>> 2 2 33 03/15/2010 >>>>> >>>>> >>>>> >>>>> With the ID statement: >>>>> >>>>> Proc transpose data=testing out=trans OUTOUT; >>>>> >>>>> By group; >>>>> >>>>> Var N D; >>>>> >>>>> ID C; >>>>> >>>>> Run; >>>>> >>>>> >>>>> >>>>> Would produce: >>>>> >>>>> Obs Group NA DA NB DB >>>>> >>>>> >>>>> >>>>> 1 1 57 01/15/2010 84 02/15/2010 >>>>> >>>>> 2 2 33 03/15/2010 >>>>> >>>>> >>>>> >>>>> Other possible options come up in naming the output variables: (1) ID >>>>> values >>>>> before names (2) a separator between ID values and names, and (3) >>>>> whether >>>>> and how to modify labels. >>>>> >>>> >>>> >>>> >>> >>> >>> >> >> >> >> >> > > > > >
From: Ted Clay on 24 Jan 2010 14:15 Kevin, Under the circumstance of transposing both numeric and character variables, which forces a numeric-to-character conversion, some numeric values are converted with loss of information. So "NOFORMAT" would be necessary to prevent this. Experimenting with the largest possible integer in SAS (9,007,199,254,740,992), it seems to be converted using a default format of BEST12, and its original value can't be recovered. You may want state explicitly in your proposal, with "NOFORMAT", not only would any associated format be ignored, but no other format would be introduced which would cause a loss of numeric precision. In my opinion, NOFORMAT should be a feature of Proc Transpose no matter what. Here is the demo: data test; c='123456789012346789012345'; N1=9007199254740992; run; proc transpose data=test out=trans; var c n1; run; proc print; PRODUCES: Obs _NAME_ COL1 1 c 123456789012346789012345 2 N1 9.0071993E15 On a related aspect of "preserving variable attributes" in Proc Transpose as it is now, I notice that when all the variables on the VAR statement have the same format, this gets preserved. Length is also handled well. But informats and labels are not. Even identical labels are ignored. The problem with INFORMATs may be an oversight on SAS's part, and ignoring identical labels is also kind of dumb. I think whenever any attribute is the same on all variables on the VAR statement, they should carry through. You are correct that if one turns on all the features in your proposal using a "FULLINFO" option, you end up with variables you don't always need, and would have to drop them. All I can say is that if it were a feature, I would use it. Ted -----Original Message----- From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Kevin Myers Sent: Saturday, January 23, 2010 11:11 AM To: SAS-L(a)LISTSERV.UGA.EDU Subject: Re: SASware ballot #18 Proc Transpose preserving variable attributes I haven't really thought your FULLINFO option out thoroughly since I didn't really feel it was necessary. However, with regard to your last comments, you might want to be careful about having something like that apply to all of those options and statements at the same time. Keep in mind that it would still require TWO executions of PROC TRANSPOSE to solve the scenario that you originally posted about. One to convert the data set from short-wide to long-skinny form, and a second to convert back from long-skinny to short-wide(r). Although conceiveably possible, you wouldn't generally want to be using the ID IDLABEL IDTYPE IDLENGTH IDFORMAT and IDINFORMAT statements on the same execution of the TRANSPOSE in which you are using the NAME LABEL TYPE LENGTH FORMAT and INFORMAT options, because they wouldn't be referring to the same variables in a single execution of the PROC. Generally, for something similar to your scenario, you would run TRANPOSE the first time using the relevant OPTIONS to create the associated variables, then run TRANSPOSE a second time using the relevant IDxxx statements to turn the associated variable values back into attributes for the output variables. s/KAM ----- Original Message ----- From: "Ted Clay" <tclay(a)ashlandhome.net> To: "'Kevin Myers'" <KevinMyers(a)austin.rr.com>; <SAS-L(a)LISTSERV.UGA.EDU> Sent: Saturday, January 23, 2010 12:47 Subject: RE: SASware ballot #18 Proc Transpose preserving variable attributes > Kevin, > If there were a "FULLINFO" option in your proposal, my preference would be > to have it cover TYPE LENGTH FORMAT INFORMAT NOFORMAT NOINFORMAT on the > proc > statement, plus the statements IDTYPE IDLENGTH IDFORMAT and IDINFORMAT > with > default names. But yes one more option is one more option. > Ted > > > -----Original Message----- > From: Kevin Myers [mailto:KevinMyers(a)austin.rr.com] > Sent: Saturday, January 23, 2010 10:04 AM > To: Ted Clay; SAS-L(a)LISTSERV.UGA.EDU > Subject: Re: SASware ballot #18 Proc Transpose preserving variable > attributes > > P.S. Just in case it wasn't already obvious, with the proposed behavior, > specifying: > PROC TRANSPOSE TYPE LENGTH FORMAT INFORMAT; > would be equivalent to your proposed: > PROC TRANSPOSE FULLINFO; > Personally, I don't feel that amount of shortening (three words) deserves > yet another option, but that's just my opinion. > > ----- Original Message ----- > From: "Ted Clay" <tclay(a)ashlandhome.net> > To: "'Kevin Myers'" <KevinMyers(a)austin.rr.com>; <SAS-L(a)LISTSERV.UGA.EDU> > Sent: Saturday, January 23, 2010 11:34 > Subject: RE: SASware ballot #18 Proc Transpose preserving variable > attributes > > >> Yes, this would be necessary. In fact, this is what currently happens if >> a >> format is associated with a variable but is not available at the time of >> execution, with the NOFMTERR option turned on. >> >> In terms of default behavior, currently the output data set has _LABEL_ >> if >> and only if any VAR has a label. Would _FORMAT_, _INFORMAT_ be created >> under the same conditions, that is, if any VAR has one defined? What >> about >> _TYPE_? Every variable has a type. I guess I'm leading up to saying I >> don't think you want to be creating anything new unless the user >> specifies >> one of the added features. How about a single option to "turn on" all >> these >> features, with "NOFORMAT" and "NOINFORMAT" being the only behavior. >> Something like "FULLINFO". No other additional statements or options. >> If >> you say "FULLINFO" the proc would create or use the _FORMAT_, _INFORMAT_ >> and >> _TYPE_ variables as needed. (_TYPE_ only if not all numeric) >> >> Ted >> >> >> >> >> -----Original Message----- >> From: Kevin Myers [mailto:KevinMyers(a)austin.rr.com] >> Sent: Saturday, January 23, 2010 3:38 AM >> To: Ted Clay; SAS-L(a)LISTSERV.UGA.EDU >> Subject: Re: SASware ballot #18 Proc Transpose preserving variable >> attributes >> >> Right Ted, these are the types of problems that my proposed enhancement >> is >> intended to solve. There is no way to prevent conversion to character on >> the first transpose, because values for multiple columns with differing >> types are being forced into a single output column. But upon transposing >> the data a second time, the proposed enhancements would generally allow >> the >> original variables types and other attributes to be restored. >> >> However, your examples raise a couple of potential problems that I failed >> to >> >> account for previously: 1) What if there is no one-to-one correspondence >> between the formatted and informatted values, or 2) what it there is a >> format but no corresponding informat or vice-versa? In these cases, my >> proposal as it currently stands would not produce the desired result, >> because there would be no way to convert the formatted values back into >> the >> original raw data values. This problem suggests the need for two >> additional >> >> options on the PROC FORMAT statement in order to completely implement the >> desired behavior: NOFORMAT and NOINFORMAT. NOFORMAT would be used to >> force >> default formats when outputting values and NOINFORMAT would be used to >> force >> >> default informats when reading values. Note that these options would >> only >> prevent non-default formats and informats from being actively *used* for >> converting values during processing, they would NOT prevent the format >> and >> informat attribute values from being written to or read from variables >> identified by using the proposed FORMAT= and INFORMAT= options or the >> IDFORMAT and IDINFORMAT statements. >> >> I will add this discussion to my proposal. >> >> s/KAM >> >> >> ----- Original Message ----- >> From: "Ted Clay" <tclay(a)ashlandhome.net> >> To: "'Kevin Myers'" <KevinMyers(a)AUSTIN.RR.COM>; <SAS-L(a)LISTSERV.UGA.EDU> >> Sent: Friday, January 22, 2010 23:57 >> Subject: RE: SASware ballot #18 Proc Transpose preserving variable >> attributes >> >> >>> Hi Kevin, >>> >>> The default behavior of Proc Transpose is making it difficult to achieve >>> the >>> preservation of variable attributes. The problem is with the values >>> themselves. >>> >>> Take a look at this: >>> >>> proc format; >>> value nfmt 1='One%' 2='Two$two' 3='%three'; >>> value $cfmt 'A'='AOne%' 'B'='BTwo$two' 'C'='%three#C'; >>> run; >>> >>> Data testing; >>> Input Group $ N C $ D mmddyy10.; >>> label N='Numeric var' >>> C='Text' >>> D='Birthday'; >>> format n nfmt. c $cfmt. D mmddyy10.; >>> Datalines; >>> G1 1 A 01/15/2010 >>> ; >>> proc transpose data=testing out=Tran; >>> var N C D; >>> run; >>> >>> proc print data=tran noobs; >>> run; >>> >>> PRODUCES: >>> >>> _NAME_ _LABEL_ COL1 >>> >>> N Numeric var One% >>> C Text A >>> D Birthday 01/15/2010 >>> >>> The problem is that the numeric variable N has been converted to >>> character >>> using the format we assigned to it. The original numeric value is lost. >>> If >>> we prevent this by removing the format, we lose the format attribute. >>> Is >>> this Catch-22.0? >>> >>> Ted >>> >>> -----Original Message----- >>> From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of >>> Kevin >>> Myers >>> Sent: Thursday, January 21, 2010 1:30 PM >>> To: SAS-L(a)LISTSERV.UGA.EDU >>> Subject: Re: SASware ballot #18 Proc Transpose preserving variable >>> attributes >>> >>> Hi Ted, >>> >>> The proposed SUGI ballot item enhancements to PROC TRANSPOSE could >>> potentially help with your process to some degree, even though >>> fundamentally >>> my proposal only deals with variable attributes, whereas your needs >>> involve >>> an actual rearrangement of the data into different observations and >>> variables. >>> >>> If you first transposed your data into "long-skinny" form using a BY >>> statement, then provided a distinct _NAME_ variable for each individual >>> observation in each BY group of the long-skinny dataset using logic of >>> your >>> own, then transposed that result back into short-wide form once again >>> using >>> a BY statement, then you could achieve roughly what you are looking for, >>> and >>> perhaps this is what you have already tried. >>> >>> The contribution that my proposal could make to this process is that you >>> would be able to use the new PROC TRANSPOSE statements and options to >>> force >>> the resulting variables in the final data set (after transposing twice) >>> to >>> have all of the same types, lengths, formats, and informats as they did >>> in >>> the original data set, rather than all being changed to character >>> variables >>> with none of the other original attributes. If maintaining the original >>> variable types and attributes is important, then my proposal could >>> substantially simplify your process. >>> >>> s/KAM >>> >>> P.S. - Hope you don't mind that I copied this reply back to the list. I >>> thought others might also be interested. >>> >>> >>> ----- Original Message ----- >>> From: "Ted Clay" <tclay(a)ashlandhome.net> >>> To: "'Kevin Myers'" <KevinMyers(a)AUSTIN.RR.COM> >>> Sent: Thursday, January 21, 2010 13:50 >>> Subject: RE: SASware ballot #18 Proc Transpose preserving variable >>> attributes >>> >>> >>>> Hi Kevin, >>>> Thanks for filling the gaps. Yes, what I am describing creates a >>>> differently >>>> shaped output data set, so is not equivalent to your proposal. Let any >>>> confusion on that point end here. >>>> >>>> The ability to create and work with a "self-defining" skinny data set >>>> would >>>> be great. >>>> >>>> The situation I run into all the time (say, weekly) is where the data >>>> is >>>> in >>>> groups, and I have a mixture of variable types: numeric, character, >>>> numeric >>>> date. I want to get all the data in a group onto one observation. I >>>> sometimes have an ID variable, or at other times just want numeric >>>> suffixes >>>> (1,2,3..). >>>> Would your proposal make that easier to do than it is now? My hunch is >>>> yes, >>>> but it would not be a slam-dunk: >>>> Proc transpose, with the additional options you propose on the Proc >>>> statement. >>>> Data step to calculate new variable names, and possibly labels. >>>> (Calculate >>>> numeric suffixes?) >>>> Proc transpose, with the additional ID statements you propose. >>>> >>>> I had a typo on my example -- I meant to type "ONEOUT" not "OUTOUT". I >>>> guess my dog "Spot" was tugging on my pants-leg. >>>> Proc transpose data=testing out=trans ONEOUT; >>>> By group; >>>> Var N D; >>>> ID C; >>>> Run; >>>> >>>> Ted >>>> >>>> >>>> >>>> >>>> -----Original Message----- >>>> From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of >>>> Kevin >>>> Myers >>>> Sent: Thursday, January 21, 2010 9:27 AM >>>> To: SAS-L(a)LISTSERV.UGA.EDU >>>> Subject: Re: SASware ballot #18 Proc Transpose preserving variable >>>> attributes >>>> >>>> Ted and all - >>>> >>>> The current item on the SASware ballot was put on there largely based >>>> on >>>> my >>>> input. I provided a detailed description of the suggested enhancements >>>> via >>>> SAS Institute's online Tracks system: >>>> >>>> 7610240654 - Base SAS: Additional options and statements for PROC >>>> TRANSPOSE >>>> to support transposing additional variable attributes, including >>>> format, >>>> informat, type, and length. >>>> >>>> I don't know whether you will be able to view that, but am short on >>>> time >>>> at >>>> the moment, so below is an excerpt from my original submission. Please >>>> note >>>> that my proposal does NOT require any changes to existing PROC >>>> TRANSPOSE >>>> behavior, but rather adds support for additional options and statements >>>> to >>>> support handling additional variable attributes analogous to the >>>> existing >>>> support for variable lablels. >>>> >>>> I haven't had time to review your suggestion in detail, but after >>>> briefly >>>> scanning through it I believe that you are asking for something that is >>>> substantially different in concept. If so, then I would suggest that >>>> you >>>> submit a separate enhancement suggestion, because I wouldn't want your >>>> proposal to be confused with or complicate the implementation of mine, >>>> or >>>> vice-versa. >>>> >>>> Hope this helps. >>>> Kevin M. >>>> >>>> from: kevinmyers(a)austin.rr.com >>>> name: Kevin A. Myers >>>> company: Myers Engineering >>>> product: BASE >>>> os: none >>>> suggestion text: PROC TRANSPOSE should be enhanced to support >>>> transposing >>>> additional variable attributes including format, >>>> informat, type, and length. >>>> >>>> The following new STATEMENTS should be added. These are similar to the >>>> existing idlabel statement, but provide input variable >>>> names for the other variable attributes: >>>> >>>> idformat statement - variable name to provide formats >>>> idinformat statement - variable name to provide informats >>>> idtype statement - variable name to provide variable types >>>> idlength statement - variable name to provide variable lengths >>>> >>>> The PROC TRANSPOSE statement should be enhanced to add the following >>>> additional OPTIONS, which all perform similar functions to the >>>> existing label option, but for the other variable attributes: >>>> >>>> format - output variable name for formats (default _FORMAT_) >>>> informat - output variable name for informats (default _INFORMAT_) >>>> type - output variable name for types (default _TYPE_) >>>> length - output variable name for lengths (default _LENGTH_) >>>> >>>> Providing the new statements and options listed above would allow for >>>> simple, complete transformation of a data set from >>>> "long-skinny" to "short-wide" representations AND BACK AGAIN, or >>>> vice-versa, >>>> WITHOUT LOSS OF ANY FORMAT, INFORMAT, TYPE, and LENGTH >>>> information. It would also greatly simplify >>>> input and output of data from/to "self-defining" external file formats >>>> where >>>> the format, informat, type, and/or length attributes >>>> are included within the external data file. >>>> >>>> Availability of the proposed new statements and options would allow >>>> simple >>>> loops based on "long-skinny" data representation >>>> to be used for reading/writing data in self-defining formats, >>>> followed/preceded by proc transpose steps to convert the data to/from a >>>> more >>>> typical "short-wide" representation. >>>> >>>> Code to implement the proposed statements and options should be >>>> relatively >>>> easy to develop by simply duplicating the existing >>>> support for IDLABEL/LABEL= and making relatively minor modifications as >>>> needed for each of the other specific attributes (format, >>>> informat, type, length). >>>> >>>> >>>> ----- Original Message ----- >>>> From: "Ted Clay" <tclay(a)ASHLANDHOME.NET> >>>> To: <SAS-L(a)LISTSERV.UGA.EDU> >>>> Sent: Thursday, January 21, 2010 09:33 >>>> Subject: SASware ballot #18 Proc Transpose preserving variable >>>> attributes >>>> >>>> >>>>> SASware ballot #18 refers to Proc Transpose preserving variable >>>>> attributes. >>>>> >>>>> I'm familiar with the problem and frequently use a macro to solve it, >>>>> but >>>>> would be delighted to see this feature migrate into Base SAS. >>>>> >>>>> >>>>> >>>>> I am soliciting input from SAS-L in the hopes that this change will be >>>>> implemented in the most useful way. >>>>> >>>>> Posting your comments may have some influence on a future enhancement >>>>> of >>>>> Proc Transpose. >>>>> >>>>> >>>>> >>>>> First off, I think this requires limiting the output to one >>>>> observation >>>>> per >>>>> BY-group. I can't see any way around that. >>>>> >>>>> >>>>> >>>>> Assuming that is true, one idea for implementing the change to Proc >>>>> Transpose would be to add a "ONEOUT" option on the proc statement. >>>>> >>>>> >>>>> >>>>> Example: >>>>> >>>>> Data testing; >>>>> >>>>> Input Group N C $ D mmddyy10.; >>>>> >>>>> format d mmddyy10.; >>>>> >>>>> label N='Numeric var' >>>>> >>>>> C='Text' >>>>> >>>>> D='Birthday'; >>>>> >>>>> Datalines; >>>>> >>>>> 1 57 A 01/15/2010 >>>>> >>>>> 1 84 B 02/15/2010 >>>>> >>>>> 2 33 B 03/15/2010 >>>>> >>>>> ; >>>>> >>>>> Proc transpose data=testing out=trans OUTOUT; >>>>> >>>>> By group; >>>>> >>>>> Var N D; >>>>> >>>>> Run; >>>>> >>>>> >>>>> >>>>> Proc print data=trans; >>>>> >>>>> Would produce: >>>>> >>>>> Obs Group N1 D1 N2 D2 >>>>> >>>>> >>>>> >>>>> 1 1 57 01/15/2010 84 02/15/2010 >>>>> >>>>> 2 2 33 03/15/2010 >>>>> >>>>> >>>>> >>>>> With the ID statement: >>>>> >>>>> Proc transpose data=testing out=trans OUTOUT; >>>>> >>>>> By group; >>>>> >>>>> Var N D; >>>>> >>>>> ID C; >>>>> >>>>> Run; >>>>> >>>>> >>>>> >>>>> Would produce: >>>>> >>>>> Obs Group NA DA NB DB >>>>> >>>>> >>>>> >>>>> 1 1 57 01/15/2010 84 02/15/2010 >>>>> >>>>> 2 2 33 03/15/2010 >>>>> >>>>> >>>>> >>>>> Other possible options come up in naming the output variables: (1) ID >>>>> values >>>>> before names (2) a separator between ID values and names, and (3) >>>>> whether >>>>> and how to modify labels. >>>>> >>>> >>>> >>>> >>> >>> >>> >> >> >> >> >> > > > > >
From: Kevin Myers on 24 Jan 2010 14:50 I am adding your latest comments to the proposal so that they can be considered by SI. Note that the largest integer which can be exactly represented by SAS is platform-specific. If they choose to implement this suggestion, they could select a BEST. width that is wide enough to handle the maximum precision of all available platforms. ----- Original Message ----- From: "Ted Clay" <tclay(a)ashlandhome.net> To: "'Kevin Myers'" <KevinMyers(a)AUSTIN.RR.COM>; <SAS-L(a)LISTSERV.UGA.EDU> Sent: Sunday, January 24, 2010 13:15 Subject: RE: SASware ballot #18 Proc Transpose preserving variable attributes > Kevin, > Under the circumstance of transposing both numeric and character > variables, > which forces a numeric-to-character conversion, some numeric values are > converted with loss of information. So "NOFORMAT" would be necessary to > prevent this. Experimenting with the largest possible integer in SAS > (9,007,199,254,740,992), it seems to be converted using a default format > of > BEST12, and its original value can't be recovered. You may want state > explicitly in your proposal, with "NOFORMAT", not only would any > associated > format be ignored, but no other format would be introduced which would > cause > a loss of numeric precision. In my opinion, NOFORMAT should be a feature > of > Proc Transpose no matter what. > > Here is the demo: > data test; > c='123456789012346789012345'; > N1=9007199254740992; > run; > proc transpose data=test out=trans; > var c n1; > run; > proc print; > PRODUCES: > Obs _NAME_ COL1 > > 1 c 123456789012346789012345 > 2 N1 9.0071993E15 > > On a related aspect of "preserving variable attributes" in Proc Transpose > as > it is now, I notice that when all the variables on the VAR statement have > the same format, this gets preserved. Length is also handled well. But > informats and labels are not. Even identical labels are ignored. The > problem with INFORMATs may be an oversight on SAS's part, and ignoring > identical labels is also kind of dumb. I think whenever any attribute is > the same on all variables on the VAR statement, they should carry through. > > You are correct that if one turns on all the features in your proposal > using > a "FULLINFO" option, you end up with variables you don't always need, and > would have to drop them. All I can say is that if it were a feature, I > would use it. > > Ted > >
From: Kevin Myers on 25 Jan 2010 16:23 Oh come on Null... As mentioned in the message, the example program is merely an overly-simplistic test case strictly intended to illustrate and exercise all of the necessary features. Any real application would include additional logic between the two transpose steps, or read the desired attributes values from a "self-defining" external file. The real trick is for you to simulate the *second* transpose step with comparably simple code of your own. For you to say that the proposed TRANSPOSE enhancements are not needed, that is what you must be able to accomplish. So, are you up to the challenge? ----- Original Message ----- From: "Data _null_;" <iebupdte(a)GMAIL.COM> To: <SAS-L(a)LISTSERV.UGA.EDU> Sent: Monday, January 25, 2010 15:12 Subject: Re: SASware ballot #18 Proc Transpose preserving variable attributes > On 1/25/10, Kevin Myers <KevinMyers(a)austin.rr.com> wrote: >> proc transpose data=original out=skinny type length format informat force >> noformat; by obsno; >> var _all_; run; >> >> proc transpose data=skinny out=final noinformat; by obsno; >> id _NAME_; idlabel _LABEL_; idtype _TYPE_; idlength _LENGTH_; idformat >> _FORMAT_; idinformat _INFORMAT_; >> var col1; run; >> >> If you can manage to accomplish something similar with code that is >> anywhere >> remotely close to something this simple, then you win a (virtual) cookie. > > Seems obvious: data=original >
From: "Data _null_;" on 25 Jan 2010 16:12
On 1/25/10, Kevin Myers <KevinMyers(a)austin.rr.com> wrote: > proc transpose data=original out=skinny type length format informat force > noformat; by obsno; > var _all_; run; > > proc transpose data=skinny out=final noinformat; by obsno; > id _NAME_; idlabel _LABEL_; idtype _TYPE_; idlength _LENGTH_; idformat > _FORMAT_; idinformat _INFORMAT_; > var col1; run; > > If you can manage to accomplish something similar with code that is anywhere > remotely close to something this simple, then you win a (virtual) cookie. Seems obvious: data=original |