From: Mike Rhoads on 17 Jan 2007 11:20

Venky,

Interesting point! I don't mind the behavior you are describing (ignoring leading zeroes in otherwise-valid date values) at all.

However, my issue is different. In your examples a and c below, the entire contents of the specified 9-column (for a) and 11-column (for c) widths are valid, once the leading zeroes are ignored (and the system assumes a 2-digit year if necessary). The ABCDs are irrelevant, because they are past the end of the number of columns SAS was told to process. In b, on the other hand, the "invalid" characters are within the specified field width.

The behavior still seems undesirable to me, as well as inconsistent with the way other informats seem to work. Off the top of my head I can't think of a situation where the current behavior of ignoring invalid characters is useful, but maybe someone else can come up with an example.

Mike Rhoads
Westat
RhoadsM1(a)Westat.com

-----Original Message-----
From: Venky Chakravarthy [mailto:swovcc(a)HOTMAIL.COM]
Sent: Wednesday, January 17, 2007 10:09 AM
To: SAS-L(a)LISTSERV.UGA.EDU; Mike Rhoads
Subject: Re: Safe way to test if a date is valid ?

Hi Mike,

That was my feeling too. However, going back a few years on this list, I seem to recall another thread that was similar and I think I can apply that explanation here to make some sense of it. Note that the documentation is clear on stating that leading zeroes do not affect numeric values. So when you write input('1JAN5ABCD',DATE9.) it is similar to a and c below when viewed in the context of the YEARCUTOFF option:

data _null_ ;
   a = input ('01JAN0005ABCD',DATE9.) ;
   b = input ('1JAN5ABCD',DATE9.) ;
   c = input ('0001JAN0005ABCD',DATE11.) ;
   put a=date9. b=date9. c=date9. ;
run ;

Which yields the following in the log:

a=01JAN2005 b=01JAN2005 c=01JAN2005

I agree that this behavior has a strange feel to it but I think it is also instrumental in rendering some of the flexibility to SAS in reading in date values. I guess there is more good than bad coming out of this behavior so it may be more of a feature than a bug. Here is my response to a similar question from a previous thread in 2001 and the link to a SAS Note that details this behavior.

http://listserv.uga.edu/cgi-bin/wa?A2=ind0112B&L=sas-l&P=R6447
http://support.sas.com/techsup/unotes/V6/0/0573.html

Regards,
Venky

On Wed, 17 Jan 2007 08:33:10 -0500, Mike Rhoads <RHOADSM1(a)WESTAT.COM> wrote:

>Scott,
>
>This seems odd to me as well.
>
>INPUT ('123XYZ',6.) returns an error, even though within the specified
>width there is a valid numeric value prior to the invalid characters.
>Off the top of my head, I can't think of any good reason why INPUT
>('1JAN5ABCD',DATE9.) should not return an error. The underlying
>situation appears to be the same. (Trailing blanks should be OK, of
>course.)
>
>Mike Rhoads
>Westat
>RhoadsM1(a)Westat.com
>
>-----Original Message-----
>From: owner-sas-l(a)listserv.uga.edu [mailto:owner-sas-l(a)listserv.uga.edu]
>On Behalf Of Scott Barry
>Sent: Tuesday, January 16, 2007 5:20 PM
>To: SAS-L ListServ Group
>Subject: Re: Safe way to test if a date is valid ?
>
>...
>
>Also, I would consider the SAS DATE INFORMAT processing of
>alpha-characters in the year to be a serious defect without question.
>Hopefully SAS Institute can understand why, presuming someone
>reported the behavior?
>
>For many years I've used the DATA step technique with INPUT function and
>checking _ERROR_ to validate a user-specified date string as being
>correct. To see Dan's post is disquieting.
>
>Sincerely,
>
>Scott Barry
>SBBWorks, Inc.

From: Ian Whitlock on 17 Jan 2007 11:42

Summary: I see it as a philosophical language problem.
#iw-value=1

Mike,

I seem to remember you sitting beside me in a SUGI BOF some years ago when Rick Langston explained that the various date informats were, by design, triggered to end when a non-date symbol occurred.

I agree that INPUT ('1JAN5ABCD',DATE9.) is a little hard to swallow, but informats are for reading data, where it is somewhat easier to understand. Since you find a blank acceptable, what do you say to INPUT ('1JAN5 BCD',DATE9.)? I find it hard to argue with

data _null_ ;
   input chk $15. @1 dt datetime. ;
   put chk= dt= datetime20. ;
   cards ;
1JAN5,15:35:40
;

chk=1JAN5,15:35:40 dt=01JAN2005:15:35:40

Perhaps one should have an agreed small set of allowable symbols that can trigger the end of a date, but it seems somewhat like a "mother may I" approach to the language, which SAS has not traditionally adopted.

I think there are arguments for a permissive language and for a very strict "you conform or else" language, but SAS has already made that decision. The handling of date, time and date/time informats is consistent with that decision, albeit not so well documented.

Ian Whitlock

From: Mike Rhoads on 17 Jan 2007 14:53

Ian,

What a memory!!! ;-) I certainly have no reason to doubt your recollection.

The "end at a non-date symbol" paradigm certainly explains this behavior, and Rick would be the person to know. And, given how long this has been the case, it is unlikely to change now.

Being the stubborn fellow that I am, however, I can still see no good reason why trailing alpha characters should produce "valid" results with date informats, when they don't with simple numeric informats. For instance:

DATA _NULL_;
   INPUT @1 D DATE11. @15 N 6. @22 X $CHAR1.;
   PUT N=6. D=YYMMDD10.;
   CARDS;
15APR2007     123    X
15APR2007XYZ  123XYZ X
RUN;

In other words, SAS could have a "permissive" philosophy where the idea is to come up with acceptable input whenever there is some remotely plausible way of doing so. In some respects it does, such as allowing leading and trailing blanks. But here, with trailing "invalid" characters other than blanks, the behavior with dates does not seem to be consistent with that used with non-date values.

I would also reject your '1JAN5 BCD', if SAS is instructed to read 9 characters. There is, of course, a difference between "formatted" and "list" input -- with the latter, it's certainly acceptable to read fewer characters than specified if a field delimiter (say a blank or comma, depending on context) is encountered. I'd argue, however, that the INPUT function implies "formatted" input, where SAS should read and act upon the full number of characters specified (or implied) with the informat, unless the string is shorter than that.

BTW, to make things even a little stranger, Venky found the following in the documentation:

"SAS can read date and time values that are delimited by the following characters: ! # $ % & ( ) * + - . / : ; < = > ? [ \ ] ^ _ { | } ~ The blank character can also be used."
(http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a002200738.htm )

So, INPUT('15(APR(2007',DATE11.) returns a valid date!

Mike Rhoads
Westat
RhoadsM1(a)Westat.com

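For readers who want the stricter "read the whole field or fail" behavior argued for above, one possible approach is a round trip: read the string with the informat, write the result back out with DATE9., and accept the input only if the two spellings match. The following is a minimal sketch, not a SAS-supplied feature. The variable names raw, dt and strict_ok are hypothetical, and the rule that only the canonical ddMONyyyy spelling is acceptable is an assumption that also rejects otherwise legitimate forms such as '1JAN2005' or '01JAN05'.

```sas
data _null_;
   length raw $20;
   do raw = '15APR2007', '15APR2007XYZ', '1JAN5ABCD', '15(APR(2007';
      /* ?? suppresses the invalid-data note and leaves _ERROR_ alone */
      dt = input(raw, ?? date11.);
      /* Assumed strict rule: the value must round-trip to the same  */
      /* text, so trailing junk or odd delimiters cannot match       */
      strict_ok = (not missing(dt)) and (put(dt, date9.) = upcase(strip(raw)));
      put raw= dt= date9. strict_ok=;
   end;
run;
```

Because the comparison is made against the text the DATE9. format writes, the check does not depend on what the informat decides to do with trailing characters; any string that is not already in canonical form should come out with strict_ok=0.
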
From: Scott Barry on 19 Jan 2007 13:00

A follow-up on my SAS support issue/track regarding the behavior of the DATEw. and MMDDYYw. informats (input as a character string) with "invalid characters" in the year portion. SAS Support pointed me to a few tech notes that appear to substantiate the "working as designed" behavior, though the SAS 9 Language Dictionary is in conflict (without any warning otherwise), stating "must be in the form...two-digit or four-digit...year."

When I asked about an alternate INFORMAT I can trust to validate an incoming date string, none was offered; instead I was told to consider using a character function (e.g., VERIFY) to ensure the year portion is intact and as expected. How unfortunate.

Also, I was given no justification as to why the DATEw. and MMDDYYw. INFORMATs behave differently when an invalid year portion ends in a character (a possible condition with a masked date string?), as illustrated below, where mmddyy10. considered the year string "2xxx" to invalidate the date translation:

1    data _null_;
2    format dtvalue date9.;
3    dtvalue = input('01jan2xxx',date9.);
4    put _all_;
5    dtvalue = input('01/01/2xxx',mmddyy10.);
6    put _all_;
7    run;

dtvalue=01JAN2002 _ERROR_=0 _N_=1
NOTE: Invalid argument to function INPUT at line 5 column 12.
dtvalue=. _ERROR_=1 _N_=1
dtvalue=. _ERROR_=1 _N_=1
NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 5:12
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds

Oh, well....looks like "user beware".

Sincerely,

Scott Barry
SBBWorks, Inc.

___________________________

SN-V8+-017774
DATEw. informat does not produce INVALID DATA message when alpha character follows year value
--------------------------------------------------------------------------------

The DATEw. informat reads date values structured like "02jan06", which represents January 2, 2006. If there is an alpha character after the year value, the informat reads the value without producing an "invalid data" note in the SAS log. For example:

data _null_;
   date=input('16jan5q',date7.);
   put date= mmddyy10.;
run;

produces:

date=01/16/2005

The 'q' did not cause a problem when reading the value because the day value had been satisfied. The informat has behaved this way since SAS 82.4. The behavior is "by design", and not considered a bug.

*****************************************************************

MMDDYY INFORMAT discussion - SAS 9 Language Reference Dictionary:

Details

The date values must be in the form mmddyy or mmddyyyy, where

mm
   is an integer from 01 through 12 that represents the month.
dd
   is an integer from 01 through 31 that represents the day of the month.
yy or yyyy
   is a two-digit or four-digit integer that represents the year.

You can separate the month, day, and year fields by blanks or by special characters. However, if you use delimiters, place them between all fields in the value. Blanks can also be placed before and after the date.

Note: SAS interprets a two-digit year as belonging to the 100-year span that is defined by the YEARCUTOFF= system option.

*****************************************************************

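The workaround suggested by SAS Support above (check the characters yourself before trusting the informat) could look roughly like the sketch below. It swaps in PRXMATCH for the VERIFY function mentioned in the post, since a single pattern can test the whole string at once. The variable names raw, shape_ok and dt are hypothetical, and the accepted shape (one- or two-digit day, three-letter month, two- or four-digit year, nothing else) is an assumption to adjust to your own data.

```sas
data _null_;
   length raw $20;
   do raw = '01jan2002', '01jan2xxx', '16jan5q', '1JAN5ABCD';
      /* Assumed shape: d or dd, three letters, then yy or yyyy, and */
      /* no other characters once trailing blanks are stripped       */
      shape_ok = prxmatch('/^\d{1,2}[a-z]{3}(\d{2}|\d{4})$/i', strip(raw)) > 0;
      /* Only strings with the right shape reach the informat; ??    */
      /* keeps a bad month name from setting _ERROR_ or writing notes */
      if shape_ok then dt = input(raw, ?? date9.);
      else dt = .;
      put raw= shape_ok= dt= date9.;
   end;
run;
```

The pattern only checks the layout of the string. Whether the month abbreviation and the day number form a real calendar date is still left to the DATE9. informat, which is why dt should be tested for a missing value afterwards.
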
From: Ya Huang on 19 Jan 2007 13:38

>When I asked about an alternate INFORMAT I can trust to validate an
>incoming date string, none was offered and I should consider using a
>character function (e.g., VERIFY) to ensure the year-portion is intact
>and as expected. How unfortunate.

Not even for v9?