From: John Lu on 4 Jun 2010 02:47 Assuming we know there are multiple transition of id as in the table "transition". How to merge the records to create a complete history? There are large number of iteration in the origianl data, for example, a id can go through 60+ transitions. So I am looking for a solutoin that is more efficient than left joins. data transition; input id_prev id maxtermnumber d_prev_obs yymmdd10.; format d_prev_obs date9.; datalines; 1 2 1 2001-01-01 6 9 6 2006-01-01 2 3 2 2002-01-01 3 4 3 2003-01-01 4 5 4 2004-01-01 5 . 3 2005-01-01 9 . 5 2009-01-01 10 . 1 2010-01-01 ; run; /*what we want to see*/ data history; input id_first id totaltermnumber d_prev_obs yymmdd10.; format d_prev_obs date9.; datalines; 1 1 1 2001-01-01 1 2 3 2001-01-01 1 3 6 2001-01-01 1 4 10 2001-01-01 1 5 13 2001-01-01 6 6 6 2006-01-01 6 9 11 2006-01-01 10 10 1 2010-01-01 ; run; thanks and regards
From: Patrick on 4 Jun 2010 08:04 Hi Not sure if I understand your question so just a few thoughts here. It looks to me as if this is about keeping history in a table. What SAS in its own solutions does in such a case (i.e. the SAS detailed data store for banking) is using a composite key in the following way: - Retained_Key (that would be your ID), valid_from_dttm, valid_to_dttm. The most current record has a value for valid_to_dttm in the far future. When you add a new record you expire the current record by setting the valid_to_dttm to the value of the valid_from_dttm - and you add the new record with a valid_to_dttm with a value in the far future. You can also use an additional variable like current_record_flag (which is 1 for the current record) for an easier way to filter for the current record. This way of loading/storing data in a table is called SCD type 2 (you'll find quite a bit of theory using this search term). SAS provides special ready made transformations for this for the ones using SAS DI Studio. If you don't have DI Studio you would have to code it yourself (it's quite a bit of coding as SCD2 loading is not that simple). In general: I believe SCD2 is the way you should store and maintain your data if you want to keep history in your table. The way you're doing it is in a way an attempt to have SCD2 but with the flaw that you have to modify all records which belong to an id in case that you want to add a new record with the same id. HTH Patrick
|
Pages: 1 Prev: Comparison of regression slopes and intercepts Next: Try Free Project Management Software |