From: TripleK on 5 Aug 2010 13:52 All, Can we merge a dataset with a veiw and create a permanent dataset? Please see the code below . Any any help would be appreciated. Thanks Kiran. rsubmit; data dispositions_2010; set mad_p.dispositions_2010(where=(le_leadid ne . and CI_CONTACTTYPE_DESC ne 'Account Review' and CHI_STORE_NUM in (9550,9551) )); run; proc sql; create view inserts_2010 as select le_leadid, le_leadinsertdate, bu_bucketdesc, le_housingstat, RE_TIER_DESC, AUTO_TIER_DESC, LE_BRANCHNUM ,le_source from mad_P.inserts_2010 order by le_leadid, le_leadinsertdate; select * from inserts_2010; quit; data save.dispositions; merge dispositions_2010(in=a) inserts_2010(in=b); by le_leadid le_leadinsertdate; if a; run; endrsubmit;
From: Patrick on 5 Aug 2010 16:46 Yes, you can. I do it all the time. Your code looks right to me. Is something not working? In case "dispositions_2010" isn't needed later on as work table then you could also use a view here - and merge the 2 views.
From: TripleK on 6 Aug 2010 11:35 On Aug 5, 3:46 pm, Patrick <patrick.mat...(a)gmx.ch> wrote: > Yes, you can. I do it all the time. > > Your code looks right to me. Is something not working? > > In case "dispositions_2010" isn't needed later on as work table then > you could also use a view here - and merge the 2 views. Patrick, Thanks for your response. The last step, writing into permanate dataset is taking more time (not completeing). I submited the job at 3 hours earlier still not completed. But when I am a merging with datasets it used to take 2 hours. The dispostions have 40k records with 50 variables and inserts have 50 million records with 7 variables. So merging with views take more time? -Kiran.
From: Patrick on 8 Aug 2010 04:09 Hi Kiran A SQL view is in a way nothing else than some encapsulated SQL code which gets executed when you call the view. The good thing about a view is that it's executed in memory which reduces disk I\O - so it should actually perform better than first writing a table to disk and then reading this table again for a SAS merge. The problem you could have: As these are 50M observations you might not have enough memory to execute the view - and then the memory has to write and read it's content to disk (paging). This can result in a performance which is worse than creating a table first. Just some thoughts: - In case these 2 tables are in a DB then you might be better off to formulate this join as a SQL left join (else the whole 50M rows get first loaded into SAS). If only the "inserts" table is in a DB then upload first your small table into the DB and then do the join there. - The resource intensive step is sorting these 50M records. It might be worth to create an index over your join keys (on both tables) and then use this index for joining (there are examples in the SAS doc of how to do this). - Another way could be (if both tables in SAS): In a data step you create a hash table out of the small table "dispositions_2010", you iterate through the big table "inserts" as is (no pre-sorting, set inserts(keep=<keys><additional wanted vars>);), you only write the records to the output table where there was a key match with the lookup table (rc=h.find(); output if rc=0). In case you're not 100% sure that there will be for all records in the hash table a matching record in the inserts table then you might have to re-join the original table "dispositions_2010" with the result table (which should be in the same structure like "inserts" plus the addtional variables and may be some missing records -> a SQL <dispositions_2010> union corr <result table> should fix this). HTH Patrick
From: Patrick on 8 Aug 2010 04:20 For the hash aproach: As there are 50 vars in "dispositions_2010" I actually think it would be better to only load the keys of "dispositions_2010" into a hash, and then use rc=h.check(). The result table could then be used in a SQL join with the original "dispositions_2010" -> dispositions_2010 as l left join <result table> as r on l.le_leadid= r.le_leadid and l.le_leadinsertdate=r.le_leadinsertdate
|
Pages: 1 Prev: best way to pass in list of values as parameter for macro Next: LDAPS Authentication |