Prev: Multiple position - C++.net Technical Lead -- Lake Forest CA - 12 Months Contract
Next: Execute command stored in variable
From: hewei2004 on 7 Jul 2010 17:08 Hi, Suppose that I have the following file datasource brand quarter sale 1 toyota 1977q1 15 1 toyota 1977q2 25 1 toyota 1977q3 35 1 ford 1977q1 45 1 ford 1977q2 55 1 ford 1977q3 65 2 toyota 1977q1 15 2 toyota 1977q2 25 2 toyota 1977q3 99 3 toyota 1977q3 99 I want to create a file that merges the information in datasource=1 and datasource=2 and datasource=3, and create a variable called conflict_flag when datasource1 or datasource2 or datasource3 disagrees. note that datasource=1 agrees with datasource=2 agrees on 1977q1 and 1977q2, thus conflict_flag=0 note that datasource=1 agrees with datasource=2 disagrees on 1977q3, thus conflict_flag=1 The desired output is: conflict_flag brand quarter sale 0 toyota 1977q1 15 0 toyota 1977q2 25 1 toyota 1977q3 [mode of (35, 99, 99 ), which is 99 of course] 0 ford 1977q1 45 0 ford 1977q2 55 0 ford 1977q3 65
From: Arthur Tabachneck on 7 Jul 2010 17:44 Hewei, What if there are multiple modes? Art ------------- On Jul 7, 5:08 pm, hewei2004 <hewei2...(a)gmail.com> wrote: > Hi, > > Suppose that I have the following file > > datasource brand quarter sale > 1 toyota 1977q1 15 > 1 toyota 1977q2 25 > 1 toyota 1977q3 35 > 1 ford 1977q1 45 > 1 ford 1977q2 55 > 1 ford 1977q3 65 > 2 toyota 1977q1 15 > 2 toyota 1977q2 25 > 2 toyota 1977q3 99 > 3 toyota 1977q3 99 > > I want to create a file that merges the information in datasource=1 > and datasource=2 and datasource=3, and create a variable called > conflict_flag when datasource1 or datasource2 or datasource3 > disagrees. > > note that datasource=1 agrees with datasource=2 agrees on 1977q1 and > 1977q2, thus conflict_flag=0 > note that datasource=1 agrees with datasource=2 disagrees on 1977q3, > thus conflict_flag=1 > > The desired output is: > conflict_flag brand quarter sale > 0 toyota 1977q1 15 > 0 toyota 1977q2 25 > 1 toyota 1977q3 [mode of (35, 99, 99 ), which is 99 of > course] > 0 ford 1977q1 45 > 0 ford 1977q2 55 > 0 ford 1977q3 65
From: hewei2004 on 7 Jul 2010 20:00 On Jul 7, 5:44 pm, Arthur Tabachneck <art...(a)netscape.net> wrote: > Hewei, > > What if there are multiple modes? > > Art > ------------- Then use the smallest of the mode please. Thank you.
From: Arthur Tabachneck on 8 Jul 2010 10:41 Hewei, There is probably a much easier and straight forward way of doing what you want but, since no one else has responded, I'll offer the following possible solution: data have; input datasource brand $ quarter $ sale; cards; 1 toyota 1977q1 15 1 toyota 1977q2 25 1 toyota 1977q3 35 1 ford 1977q1 45 1 ford 1977q2 55 1 ford 1977q3 65 2 toyota 1977q1 15 2 toyota 1977q2 25 2 toyota 1977q3 99 3 toyota 1977q3 99 ; proc sort data=have; by brand quarter; run; proc freq data=have noprint; tables sale/out=mode ( rename=(sale=want_sale) drop=percent); by brand quarter; run; proc sort data=mode; by brand quarter descending count want_sale; run; proc sort data=mode nodupkey; by brand quarter; run; data have1 have2 have3; set have; if datasource eq 1 then output have1; else if datasource eq 2 then output have2; else output have3; run; data want; merge have1 (rename=(sale=sale1)) have2 (rename=(sale=sale2)) have3 (rename=(sale=sale3)); by brand quarter; run; data want; merge want mode; by brand quarter; run; data want (drop=aa: count i sale: datasource); array aa_sales(3); array a_sales(*) sale1-sale3; set want; do i=1 to dim(a_sales); aa_sales(i)=a_sales(i); end; call sortn (of aa_sales[*]); if mean(of sale:) ne aa_sales(dim(a_sales)) then conflict_flag=1; else conflict_flag=0; run; HTH, Art ----------- On Jul 7, 8:00 pm, hewei2004 <hewei2...(a)gmail.com> wrote: > On Jul 7, 5:44 pm, Arthur Tabachneck <art...(a)netscape.net> wrote: > > > Hewei, > > > What if there are multiple modes? > > > Art > > ------------- > > Then use the smallest of the mode please. > Thank you.
From: Richard A. DeVenezia on 12 Jul 2010 13:19
On Jul 7, 5:08 pm, hewei2004 <hewei2...(a)gmail.com> wrote: > Hi, > > Suppose that I have the following file > > datasource brand quarter sale > 1 toyota 1977q1 15 > 1 toyota 1977q2 25 > 1 toyota 1977q3 35 > 1 ford 1977q1 45 > 1 ford 1977q2 55 > 1 ford 1977q3 65 > 2 toyota 1977q1 15 > 2 toyota 1977q2 25 > 2 toyota 1977q3 99 > 3 toyota 1977q3 99 > > I want to create a file that merges the information in datasource=1 > and datasource=2 and datasource=3, and create a variable called > conflict_flag when datasource1 or datasource2 or datasource3 > disagrees. > > note that datasource=1 agrees with datasource=2 agrees on 1977q1 and > 1977q2, thus conflict_flag=0 > note that datasource=1 agrees with datasource=2 disagrees on 1977q3, > thus conflict_flag=1 > > The desired output is: > conflict_flag brand quarter sale > 0 toyota 1977q1 15 > 0 toyota 1977q2 25 > 1 toyota 1977q3 [mode of (35, 99, 99 ), which is 99 of > course] > 0 ford 1977q1 45 > 0 ford 1977q2 55 > 0 ford 1977q3 65 A self MERGE with partitioning WHERE= options can be used to perform the conflict discovery. Note that there is not a MODE function. This sample uses explicit logic coding to compute the MODE of three values. ---------- data have; input datasource brand $ quarter $ sale ; datalines; 1 toyota 1977q1 15 1 toyota 1977q2 25 1 toyota 1977q3 35 1 ford 1977q1 45 1 ford 1977q2 55 1 ford 1977q3 65 2 toyota 1977q1 15 2 toyota 1977q2 25 2 toyota 1977q3 99 3 toyota 1977q3 99 run; proc sort data=have; by datasource brand quarter; run; data compare; merge have (where=(datasource=1) rename=(sale=sale1) in = in1) have (where=(datasource=2) rename=(sale=sale2) in = in2) have (where=(datasource=3) rename=(sale=sale3) in = in3) ; by brand quarter; if min(of sale1-sale3) = max(of sale1-sale3) then sale = min(of sale1-sale3); else if in1 & in2 & in3 then do; * 3-item mode, explicit cases; if sale1 = sale2 then sale = sale1; else if sale1 = sale3 then sale = sale1; else if sale2 = sale3 then sale = sale2; else sale = min (of sale1-sale3); conflict_flag = 1; end; else do; sale = min (of sale1-sale3); conflict_flag = 1; end; drop datasource sale1-sale3; run; ---------- Richard A. DeVenezia http://www.devenezia.com |