From: Mike Zdeb on 19 Feb 2010 15:02

hi ... here's another idea ... it's similar to Ya Huang's idea

so, first, if the lines do not cross ... it's sort of easy if you understand how GPLOT uses the AREAS option ...

here's an example from the on-line help (with a few observations eliminated) where the two lines do not cross ...

****************************************;
data stocks;
input year high low @@;
datalines;
1960 685.47 568.05 1961 734.91 610.25 1962 726.01 535.76 1963 767.21 646.79
1964 891.71 768.08 1965 969.26 840.59 1966 995.15 744.32 1967 943.08 786.41
1968 985.21 825.13 1969 968.85 769.93 1970 842.00 631.16 1971 950.82 797.97
1972 1036.27 889.15 1973 1051.70 788.31 1974 891.66 577.60 1975 881.81 632.04
1976 1014.79 858.71 1977 999.75 800.85 1978 907.74 742.12 1979 897.61 796.67
1980 1000.17 759.13 1981 1024.05 824.01 1982 1070.55 776.92 1983 1287.20 1027.04
1984 1286.64 1086.57 1985 1553.10 1184.96 1986 1955.57 1502.29 1987 2722.42 1738.74
1988 2183.50 1879.14 1989 2791.41 2144.64 1990 2999.75 2365.10
;
run;

goptions reset=all;

axis1 order=(1960 to 1990 by 5) offset=(2,2)
      label=none
      major=(height=2)
      minor=(height=1);

axis2 order=(0 to 4000 by 1000) offset=(0,0)
      label=none
      major=(height=2)
      minor=(height=1);

pattern1 v=s c=white;
pattern2 v=s c=graydd;

symbol1 i=join c=black;
symbol2 i=join c=black;
symbol3 i=join c=red w=2;
symbol4 i=join c=red w=2;

proc gplot data=stocks;
plot (low high low high) * year / overlay haxis=axis1 hminor=4 vaxis=axis2 vminor=1 caxis=black areas=2;
run;
quit;
****************************************;

the PLOT statement uses both LOW and HIGH twice plus an AREAS=2 option

GPLOT uses the 2nd PATTERN 1st since it draws the GRAY area (pattern color GRAYDD) for the y-variable HIGH first
then it uses the 1st PATTERN which is WHITE for the y-variable LOW ... that white covers the gray area up to the level of the y-variable LOW

as Ya pointed out, you can get the lines added by using the variables a second time (SYMBOLS 3 and 4 are used)

for your data, the lines cross and you want the colors to change

so, you can INVENT a new variable that will cover over shaded areas with WHITE (as done above), but its value has to be that of the lower value at each age value ...

****************************************;
data foo;
* use $char to preserve the leading space for " < 20" ... so it shows up on the LEFT;
input age $char5. y1 y2;
* new variable ... set y3 to the lower of the two values;
y3 = y1*(y1 le y2) + y2*(y2 lt y1);
datalines;
 <20  .00 .00
20-24 .01 .00
25-29 .03 .01
30-34 .04 .01
35-39 .05 .02
40-44 .06 .04
45-49 .08 .06
50-54 .08 .09
55-59 .09 .16
60-64 .15 .29
65-69 .10 .15
70-74 .08 .08
75-79 .09 .06
80-84 .07 .03
85+   .06 .02
;
run;

goptions reset=all gunit=pct h=2;

symbol1 i=j c=black;
symbol2 i=j c=black;
symbol3 i=j c=black;
symbol4 i=j f=marker v='V' c=black;
symbol5 i=j f=marker v='W' c=black;

pattern1 v=s c=white;
pattern2 v=s c=grayee;
pattern3 v=s c=gray88;

axis1 label=(a=90 'Y') order=0 to 0.3 by 0.05 minor=(n=4);
axis2 label=('AGE GROUP') offset=(0,0)pct;

* add some border white space;
title1 ls=1;
title2 a=90 ls=1;
title3 a=-90 ls=1;
footnote1 ls=1;

proc gplot data=foo;
plot (y3 y2 y1 y2 y1) * age /overlay areas=3 vaxis=axis1 haxis=axis2;
run;
quit;
****************************************;

this time there are FIVE variables plotted ... there are 3 areas and remember that

the 3rd variable will be shaded first (that's Y1 and the area up to that line is dark gray ... gray88)
the 2nd variable is shaded next (that's Y2 and the area up to that line is light gray ... grayee, and it covers part of the area shaded with dark gray)
the 1st variable is shaded last (that's Y3, the lowest value at each age level and the area is WHITE, so the white covers all the area up to that lower line)

then Y2 and Y1 use the 4th and 5th symbol and add the actual lines for Y2 and Y1 (if you don't need the lines then just plot the first three variables in the list)

that all works, but SORT OF since the point at which the lines cross on the left side of the plot occurs BETWEEN age groups 45-49 and 50-54 ... when they cross on the right side, it's at a single point, at age group 70-74

if you look at the results you'll see a small area of light gray near the left crossing point that should not be there ... best you can do with this method

or, you can 'adjust' your data and make Y1 = Y2 at either 45-49 or 50-54 and it'll look 'perfect'

--
Mike Zdeb
U(a)Albany School of Public Health
One University Place
Rensselaer, New York 12144-3456
P/518-402-6479 F/630-604-1475
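A side note on the Y3 assignment in the DATA step above: the Boolean arithmetic picks the lower of Y1 and Y2, and for complete data the MIN function should give the same result. The sketch below compares the two on three made-up rows (the dataset name CHECK and the variables Y3_BOOL and Y3_MIN are illustration only, not part of the posted code). One difference to keep in mind: if either value is missing, the Boolean expression returns missing, while MIN ignores the missing argument and returns the other value.

```sas
/* Compare the Boolean expression with MIN on a few sample rows. */
/* CHECK, Y3_BOOL and Y3_MIN are hypothetical names.             */
data check;
   input y1 y2;
   y3_bool = y1*(y1 le y2) + y2*(y2 lt y1);  /* expression used in the post */
   y3_min  = min(y1, y2);                    /* built-in alternative        */
datalines;
.08 .06
.08 .09
.08 .08
;
run;

proc print data=check;
run;
```

Either form works for the data in this thread, since none of the values are missing.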
From: Bill McKirgan on 19 Feb 2010 16:45

On Feb 19, 2:02 pm, ms...(a)albany.edu (Mike Zdeb) wrote:
> hi ... here's another idea ... it's similar to Ya Huang's idea
> [...]

Mike,

Thank you for another fine idea that builds on Ya Huang's suggestion. The information about where to find this in the documentation is just as helpful to me as the solution you crafted for this particular graph problem.

Thanks again to all of you for your help, and please forgive my many spelling errors above.

Bill McKirgan
From: Mike Zdeb on 19 Feb 2010 16:21

hi ... OK, on further thought (and more coffee) ... in the previous post I said ...

"if you look at the results you'll see a small area of light gray near the left crossing point that should not be there ... best you can do with this method

or, you can 'adjust' your data and make Y1 = Y2 at either 45-49 or 50-54 and it'll look 'perfect'"

here's another idea ... not a general solution, but with these data you can figure out where the lines cross between age groups by creating a numeric x-variable and then use a format to show the age groups on the plot ... this will look "perfect" without changing the original data

proc format;
value age
1 = '<20'
2 = '20-24'
3 = '25-29'
4 = '30-34'
5 = '35-39'
6 = '40-44'
7 = '45-49'
8 = '50-54'
9 = '55-59'
10= '60-64'
11= '65-69'
12= '70-74'
13= '75-79'
14= '80-84'
15= '85+'
;
run;

data foo;
input y1 y2 @@;
* set y3 to the lower of the two values;
y3 = y1*(y1 le y2) + y2*(y2 lt y1);
* x counts the observations 1, 2, ... and is matched to the age format;
x + 1;
datalines;
.00 .00
.01 .00
.03 .01
.04 .01
.05 .02
.06 .04
.08 .06
.08 .09
.09 .16
.15 .29
.10 .15
.08 .08
.09 .06
.07 .03
.06 .02
;
run;

* where the lines cross between X=7 and X=8;
* after the last observation is read, one extra point is added for y3 only;
data foo;
if last then do;
   x = 7.655;
   y3 = 0.08;
   call missing (y1,y2);
   output;
end;
set foo end=last;
output;
run;

proc sort data=foo;
by x;
run;

goptions reset=all gunit=pct htext=2;

symbol1 i=j c=black;
symbol2 i=j c=black;
symbol3 i=j c=black;
symbol4 i=j f=marker v='V' c=black;
symbol5 i=j f=marker v='W' c=black;

pattern1 v=s c=white;
pattern2 v=s c=grayee;
pattern3 v=s c=gray88;

axis1 label=(a=90 'Y') order=0 to 0.3 by 0.05 minor=(n=4);
axis2 label=('AGE GROUP') offset=(0,0)pct minor=none;

title1 ls=1;
title2 a=90 ls=1;
title3 a=-90 ls=1;
footnote1 ls=1;

proc gplot data=foo;
plot (y3 y2 y1 y2 y1) * x /overlay areas=3 vaxis=axis1 haxis=axis2;
format x age.;
run;
quit;

--
Mike Zdeb
U(a)Albany School of Public Health
One University Place
Rensselaer, New York 12144-3456
P/518-402-6479 F/630-604-1475

> hi ... here's another idea ... it's similar to Ya Huang's idea
> [...]
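The hardcoded crossing point in the post above can also be derived from the data by linear interpolation between neighboring points. What follows is only a sketch of that idea, not Mike's method and not tested against his output: the dataset names CROSS and FOO2 and the helper variables PX, PY1, PY2, D0, D1 and T are made up, and FOO is assumed to be the dataset as it comes out of the first DATA step (with X, Y1, Y2, Y3, before the extra point is added). For these data Y1 stays at .08 while Y2 rises from .06 to .09 between X=7 and X=8, so the interpolated crossing comes out at about X=7.667, very close to the 7.655 used in the post.

```sas
/* Find interior crossings of y1 and y2 by linear interpolation.   */
/* CROSS, FOO2 and the helper variables are hypothetical names.    */
data cross;
   set foo;
   px  = lag(x);                 /* previous point (LAG is called on every row) */
   py1 = lag(y1);
   py2 = lag(y2);
   if _n_ > 1 then do;
      d0 = py1 - py2;            /* gap between the lines at the previous x */
      d1 = y1 - y2;              /* gap between the lines at the current x  */
      /* strict sign change: the lines cross between the two points */
      if nmiss(d0, d1) = 0 and d0 * d1 < 0 then do;
         t  = d0 / (d0 - d1);            /* fraction of the way to the crossing */
         x  = px + t * (x - px);         /* x at the crossing                   */
         y3 = py1 + t * (y1 - py1);      /* common y value at the crossing      */
         call missing(y1, y2);           /* extra point is for the y3 area only */
         output;
      end;
   end;
   keep x y1 y2 y3;
run;

/* Add the interpolated points to the original data and restore x order. */
data foo2;
   set foo cross;
run;

proc sort data=foo2;
   by x;
run;
```

Points where the lines merely touch at an existing X value (as at 70-74) produce no extra row, since the product of the gaps is zero there. FOO2 would then take the place of FOO in the PROC GPLOT step.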
From: Bill McKirgan on 25 Feb 2010 11:43

I just wanted to give you all an update on my graph problem and solution.

Below is the code I used to make the kind of graph my colleague 'TK' was asking for last week. He was using Excel and was asked to fill in the area between two lines, and found that while this sounded easy enough to do, it was in fact impossible with Excel 2007. The closest we came was by overlaying area plots, adding borders to the areas and making them transparent.

I took a stab at the problem with SGPLOT, and with my limited SAS graph knowledge I ran into the same problem (i.e., there's no simple solution), and that's when I posted.

Thanks to all of you who expressed interest and offered ideas, examples and spot-on solutions both on and off list. I am in your debt and will look through these ideas later on as a way to learn more about things like annotation datasets and hash objects for making more data-driven custom solutions.

In the end I used a solution provided by Mike Zdeb that is similar to what Ya Huang posted. Another excellent candidate solution was Data _NULL_'s, which used hash objects to automate the part where fill areas between the intersecting lines are trimmed. _NULL_, that stuff was way over my head, but I will keep studying it because I want to learn more techniques for making dynamic programs. I use list processing and macro solutions in my regular data management work, but do not yet understand the hash, and appreciate your example as it will probably be what helps light the hash-object bulb in my head (so to speak).

I'm pretty sure my colleague will be very happy with the result, and may be astounded to learn that it was accomplished by the generous support of the SAS-L community.

Thanks!!

Bill McKirgan

Here's that code...

/* _graph_help__mike_zdeb_.sas

Use this to make a graph for TK.

This began as an Excel file/graph with a request to fill the areas between the two plotted lines. Not easy/possible to do in Excel.

Bill McKirgan tried to do it in SAS SGPLOT and then asked for help on the SAS Listserv:

http://groups.google.com/group/comp.soft-sys.sas/browse_thread/thread/c08750954d3fea12?hl=en&pli=1

Many good ideas were suggested, and Mike Zdeb's solution was the best/closest to meeting the goal TK described.

Mike's code was downloaded from SAS-L and into this program, which has some minor modifications to improve colors, add titles and graph legend information.

Bill McKirgan 2/25/2010
*/

/* point to directory to save output */
%let whatpath=mypath;

/* Mike uses formatted variable instead of character data */
proc format;
value age
1 = '<20'
2 = '20-24'
3 = '25-29'
4 = '30-34'
5 = '35-39'
6 = '40-44'
7 = '45-49'
8 = '50-54'
9 = '55-59'
10= '60-64'
11= '65-69'
12= '70-74'
13= '75-79'
14= '80-84'
15= '85+'
;
run;

/* Admin Data from TK's chart (deidentified prior to posting to SAS-L) */
data foo;
input y1 y2 @@;
* set y3 to the lower of the two values;
y3 = y1*(y1 le y2) + y2*(y2 lt y1);
x + 1;
datalines;
.00 .00
.01 .00
.03 .01
.04 .01
.05 .02
.06 .04
.08 .06
.08 .09
.09 .16
.15 .29
.10 .15
.08 .08
.09 .06
.07 .03
.06 .02
;
run;

* Mike's HARDCODE where the lines cross between X=7 and X=8;
/* This was brilliant, and another similar post from Data _Null_
   involved a dynamic solution to filling the areas that involved
   many lines of code and the use of HASH objects.
   That too was very close to the desired final appearance;
   however, the hashes were way beyond my comprehension,
   but I will study the example because I like data-driven solutions */
data foo;
if last then do;
   x = 7.655;
   y3 = 0.08;
   call missing (y1,y2);
   output;
end;
set foo end=last;
output;
run;

proc sort data=foo;
by x;
run;

goptions reset=all gunit=pct htext=2;

/* McKirgan edits to fine-tune graph output
   (suppress lines for clean legend)
   Add color to lines and fill patterns */
symbol1 i=j c=white; /* keep lines white to suppress in legend */
symbol2 i=j c=white; /* keep lines white to suppress in legend */
symbol3 i=j c=white; /* keep lines white to suppress in legend */
symbol4 i=j f=marker v='V' c=red; /* line color */
symbol5 i=j f=marker v='W' c=blue;

pattern1 v=s c=white;
pattern2 v=s c=lightblue ;
pattern3 v=s c=lightgreen ;

axis1 label=(a=90 'PERCENTAGE IN COHORT') order=0 to 0.3 by 0.05 minor=(n=4) ;
axis2 label=('AGE GROUP') offset=(0,0)pct minor=none ;

legend1 label=none
        value=('' '' '' 'MyHVs' 'Total Vs')
        down=5
        mode=protect
        position=inside ;

title1 ls=1 'MHV vs Total V Age Distribution' ;
*title2 a=90 ls=1 'title 2' ;
*title3 a=-90 ls=1 'title 3' ;
*footnote1 ls=1 'footnote 1' ;

/* pdf is okay, but simply right clicking graph in SAS and exporting
   as JPG is faster and has a better result (an image that can be
   copy/pasted into an MS-Office Word/Powerpoint file) */

/* suppressing output to pdf after initial test */
*ods pdf file="&WHATPATH.\MHv_graph.pdf";

/* JUST EXPORT THE GRAPH FROM WITHIN SAS SESSION */
proc gplot data=foo;
plot (y3 y2 y1 y2 y1) * x /overlay areas=3 vaxis=axis1 haxis=axis2 legend = legend1 ;
format x age.;
run;
quit;

*ods pdf close;
run;
quit;