Date: Fri, 19 Apr 1996 03:54:48 -0400
Reply-To: PHIL_G@DELTA.PRA-WW.COM
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Phil Gallagher <PHIL_G@DELTA.PRA-WW.COM>
Subject: Graphs of Empirical Distributions
Kouros Owzar asked how to graph empirical distributions.
I do this all the time, but I do not have a sample of tested code with me.
I will show the approach, but please bear in mind that the code below is
untested and is bound to contain the usual keying errors.
* to graph the empirical distribution of GOODSTUF;
PROC FREQ DATA= yourdataset;
TABLES GOODSTUF / OUT=STATS;
* Output dataset will have the vars GOODSTUF COUNT PERCENT;
* See the PROC FREQ documentation;
RUN;
DATA CUMULATE(LABEL='Create the cumulative percents');
SET STATS;
RETAIN CUM_PCT 0;
IF (_N_ EQ 1) THEN CUM_PCT = 0;
* in principle this is unnecessary, but it hints at what to do
in more complicated situations, such as when using a BY-stmnt;
CUM_PCT = SUM(CUM_PCT,PERCENT);
LABEL CUM_PCT = 'Empirical Distribution (%)';
* a place to be careful - sooner or later one almost always invokes
FREQ with the missing option on the TABLES stmnt, and it is very
embarrassing to have used a simple "+" and wind up with all missing
CUM_PCTs;
RUN;
PROC PLOT DATA=CUMULATE; * or GPLOT, of course;
PLOT CUM_PCT * GOODSTUF; * add any plot options after a slash;
TITLE 'Empirical distribution of GOODSTUF';
RUN;
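* A GPLOT variant of the same plot, equally untested, and assuming
SAS/GRAPH is available. INTERPOL=STEPLJ is my choice here, not part of
the question: it joins the points as a step function, with each step
starting at the data value, which is the shape an empirical
distribution has;
SYMBOL1 INTERPOL=STEPLJ VALUE=NONE;
PROC GPLOT DATA=CUMULATE;
PLOT CUM_PCT * GOODSTUF;
RUN;
QUIT;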
* this design extends in a straightforward way to BY-stmnts and to the
OVERLAY option on the plot stmnt. It is only a little more complicated.
If you absolutely cannot do it yourself, query me.;
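* A sketch of the BY-group extension mentioned above, equally untested.
GROUP is a hypothetical classification variable, and SORTED, BYSTATS
and BYCUM are made-up dataset names. Instead of OVERLAY, this sketch
uses the y*x=variable form of the plot request, which marks each group
with its own plotting symbol;
PROC SORT DATA=yourdataset OUT=SORTED;
BY GROUP;
RUN;
PROC FREQ DATA=SORTED NOPRINT;
BY GROUP;
TABLES GOODSTUF / OUT=BYSTATS;
* PERCENT is now computed within each BY group;
RUN;
DATA BYCUM;
SET BYSTATS;
BY GROUP;
RETAIN CUM_PCT 0;
IF FIRST.GROUP THEN CUM_PCT = 0;
* here the reset is necessary: the sum restarts in each BY group;
CUM_PCT = SUM(CUM_PCT,PERCENT);
LABEL CUM_PCT = 'Empirical Distribution (%)';
RUN;
PROC PLOT DATA=BYCUM;
PLOT CUM_PCT * GOODSTUF = GROUP;
TITLE 'Empirical distribution of GOODSTUF by GROUP';
RUN;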
Philip Gallagher