Date: Thu, 29 Jul 2004 08:18:18 -0500
Reply-To: "Dunn, Toby" <Toby.Dunn@TEA.STATE.TX.US>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Dunn, Toby" <Toby.Dunn@TEA.STATE.TX.US>
Subject: Re: proc means
Content-Type: text/plain; charset="us-ascii"
Nathan,
If I am not mistaken, PROC MEANS is used when producing printed output
and PROC SUMMARY when a data set is wanted. PROC MEANS does have the
option to write an output data set, as you demonstrate, but that does
not mean that the data set created will be identical to the printed
output.
From the v9 online help docs:
Output Data Set
PROC MEANS can create one or more output data sets. The procedure does
not print the output data set. Use PROC PRINT, PROC REPORT, or another
SAS reporting tool to display the output data set.
Note: By default the statistics in the output data set automatically
inherit the analysis variable's format and label. However, statistics
computed for N, NMISS, SUMWGT, USS, CSS, VAR, CV, T, PROBT, SKEWNESS,
and KURTOSIS do not inherit the analysis variable's format because this
format may be invalid for these statistics. Use the NOINHERIT option in
the OUTPUT statement to prevent the other statistics from inheriting the
format and label attributes.
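To make the inheritance note concrete, here is a minimal sketch. It assumes the SASHELP.CARS sample data set is installed and that its MSRP variable carries a format; the data set and variable names (work.inherit, work.plain, avg_msrp) are made up for the example.

```sas
/* Assumption: SASHELP.CARS is available and MSRP has a format attached. */
proc means data=sashelp.cars noprint;
   var msrp;
   /* Default behavior: MEAN inherits the format and label of MSRP. */
   output out=work.inherit mean=avg_msrp;
   /* NOINHERIT: the same statistic is written without those attributes. */
   output out=work.plain mean=avg_msrp / noinherit;
run;

/* Compare the Format and Label columns for avg_msrp in the two listings. */
proc contents data=work.inherit; run;
proc contents data=work.plain; run;
```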
The output data set can contain these variables:
- the variables specified in the BY statement.
- the variables specified in the ID statement.
- the variables specified in the CLASS statement.
- the variable _TYPE_ that contains information about the class
  variables. By default _TYPE_ is a numeric variable. If you specify
  CHARTYPE in the PROC statement, then _TYPE_ is a character variable.
  When you use more than 32 class variables, _TYPE_ is automatically a
  character variable.
- the variable _FREQ_ that contains the number of observations that a
  given output level represents.
- the variables requested in the OUTPUT statement that contain the
  output statistics and extreme values.
- the variable _STAT_ that contains the names of the default statistics
  if you omit statistic keywords.
- the variable _LEVEL_ if you specify the LEVEL option.
- the variable _WAY_ if you specify the WAYS option.
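A small sketch showing several of the variables listed above in one output data set. It assumes the SASHELP.CLASS sample data set; work.stats and mean_height are made-up names. Note that in the OUTPUT statement syntax the keyword is spelled LEVELS.

```sas
/* Assumption: SASHELP.CLASS is available (class variables SEX and AGE). */
proc means data=sashelp.class noprint;
   class sex age;
   var height;
   /* LEVELS adds _LEVEL_ and WAYS adds _WAY_ to the output data set. */
   output out=work.stats mean=mean_height / levels ways;
run;

/* Expect SEX, AGE, _WAY_, _TYPE_, _LEVEL_, _FREQ_ and mean_height. */
proc print data=work.stats;
run;
```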
The value of _TYPE_ indicates which combination of the class variables
PROC MEANS uses to compute the statistics. The character value of _TYPE_
is a series of zeros and ones, where each value of one indicates an
active class variable in the type. For example, with three class
variables, PROC MEANS represents type 1 as 001, type 5 as 101, and so
on.
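The zeros-and-ones form of _TYPE_ is easiest to see with CHARTYPE and three class variables. This is only a sketch, assuming the SASHELP.CARS sample data set; work.by_type and total_msrp are made-up names.

```sas
/* Assumption: SASHELP.CARS is available. */
/* CHARTYPE makes _TYPE_ a character string of zeros and ones. */
proc means data=sashelp.cars noprint chartype;
   class origin type drivetrain;
   var msrp;
   output out=work.by_type sum=total_msrp;
run;

/* One row per distinct _TYPE_ value, from 000 (overall) to 111. */
/* The rightmost digit belongs to the last CLASS variable, so 001 */
/* is DRIVETRAIN alone and 101 is ORIGIN crossed with DRIVETRAIN. */
proc freq data=work.by_type;
   tables _type_;
run;
```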
Usually, the output data set contains one observation per level per
type. However, if you omit statistical keywords in the OUTPUT statement,
then the output data set contains five observations per level (six if
you specify a WEIGHT variable). Therefore, the total number of
observations in the output data set is equal to the sum of the levels
for all the types you request multiplied by 1, 5, or 6, whichever is
applicable.
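The one-versus-five rule can be checked with a short sketch, assuming the SASHELP.CLASS sample data set (SEX has two values there, so there are three levels: the overall row plus one per sex). The data set names work.defaults and work.one_stat are made up.

```sas
/* Assumption: SASHELP.CLASS is available. */
proc means data=sashelp.class noprint;
   class sex;
   var height;
   /* No statistic keywords: five rows per level, labeled by _STAT_ */
   /* (N, MIN, MAX, MEAN, STD), so 3 levels x 5 = 15 observations.  */
   output out=work.defaults;
   /* One keyword: one row per level, so 3 observations. */
   output out=work.one_stat mean=mean_height;
run;

proc print data=work.defaults; run;
proc print data=work.one_stat; run;
```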
If you omit the CLASS statement (_TYPE_= 0), then there is always
exactly one level of output per BY group. If you use a CLASS statement,
then the number of levels for each type that you request has an upper
bound equal to the number of observations in the input data set. By
default, PROC MEANS generates all possible types. In this case the total
number of levels for each BY group has an upper bound equal to
    m * (2^k - 1) * n + 1
where k is the number of class variables, n is the number of
observations for the given BY group in the input data set, and m is 1,
5, or 6.
PROC MEANS determines the actual number of levels for a given type from
the number of unique combinations of each active class variable. A
single level is composed of all input observations whose formatted class
values match.
"The Effect of Class Variables on the OUTPUT Data Set" (in the same
help topic) shows the values of _TYPE_ and the number of observations
in the data set when you specify one, two, and three class variables.
Try using PROC SUMMARY; it will get you what you want.
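Something along these lines, reusing the data set and variable names from your post. This is an untested sketch: the NWAY option is my substitution for your TYPES statement (it keeps only the full six-way combination, _TYPE_ = 63), and PROC SUMMARY takes the same OUTPUT statement as PROC MEANS.

```sas
/* Names (mydata, v, the six class variables, count) come from the post. */
/* NWAY is an assumption standing in for the original TYPES statement.   */
proc summary data=mydata nway;
   class OHED OIFF OSG1 OIPV OCON OAGI;
   var v;
   output out=count sum=;
run;

/* PROC SUMMARY prints nothing by default, so list the data set. */
proc print data=count;
run;
```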
HTH
Toby Dunn
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Nathan Nissim Broudo
Sent: Wednesday, July 28, 2004 8:54 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: proc means
I'm getting different output through the print and noprint options of
proc means:
When I run:
proc means data=mydata sum;
var v;
class OHED OIFF OSG1 OIPV OCON OAGI;
types OHED*OIFF*OSG1*OIPV*OCON*OAGI;
output sum= out=count;
run;
My Output window shows:
                     Analysis Variable : v

OHED   OIFF   OSG1   OIPV   OCON   OAGI   N Obs            Sum
---------------------------------------------------------------------
N      N      N      N      N      N      48843       48843.00
                     Y      N      N       8593        8593.00
                            Y      N        164    164.0000000
Y      N      N      N      N      N          3      3.0000000
---------------------------------------------------------------------
While my dataset "count" is:
Obs   OHED   OIFF   OSG1   OIPV   OCON   OAGI   _TYPE_   _FREQ_       v
  1   N      N      N      N      N      N          63    48843   48843
  2   N      N      N      Y      N      N          63     8593    8593
  3   N      N      N      Y      Y      N          63      164     164
  4   Y      N      N      N      N      N          63        3       3
What's going on?