Wei,
The phrase "must be sorted" does not mean PROC SORT must be used. It only
means the observations must be in order. SAS checks to ensure that they are
in order because the correctness of the merge process depends on the
order.
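As a minimal sketch of the point above (the data sets DEMOG and LABS and
their values are made up for illustration), a match-merge runs without any
PROC SORT as long as the observations already arrive in BY-variable order:

```sas
/* Hypothetical data sets; both are entered in ascending ID order. */
data demog;
   input id age;
   datalines;
1 34
2 51
3 47
;

data labs;
   input id result;
   datalines;
1 5.2
2 6.1
3 4.8
;

/* No PROC SORT here: the BY statement only needs ID to be in order. */
data both;
   merge demog labs;
   by id;
run;
```

If one of the inputs were out of order, the step would be expected to stop
with an error in the log saying the BY variables are not properly sorted,
rather than quietly producing a wrong merge.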
Now for the question: should you always use PROC SORT before a merge? I
think it depends on
1) your knowledge of the data
2) the consequences of being wrong
3) the cost of an unneeded sort
If I create the data I usually know what order it is in and do not use PROC
SORT unless needed. If the data comes from someone else, I usually ask or
test. If I find it in order then I assume it will be in order when I run.
Typically the cost for being wrong is less than a minute and I can easily
afford it. If I were writing programs that would result in my being hauled
out of bed at 3:00AM when a merge failed, I would change my policy with
respect to data created by a source other than myself. If the cost of doing
an extra sort was several hundred dollars or hours of run time, I might
change that policy again.
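One cheap way to "test" the order, sketched here under the assumption of a
hypothetical data set OTHERS.VISITS with BY variable ID, is to read the
data with a BY statement and write nothing:

```sas
/* Hypothetical libref OTHERS and data set VISITS; ID is the BY variable. */
/* DATA _NULL_ reads every observation but creates no output data set.   */
data _null_;
   set others.visits;
   by id;   /* the step stops with an error if ID is found out of order */
run;
```

A clean log suggests the data is in order for this run. It reads the data
once, which is usually cheaper than an unneeded sort, but it says nothing
about what the supplier will deliver next time.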
I suggest you explain the situation to your colleague. If she is satisfied
with the consequences of her choice, then fine. If you are not satisfied
with the consequences then change your policy. If you are managing the
colleague and not satisfied with her choice then you can add your own
consequences and ask again. Just remember that your decisions and requests
also have consequences.
IanWhitlock@westat.com
-----Original Message-----
From: wei cheng [mailto:cheng_wei@HOTMAIL.COM]
Sent: Wednesday, October 10, 2001 1:46 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Merge w/o sort first
Hi there,
Thanks for all the thoughts. But let's RTFM:
SAS OnlineDoc, V8 SAS Language Reference: under MERGE statement
Match-Merging
Match-merging combines observations from two or more SAS data sets into a
single observation in a new data set according to the values of a common
variable. The number of observations in the new data set is the sum of the
largest number of observations in each BY group in all data sets. To perform
a match-merge, use a BY statement immediately after the MERGE statement. The
variables in the BY statement must be common to all data sets. Only one BY
statement can accompany each MERGE statement in a DATA step. ----- (Read
here) ----The data sets that are listed in the MERGE statement must be
sorted in order of the values of the variables that are listed in the BY
statement, or they must have an appropriate index. (Snip)
Let's forget the index here (suppose the data set has no index). Does the
"sorted in order of the values... " mean you don't need the data set to be
marked sorted (Karsten M. Self: dataset has been ordered by a SORT, other
proc output, or a dataset with BY processing, and is so marked: marked
sorted.)? Of course I won't sort the data set again if it is marked sorted
for the BY variables before the MERGE.
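For reference, a sketch of how the sorted mark can be inspected or set by
hand (WORK.VISITS and ID are hypothetical names, and the data is assumed to
be in ID order already):

```sas
/* PROC CONTENTS reports whether the data set is marked as sorted. */
proc contents data=work.visits;
run;

/* The SORTEDBY= data set option sets the mark without sorting.    */
/* It is only an assertion by the programmer; SAS does not verify  */
/* the order at this point.                                        */
data work.visits2 (sortedby=id);
   set work.visits;
run;
```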
From all the answers, it seems that if the data set is already in order
without being marked sorted (Karsten M. Self: naturally collated: collated.),
the MERGE BY will work fine. Then what should I tell my colleague about what
she should do? Let her do as she always did and omit the SORT step? She is a
junior level SAS programmer, but she said: "Since SAS runs correctly, why
should we sort it if it does not have a sorted mark but is collated?"
Thanks again for your comments.
Wei Cheng
=================================================================
http://www.geocities.com/prochelp
INTERNET and Web Resources for SAS Programmers and Statisticians
=================================================================
>From: "Lambert, Bob" <Bob_Lambert@AFCC.COM>
>Reply-To: "Lambert, Bob" <Bob_Lambert@AFCC.COM>
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Re: Merge w/o sort first
>Date: Wed, 10 Oct 2001 12:06:50 -0500
>
>Tom Mendicino wrote:
>
><snip>
> > One of the main tasks of a programmer is to
> > try and anticipate "land mines" and develop routines which avoid them.
><snip>
>
>"Main tasks" are determined by the programmer's manager. Typically, these
>are, e.g., "Produce reports as required". Especially in SAS, the "how" is
>left to the programmer. The programmer of discussion here probably lacks
>your programming knowledge and background and is being successful with her
>skillset and is meeting the "minimum job standards". Without changes in her
>current environment (Skinner, not Gestalt), no changes in her (programming)
>behavior are expected. The responsibility for environmental changes belongs
>to her manager. Unfortunately, many a manager of SAS programmers has little
>or no experience with SAS and hasn't a clue as to what needs to be
>changed. As long as reports are being produced, everything is fine.
>
>So your statement is correct from a programmer's standpoint -- but perhaps
>not from a manager's.
>
>One more thing -- Somebody once told me, "Every system is perfectly
>designed for its output."
>
>hth
>
>Bob Lambert