Date: Tue, 7 Apr 2009 05:00:40 -0700
Reply-To: Peter <crawfordsoftware@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter <crawfordsoftware@GMAIL.COM>
Organization: http://groups.google.com
Subject: Re: proc sql- distinct, macros, etc.
Content-Type: text/plain; charset=ISO-8859-1
On Apr 7, 6:30 am, marksmol...@GMAIL.COM ("Mark S.") wrote:
> Hi,
>
> I am hoping to get some clarification regarding proc sql behavior for
> which the manuals ain't cuttin' it. Any advice on helpful references would
> be greatly apprecated too. Here are some questions:
>
> 1) If you use 'select distinct [vars]', does the outputted result
> automatically get sorted by the values of columns in [vars]?
>
> 2) When creating macro variables in proc sql using the 'separated by'
> clause, is the order of the individual values in the incoming dataset
> maintained when 'strung out' to form the values of the macro variable. For
> example, if you have a dataset called 'mydata' sorted on the ID field,
>
> ID Name
> 1 Joe
> 2 Goober
> 3 Zeke
>
> and you submit,
>
> proc sql;
> select name into :macrovar_name separated by ' ' from mydata;
> quit;
>
> will proc sql always respect the order of the data on input to give me
> 'Joe Goober Zeke' as the value of the macro variable, or, will it possibly
> change it? For instance, might it instead reorder the values based on
> sorted values of the 'name' field (ie. 'Goober Joe Zeke'), or, obey
> an 'order by' clause that perhaps tells it to sort by some other field not
> shown? I guess I am not to sure about timing in proc sql, especially with
> macro variable creation and the 'order by' clause.
>
> 3) My spidey sense tells me that this 'distinct' statement might be more
> trouble than it is worth for somebody like me without much sql experience
> outside of SAS. Is it a no-no to use distinct statements in more
> complicated expressions like below (just a nonsense example), where the
> result of a 'distinct' statement is used as part of a join?
>
> proc sql;
> select A.ID,.....
> from mydata1 as A INNER JOIN (select distinct ID,... from mydata2) as B on
> A.ID=B.ID;
> quit;
>
> Also, what about a similar example with something like 'order by A.ID'
> added to it? I know that 'order by' can be a problem with 'distinct' in a
> simple sql expression if the 'order by' and 'distinct' fields are not the
> same. Is a 'distinct' field/'order by' field discrepancy a problem also
> when the 'order by' clause is one-step removed from the 'distinct'
> statement? Sorry about the brutal misuse of terminology, I don't know what
> I should be calling these things!....
>
> Any help would be greatly appreciated! Thanks!
>
> Mark
try the _METHOD option on the proc sql statement.
It should indicate whether a sort is being performed.
but I wouldn't rely on it.
If you need the sort, use a sorted by clause. That
should be optimised if it isn't needed. _METHOD
should clarify if the group-by makes any difference.
PeterC
|