LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (April 2009, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 7 Apr 2009 05:54:26 -0700
Reply-To:     Jack Hamilton <jfh@STANFORDALUMNI.ORG>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Jack Hamilton <jfh@STANFORDALUMNI.ORG>
Subject:      Re: proc sql- distinct, macros, etc.
Comments: To: "Mark S." <marksmolkin@GMAIL.COM>
Content-Type: text/plain; charset="iso-8859-1"

The answer to 1) and 2) is that order isn't guaranteed unless you use an ORDER BY clause.

I would use DISTINCT in 3), because I'm not sure SQL would add it for me.

What problem do you think will occur if the 'order by' and 'distinct' fields are not the same?

-- Jack Hamilton

-----Original Message----- From: Mark S. <marksmolkin@GMAIL.COM> Sent: Monday, April 06, 2009 10:30 PM To: SAS-L@LISTSERV.UGA.EDU Subject: [SAS-L] proc sql- distinct, macros, etc.


I am hoping to get some clarification regarding proc sql behavior for which the manuals ain't cuttin' it. Any advice on helpful references would be greatly apprecated too. Here are some questions:

1) If you use 'select distinct [vars]', does the outputted result automatically get sorted by the values of columns in [vars]?

2) When creating macro variables in proc sql using the 'separated by' clause, is the order of the individual values in the incoming dataset maintained when 'strung out' to form the values of the macro variable. For example, if you have a dataset called 'mydata' sorted on the ID field,

ID Name 1 Joe 2 Goober 3 Zeke

and you submit,

proc sql; select name into :macrovar_name separated by ' ' from mydata; quit;

will proc sql always respect the order of the data on input to give me 'Joe Goober Zeke' as the value of the macro variable, or, will it possibly change it? For instance, might it instead reorder the values based on sorted values of the 'name' field (ie. 'Goober Joe Zeke'), or, obey an 'order by' clause that perhaps tells it to sort by some other field not shown? I guess I am not to sure about timing in proc sql, especially with macro variable creation and the 'order by' clause.

3) My spidey sense tells me that this 'distinct' statement might be more trouble than it is worth for somebody like me without much sql experience outside of SAS. Is it a no-no to use distinct statements in more complicated expressions like below (just a nonsense example), where the result of a 'distinct' statement is used as part of a join?

proc sql; select A.ID,..... from mydata1 as A INNER JOIN (select distinct ID,... from mydata2) as B on A.ID=B.ID; quit;

Also, what about a similar example with something like 'order by A.ID' added to it? I know that 'order by' can be a problem with 'distinct' in a simple sql expression if the 'order by' and 'distinct' fields are not the same. Is a 'distinct' field/'order by' field discrepancy a problem also when the 'order by' clause is one-step removed from the 'distinct' statement? Sorry about the brutal misuse of terminology, I don't know what I should be calling these things!....

Any help would be greatly appreciated! Thanks!


Back to: Top of message | Previous page | Main SAS-L page