Date: Tue, 13 May 2003 17:15:56 -0400
Reply-To: Jonathan Siegel <jmsiegel@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jonathan Siegel <jmsiegel@YAHOO.COM>
Subject: Re: A little mystery of proc sql group by
Nice find!
I added some new lines to the sample data
f b 1
f c 1
f d 1
g d 1
I got
cat0 idv msn
----------------------------
1 a 3
1 b 1
1 c 0
1 d 1
cat0 idv msn
----------------------------
1 a 3
1 b 3
1 c 1
1 d 2
cat0 idv msn
----------------------------
1 a 3
1 b 3
1 c 1
1 d 2
So it looks like when the first GROUP BY variable is the dynamic constant,
msn is getting only those values of x that weren't in previous groups --
smells buggy.
Jonathan Siegel
On Tue, 13 May 2003 08:37:34 -0700, Huang, Ya <yhuang@AMYLIN.COM> wrote:
>Hi there,
>
>Would someone gives a good explanation why the following
>code generates different result?
>
>data xx;
>input x $ idv $ y;
>cards;
>c a 1
>e a 1
>k a 1
>c b 1
>e b 1
>;
>options formdlim=' ' nodate nonumber;
>title;
>
>proc sql;
>select 1 as cat0, idv, count(distinct x) as msn
>from xx
>group by cat0, idv
>;
>select 1 as cat0, idv, count(distinct x) as msn
>from xx
>group by idv, cat0
>;
>select y as cat0, idv, count(distinct x) as msn
>from xx
>group by cat0, idv
>;
>--------------
> cat0 idv msn
>ffffffffffffffffffffffffffff
> 1 a 3
> 1 b 0
>
> cat0 idv msn
>ffffffffffffffffffffffffffff
> 1 a 3
> 1 b 2
>
> cat0 idv msn
>ffffffffffffffffffffffffffff
> 1 a 3
> 1 b 2
>
>Note that msn is different between the first
>select and the other two. The only difference
>between first select and second one is the order
>of cat0 and idv in the group by clause. I've never
>thought that the order of vars in a group by clause
>matters.
>
>Thanks
>
>Ya Huang
|