| Date: | Thu, 15 May 2003 11:16:42 -0400 |
| Reply-To: | Ian Whitlock <WHITLOI1@WESTAT.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Ian Whitlock <WHITLOI1@WESTAT.COM> |
| Subject: | Re: A little mystery of proc sql group by |
|
| Content-Type: | text/plain |
|---|
Ya,
Here is another curios fix to the first select
proc sql;
select case
when 0 then 0
else 1
end as cat0
, idv
, count(distinct x) as msn
from xx
group by cat0, idv
;
IanWhitlock@westat.com
-----Original Message-----
From: Huang, Ya [mailto:yhuang@AMYLIN.COM]
Sent: Tuesday, May 13, 2003 5:28 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: A little mystery of proc sql group by
Thanks Jonathan and Prasad.
I'm leaning to think it is "buggy" too, in that if I assign cat0 as a
character var, the number seems OK:
proc sql;
select '1' as cat0, idv, count(distinct x) as msn
from xx
group by cat0, idv
;
cat0 idv msn
ffffffffffffffffffffffff
1 a 3
1 b 2
Weird ?!
Ya
-----Original Message-----
From: Jonathan Siegel [mailto:jmsiegel@YAHOO.COM]
Sent: Tuesday, May 13, 2003 2:16 PM
To: SAS-L@LISTSERV.UGA.EDU; Huang, Ya
Subject: Re: A little mystery of proc sql group by
Nice find!
I added some new lines to the sample data
f b 1
f c 1
f d 1
g d 1
I got
cat0 idv msn
----------------------------
1 a 3
1 b 1
1 c 0
1 d 1
cat0 idv msn
----------------------------
1 a 3
1 b 3
1 c 1
1 d 2
cat0 idv msn
----------------------------
1 a 3
1 b 3
1 c 1
1 d 2
So it looks like when the first GROUP BY variable is the dynamic constant,
msn is getting only those values of x that weren't in previous groups --
smells buggy.
Jonathan Siegel
On Tue, 13 May 2003 08:37:34 -0700, Huang, Ya <yhuang@AMYLIN.COM> wrote:
>Hi there,
>
>Would someone gives a good explanation why the following
>code generates different result?
>
>data xx;
>input x $ idv $ y;
>cards;
>c a 1
>e a 1
>k a 1
>c b 1
>e b 1
>;
>options formdlim=' ' nodate nonumber;
>title;
>
>proc sql;
>select 1 as cat0, idv, count(distinct x) as msn
>from xx
>group by cat0, idv
>;
>select 1 as cat0, idv, count(distinct x) as msn
>from xx
>group by idv, cat0
>;
>select y as cat0, idv, count(distinct x) as msn
>from xx
>group by cat0, idv
>;
>--------------
> cat0 idv msn
>ffffffffffffffffffffffffffff
> 1 a 3
> 1 b 0
>
> cat0 idv msn
>ffffffffffffffffffffffffffff
> 1 a 3
> 1 b 2
>
> cat0 idv msn
>ffffffffffffffffffffffffffff
> 1 a 3
> 1 b 2
>
>Note that msn is different between the first
>select and the other two. The only difference
>between first select and second one is the order
>of cat0 and idv in the group by clause. I've never
>thought that the order of vars in a group by clause
>matters.
>
>Thanks
>
>Ya Huang
|