Date: Mon, 21 Aug 2006 03:53:32 -0700
Reply-To: Abhijat <ursabhijat@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Abhijat <ursabhijat@GMAIL.COM>
Organization: http://groups.google.com
Subject: Re: URGENT: PROC SORT TO PROC SQL CONVERSION
Content-Type: text/plain; charset="iso-8859-1"
Hi Chris,
I am trying out the options in the format that you just mentioned. In
fact, I had been trying it since I wrote the query on the group :-) !
No luck as yet .. :-(
Regards,
Abhijat.
ChrisW75 wrote:
> Try this:-
> proc sql;
> create table dset2 as
> select distinct *
> from dset1
> ;
> quit;
>
> It should be some permutation of that. I think last time I needed to
> dedupe in SQL I applied a unique key, took the maximum value of the
> unique key for each grouping, then in a second sql step joined the
> results of this back to the original table in an inner join.
> SQL is not good at deduping.
>
> ChrisW75
>
> Abhijat wrote:
> > Hi All,
> > Thanks a tonne for the responses.
> > Let me try clarify as much as I can so that I may get more customized
> > answers :-) !
> >
> > The code has to be in PROC SQL. I m using SAS 9.1.
> > Regarding Kenneth's question, I m afraid we have more variables that
> > mere A,B,C,D,E . We have about 20 extra variables!
> > I request you all SAS experts to keep your suggestions and answers
> > flowing in.
> >
> >
> > The question is very simple and is once again stated here:
> > *******************************************************************
> > Please suggest me a proc sql equivalent for the below mentioned proc
> > sort code in SAS 9.1.
> >
> > PROC SORT DATA = abc.xyz OUT = abc.pqr NODUPKEY;
> > BY A B C D E;
> > RUN;
> >
> > ~ xyz has about 25 variables.
> >
> > Thanks for the help.
> >
> > Regards,
> > Abhijat.
> >
> >
> >
> > ChrisW75 wrote:
> > > Yes, I'm aware that when Version 9 came out they included the DUPOUT
> > > option, which is a good option. My method though puts but the excluded
> > > dupliates AND the kept duplicates into the same table so that they can
> > > be examined side by side. This is often very useful when cleansing
> > > data.
> > >
> > >
> > >
> > > Gadi Ben-Avi wrote:
> > > > Chris,
> > > >
> > > > At a certain point an options was added to proc sort that lets you get the
> > > > 'deduped' observations in a separate dataset.
> > > >
> > > > The DUPOUT option will create a dataset with the deduped observations.
> > > >
> > > > Gadi
> > > > "ChrisW75" <9Squirrels@gmail.com> wrote in message
> > > > news:1156118064.027338.32280@75g2000cwc.googlegroups.com...
> > > > > Best practice for deduping is to not use either Proc SQL or Proc Sort I
> > > > > reckon. This is my method:-
> > > > >
> > > > > proc sort data=dset1;
> > > > > by A B C D E;
> > > > > run;
> > > > >
> > > > > data dset2
> > > > > dupes;
> > > > > set dset1;
> > > > > by A B C D E;
> > > > >
> > > > > if first.E then output dset2;
> > > > > if not(first.E and Last.E) then output dupes;
> > > > > run;
> > > > >
> > > > > This gives you a way to look at your duplicate records after the
> > > > > dedupe.
> > > > > ChrisW75
> > > > >
> > > > > Scott Bass wrote:
> > > > >> I agree with Ken, see this thread in the archive (read the entire
> > > > >> threaed):
> > > > >> http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0605A&L=sas-l&D=0&P=1162
> > > > >>
> > > > >> <kenneth_m_lin@sbcglobal.net> wrote in message
> > > > >> news:rm1Gg.9471$%j7.1256@newssvr29.news.prodigy.net...
> > > > >> > Unless your dataset only contains the fields A through E, it'd be
> > > > >> > difficult to perform de-duping in PROC SQL.
> > > > >> >
> > > > >> > "Abhijat" <ursabhijat@gmail.com> wrote in message
> > > > >> > news:1156068234.842470.228350@i42g2000cwa.googlegroups.com...
> > > > >> >> Hi All,
> > > > >> >> Please someone suggest me a proc sql equivalent for the below
> > > > >> >> mentioned
> > > > >> >> proc sort code.
> > > > >> >>
> > > > >> >>
> > > > >> >> PROC SORT DATA = abc.xyz OUT = abc.pqr NODUPKEY;
> > > > >> >> BY A B C D E;
> > > > >> >> RUN;
> > > > >> >>
> > > > >> >>
> > > > >> >> Many Thanks ,
> > > > >> >> Abhijat.
> > > > >> >>
> > > > >> >
> > > > >> >
> > > > >
|