Date:         Wed, 1 May 2002 13:30:51 -0400
Reply-To:     "Dorfman, Paul" <Paul.Dorfman@BCBSFL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Dorfman, Paul" <Paul.Dorfman@BCBSFL.COM>
Subject:      Re: [DATA STEP]
Comments: To: "Chakravarthy, Venky" <Venky.Chakravarthy@PFIZER.COM>
Content-Type: text/plain; charset=iso-8859-1

Venky,

Indeed, _iorc_ (a "genuinely" retained variable) can be used this way. In this case, though, placing the sum statement before the DOW (where it logically belongs) obviates the need to initialize _iorc_:

data w ;
   _iorc_ ++ 1 ;
   do until (last.col) ;
      set q ;
      by col notsorted ;
      colid = trim(col)||"_"||put(_iorc_,best.-L) ;
      output ;
   end ;
run ;

Using an explicit file-reading loop renders all the retain issues irrelevant. Within such a loop, the only variables ever changed by a hidden instruction are those set to missing before a fresh by-group - and even that can be thought of as a result of a BY statement. Thus, with an explicit loop, _n_ can be used as well:

data w ;
   do _n_ = 1 by 1 ;
      do until (last.col) ;
         set q ;
         by col notsorted ;
         colid = trim(col)||"_"||put(_n_,best.-L) ;
         output ;
      end ;
   end ;
run ;

Note that even though the outer loop appears to be infinite, it will stop as soon as the input from Q has been exhausted. Coding EOF is cleaner, but in this case, not necessary.
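For what it is worth, here is a minimal sketch of what "coding EOF" could look like for the step above. The variable name EOF and the explicit STOP are my own choices for illustration, not something taken from the thread; the END= option on SET is what sets the flag once the last record of Q has been read.

```sas
data w ;
   /* outer loop: one iteration per by-group, ended explicitly at end of file */
   do _n_ = 1 by 1 until (eof) ;
      /* inner loop: read one by-group of Q */
      do until (last.col) ;
         set q end = eof ;            /* eof becomes 1 on the last record */
         by col notsorted ;
         colid = trim(col)||"_"||put(_n_,best.-L) ;
         output ;
      end ;
   end ;
   stop ;                             /* all input consumed: end the step here */
run ;
```

The result should be the same as with the open-ended outer loop; the only difference is that the step now ends by its own logic rather than by SET running out of records.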

An interesting question is how one would approach the problem if the records were not grouped and the user still wanted to retain their original order, without double-sorting. Then we must somehow remember the keys we have already encountered. In V9, the perfect tool for this is of course the hash table. However, because the situation is so simple, it hardly requires more lines of code even under the current version. Let us, for the sake of simplicity, limit ourselves to a maximum of 100,000 distinct keys on the file:

data q ;
   input col $ ;
   cards ;
one
two
one
three
two
one
two
one
three
one
run ;

%let h = 200003 ;

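data w ( drop = j n ) ;
   array c (0:&h) $ _temporary_ ;   /* keys stored so far         */
   array x (0:&h)   _temporary_ ;   /* group number for each key  */
   set q ;
   /* start at the hashed slot; BY 1 moves to the next slot on a collision */
   do j = mod(input(col,pib6.), &h) by 1 until ( c(j) = col ) ;
      if j = &h then j = 0 ;        /* wrap around at the end of the table */
      if x(j) = . then do ;         /* empty slot: the key is new          */
         n ++ 1 ;
         x(j) = n ;
         c(j) = col ;
      end ;
   end ;
   colid = trim(col) || '_' || put(x(j), best.-l) ;
run ;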

Here is the output:

Obs    col      colid

  1    one      one_1
  2    two      two_2
  3    one      one_1
  4    three    three_3
  5    two      two_2
  6    one      one_1
  7    two      two_2
  8    one      one_1
  9    three    three_3
 10    one      one_1
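For comparison, here is a rough sketch of how the V9 hash table mentioned earlier might handle the same task. It assumes the hash object syntax as it later shipped (DECLARE HASH with the DEFINEKEY, DEFINEDATA, DEFINEDONE, FIND and ADD methods). The names H, ID, SEQ, RC and the output data set W2 are made up for illustration; none of this comes from the original thread.

```sas
data w2 ( drop = id seq rc ) ;
   if _n_ = 1 then do ;
      declare hash h () ;          /* in-memory table keyed by COL    */
      h.defineKey  ('col') ;
      h.defineData ('id') ;        /* group number stored per key     */
      h.defineDone () ;
   end ;
   set q ;
   if h.find() ne 0 then do ;      /* key not in the table yet        */
      seq + 1 ;                    /* next unused group number        */
      id = seq ;
      rc = h.add() ;               /* remember the key and its number */
   end ;
   /* when FIND succeeds, ID has been filled in from the table */
   colid = trim(col) || '_' || put(id, best.-l) ;
run ;
```

If the sketch is right, W2 should carry the same COLID values as W above, with no need to fix the table size or choose a hash function up front.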

Kind regards,
================
Paul M. Dorfman
Jacksonville, FL
================

> -----Original Message-----
> From: Chakravarthy, Venky [mailto:Venky.Chakravarthy@PFIZER.COM]
> Sent: Wednesday, May 01, 2002 12:14 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: [DATA STEP]
>
>
> Zubrowska,
>
> The underlying theme in all the replies is to use the NOTSORTED option. Mine
> uses the same but I am providing a continuation to yesterday's theme on the
> RETAIN. This solution merely demonstrates that an automatically retained
> variable (with the exception of _n_ and _error_) can be initialized with a
> value in a RETAIN and put to good use in the ubiquitous DOW:
>
> data q ;
>    input col $ ;
>    cards ;
> one
> one
> one
> two
> two
> two
> one
> one
> three
> three
> run ;
>
> data w ;
>    retain _iorc_ 1 ;
>    do until (last.col) ;
>       set q ;
>       by col notsorted ;
>       colid = trim(col)||"_"||put(_iorc_,best.-L) ;
>       output ;
>    end ;
>    _iorc_ + 1 ;
> run ;
>
> Kind Regards,
>
> Venky
> #****************************************#
> # E-mail: swovcc@hotmail.com             #
> # Phone: (734) 622-1963                  #
> #****************************************#
>
>
> -----Original Message-----
> From: zubrowka [mailto:zubrowka@gmx.net]
> Sent: Wednesday, May 01, 2002 11:39 AM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: [DATA STEP]
>
>
> Hi all,
>
> here is my small problem.
> I have a table like that.
>
> obs col
> 1 one
> 2 one
> 3 one
> 4 two
> 5 two
> 6 two
> 7 one
> 8 one
> 9 three
> 10 three
>
> I want to obtain this
>
>
> obs col colid
> 1 one one_1
> 2 one one_1
> 3 one one _1
> 4 two two_2
> 5 two two_2
> 6 two two_2
> 7 one one_3
> 8 one one_3
> 9 three three_4
> 10 three three_4
> etc
>
> Obviously i cant do a proc sort by col because i will loose the order
> of data, which is important. I didn't manage to find a solution. How
> can i solve that.
>
> Thanxs in advance for replying.
>
>
> Zubrowka
>
