LISTSERV at the University of Georgia
Date:         Sun, 17 Jun 2007 22:34:13 -0700
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: Randomly picking a procedurecode for each doctor
In-Reply-To:  <7367b4e20706150829k500fbdeah5d957dccce2bb2ed@mail.gmail.com>
Content-Type: text/plain; format=flowed

datanull@GMAIL.COM sagely replied:
>
>I think, you want to use STRATA DOCID and N=1.  If I understand correctly.
>
>data work.docs;
>   input docid:$6. procedurecode:$5. @@;
>   cards;
>001111 21740 001111 21740 001111 21740 001111 21740 001111 21740
>001111 20680 001111 20680 001112 50240 001112 50240 001112 50240
>001112 52601 001112 52601 001112 55845 001113 48140 001113 48140
>001113 48140 001113 48140 001113 48140 001113 48150 001113 48150
>001113 47130 001113 47135 001114 53500 001114 53500 001114 53500
>001114 53450 001114 53450 001115 21045 001115 21045 001115 21045
>001115 21040 001115 21040
>;;;;
>   run;
>proc surveyselect
>      seed=20062001
>      data=docs
>      method=SRS
>      n=1
>      out=surgerycase1;
>   strata docid;
>   run;
>proc print;
>   run;
>
>
>On 6/15/07, Annie Lee <hummingbird10111@hotmail.com> wrote:
>>Hi,
>>
>>I have a data set with doctor's id (multiple records for each doctor) and
>>procedurecode.
>>
>>This time, I would like to pick one randomly selected procedurecode for
>>each (unique) doctorid in order to avoid any bias in selecting
>>procedurecode.
>>
>>I tried doing this using proc surveyselect.
>>one question I have regarding using proc surveyselect-- is there a way not
>>to specify the sampsize in advance?
>>In the future, I will have different number of records every quarter in
>>this data set and I do not want to calculate the unique number of docid
>>beforehand to assign the sampsize manually.
>>
>>I would appreciate any help. Thank you. -Eunice
>>
>>
>>
>>proc surveyselect data = surgerycase method = SRS rep = 1
>>   sampsize = ?? out = surgerycase1;
>>id _all_;
>>run;
>>
>>
>>
>>Results I would like to have:
>>
>>docid procedurecode (it is randomly picked so it could be any
>>procedurecode)
>>
>>001111 21740
>>001112 50240
>>001113 48140
>>001114 53500
>>001115 21045
>>
>>data set:
>>
>>docid procedurecode
>>
>>001111 21740
>>001111 21740
>>001111 21740
>>001111 21740
>>001111 21740
>>001111 20680
>>001111 20680
>>001112 50240
>>001112 50240
>>001112 50240
>>001112 52601
>>001112 52601
>>001112 55845
>>001113 48140
>>001113 48140
>>001113 48140
>>001113 48140
>>001113 48140
>>001113 48150
>>001113 48150
>>001113 47130
>>001113 47135
>>001114 53500
>>001114 53500
>>001114 53500
>>001114 53450
>>001114 53450
>>001115 21045
>>001115 21045
>>001115 21045
>>001115 21040
>>001115 21040
>>

D0 has supplied the solution I had in mind.

I just want to add one thing. Because of his construction, the input data set is *already* sorted on DOCID, the stratum variable. If your data are not yet sorted (or indexed) on DOCID, you would need to sort/index on this variable first. Otherwise, the proc will complain. A lot.
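
A minimal sketch of that sort step, assuming the unsorted records sit in a data set called SURGERYCASE (the name used in the original question). The output name SURGERYCASE_SORTED is made up for illustration; the SURVEYSELECT options are copied from the quoted reply.

```sas
/* Sort on the stratum variable so STRATA DOCID sees the data in order. */
/* SURGERYCASE_SORTED is a hypothetical name chosen for this sketch.    */
proc sort data=surgerycase out=surgerycase_sorted;
   by docid;
run;

/* Same call as in the quoted reply, pointed at the sorted copy. */
proc surveyselect
      seed=20062001
      data=surgerycase_sorted
      method=SRS
      n=1
      out=surgerycase1;
   strata docid;
run;
```

Writing the sorted copy to a separate data set leaves the original untouched; dropping the OUT= option on PROC SORT would sort SURGERYCASE in place instead.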

When doing stratified sampling, PROC SURVEYSELECT uses the N= option to tell how many records to pick in each stratum, rather than for the whole data set. So N=1 is exactly what you asked for. It may *not* be what you should really be thinking about...
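
One quick way to confirm the per-stratum behavior, sketched under the assumption that the sample was written to SURGERYCASE1 as in the quoted code (the column alias N_SELECTED is made up for illustration):

```sas
/* List any DOCID that did not end up with exactly one selected record. */
/* With STRATA DOCID and N=1, this query should return no rows.         */
proc sql;
   select docid, count(*) as n_selected
      from surgerycase1
      group by docid
      having count(*) ne 1;
quit;
```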

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330


