LISTSERV at the University of Georgia
Date:         Wed, 1 Jul 2009 11:34:56 -0500
Reply-To:     Joe Matise <snoopy369@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Joe Matise <snoopy369@GMAIL.COM>
Subject:      Re: Using SURVEYSELECT for random assignment?
In-Reply-To:  <941871A13165C2418EC144ACB212BDB0BEB433@dshsmxoly1504g.dshs.wa.lcl>
Content-Type: text/plain; charset=ISO-8859-1

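Maybe I'm missing something, but I don't see how that would make it possible for two people with consecutive (but different) pts_over_50 values to be switched in order? As far as I can tell, a variable at the end of the BY statement only breaks ties among rows that match on all of the earlier BY variables.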
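
A minimal sketch of what I mean, assuming Dan's suggestion amounts to a RANUNI tie-breaker appended to the sort key. The dataset name docs_u and the variable u are made up for illustration, and the rows are a subset of the sample data:

```sas
/* Sketch only: a few of the DOCS rows from this thread, plus a   */
/* uniform random number. DOCS_U and U are hypothetical names.    */
data docs_u;
   input doc $ clinic $ pts_over_50 category $;
   u = ranuni(987654);   /* uniform(0,1) tie-breaker */
   datalines;
suzy central 2000 large
joe central 8000 large
jan central 8000 large
jim central 8000 large
jack central 8500 large
;
run;

/* U is the last BY variable, so it is consulted only when every  */
/* earlier key, including pts_over_50, is equal.                  */
proc sort data=docs_u;
   by clinic category descending pts_over_50 u;
run;

proc print data=docs_u;
run;
```

If I have this right, jack (8500) always sorts first and suzy (2000) always sorts last; only joe, jan, and jim, who tie at 8000, can change places depending on the seed.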

-Joe

On Wed, Jul 1, 2009 at 11:08 AM, Nordlund, Dan (DSHS/RDA) <NordlDJ@dshs.wa.gov> wrote:

> > -----Original Message-----
> > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Joe Matise
> > Sent: Tuesday, June 30, 2009 8:50 PM
> > To: SAS-L@LISTSERV.UGA.EDU
> > Subject: Re: Using SURVEYSELECT for random assignment?
> >
> > If it were me, with my account groups, I'd not use PROC SURVEYSELECT
> > directly for this, because of rule 4. I am not a PROC SURVEYSELECT expert,
> > but reading over the sampling methods, none of them do what that requests.
> > Specifically, SYS takes every other observation, exactly; not one from each
> > pair randomly. It might work perfectly fine for what you want, but I don't
> > think it is precisely what 4. calls for.
>
> Couldn't one solve this problem by adding a uniform random number to the
> dataset, adding the random variable to the end of the BY statement in proc
> sort, and then use SURVEYSELECT
>
> Hope this is helpful,
>
> Dan
>
> Daniel J. Nordlund
> Washington State Department of Social and Health Services
> Planning, Performance, and Accountability
> Research and Data Analysis Division
> Olympia, WA 98504-5204
>
> > You could do this pretty trivially in a data step, of course. I'll actually
> > use PROC SURVEYSELECT at the end just for fun, though you could just as
> > easily (and probably easier) do this in a data step.
> >
> > data docs ;
> >   input
> >     @1    doc           $char4.
> >     @7    clinic        $char9.
> >     @17   pts_over_50
> >     @23   category      $char6.
> >   ;
> > datalines ;
> > bob   central   398   small
> > mary  central   400   small
> > erin  central   505   small
> > john  central   1000  medium
> > lori  central   1400  medium
> > suzy  central   2000  large
> > raul  central   2500  large
> > jill  central   3100  large
> > roy   central   5000  large
> > joe   central   8000  large
> > jan   central   8000  large
> > jim   central   8000  large
> > stan  central   8500  large
> > jack  central   8500  large
> > carl  eastside  391   small
> > jane  eastside  4000  large
> > jess  eastside  3999  large
> > ;
> > run ;
> >
> > proc sort data=docs;
> >   by clinic category DESCENDING pts_over_50 ;
> >   *taking advantage of large,medium,small being in correct order already -
> >    if numeric, then DESCENDING category;
> > run ;
> >
> > data docs_stratified / view=docs_stratified;
> >   set docs;
> >   by clinic category descending pts_over_50;
> >   if first.category then do; *or possibly first.clinic?;
> >     strata = 0;
> >     rowcount = 0;
> >   end;
> >   rowcount+1;
> >   strata + mod(rowcount,2);
> > run;
> >
> > proc surveyselect data=docs_stratified out=tx_docs rate=0.5 seed=987654;
> >   strata clinic category strata;
> > run;
> >
> > *one way to do it using datastep - this does NOT guarantee the extra strata
> >  have a Tx selection, like the PROC SURVEYSELECT does;
> >
> > data tx_docs cn_docs;
> >   set docs_stratified;
> >   retain selected;
> >   by clinic category strata;
> >   if first.strata then do;
> >     if round(ranuni(987654),1) = 1 then selected = 1;
> >     else selected = 0;
> >   end;
> >   else selected = 1 - selected;
> >   if selected = 1 then output tx_docs;
> >   else output cn_docs;
> > run;
> >
> > -Joe
> >
> > On Tue, Jun 30, 2009 at 7:56 PM, Pardee, Roy <pardee.r@ghc.org> wrote:
> >
> > > Hey All,
> > >
> > > I've got a set of about 100 physicians I need to randomize into treatment &
> > > control conditions, blocking on clinic and the # of patients in their panel
> > > over age 50. The instructions I have are:
> > >
> > > 1. Assign each doc a into large/medium/small panel size
> > >    category, on the basis of # of patients over age 50
> > >    (pts_over_50).
> > >
> > > 2. Add some extra, pretend randomization slots to each
> > >    clinic/category for future docs to slot into if/when new
> > >    docs are added later.
> > >
> > > 3. Sort the docs by clinic & then descending pts_over_50.
> > >
> > > 4. Proceeding from the top of the list, randomize two
> > >    docs at a time--so for example, flip a coin and if the
> > >    result is heads, the first doc becomes tx & second is
> > >    control, if result is tails, then first doc is control &
> > >    second is tx.
> > >
> > > I'm wondering if SURVEYSELECT can do the basic randomization for me & save
> > > me the row-by-row programming? (I'm content to deal w/adding the extra slots
> > > in step 2).
> > >
> > > Looking at the sample in the help file entry for SURVEYSELECT, the below
> > > call seems promising.
> > >
> > > * =============================== ;
> > >
> > > data docs ;
> > >   input
> > >     @1    doc           $char4.
> > >     @7    clinic        $char9.
> > >     @17   pts_over_50
> > >     @23   category      $char6.
> > >   ;
> > > datalines ;
> > > bob   central   398   small
> > > mary  central   400   small
> > > erin  central   505   small
> > > john  central   1000  medium
> > > lori  central   1400  medium
> > > suzy  central   2000  large
> > > roy   central   5000  large
> > > carl  eastside  391   small
> > > jane  eastside  4000  large
> > > jess  eastside  3999  large
> > > ;
> > > run ;
> > >
> > > proc sort ;
> > >   by clinic DESCENDING pts_over_50 ;
> > > run ;
> > >
> > > proc print ;
> > > run ;
> > >
> > > proc surveyselect data = docs method = sys rate = 0.5 seed = 987654 out = tx_docs ;
> > >   strata clinic category ;
> > >   control pts_over_50 ;
> > > run ;
> > >
> > > proc print data = tx_docs ;
> > > run ;
> > >
> > > * =============================== ;
> > >
> > > (So--anybody in the tx_docs dataset gets assigned to the treatment
> > > condition, and the balance are controls.)
> > >
> > > Do any of you understand SURVEYSELECT well enough to say whether that call
> > > is equivalent to the instructions below? Is there a better way? Or should
> > > I just suck it up & try to literally carry out the instructions I have?
> > >
> > > I realize this is a long-shot, but figured I'd try...
> > >
> > > Many thanks!
> > >
> > > -Roy
> > >
> > > GHC Confidentiality Statement
> > >
> > > This message and any attached files might contain confidential information
> > > protected by federal and state law. The information is intended only for the
> > > use of the individual(s) or entities originally named as addressees. The
> > > improper disclosure of such information may be subject to civil or criminal
> > > penalties. If this message reached you in error, please contact the sender
> > > and destroy this message. Disclosing, copying, forwarding, or distributing
> > > the information by unauthorized individuals or entities is strictly
> > > prohibited by law.

