Date: Wed, 1 Jul 2009 11:34:56 -0500
Reply-To: Joe Matise <snoopy369@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: Using SURVEYSELECT for random assignment?
In-Reply-To: <941871A13165C2418EC144ACB212BDB0BEB433@dshsmxoly1504g.dshs.wa.lcl>
Content-Type: text/plain; charset=ISO-8859-1
Maybe I'm missing something, but I don't see how that would make it
possible for two people with consecutive (but different) pts_over_50 values
to be switched in order?
-Joe
On Wed, Jul 1, 2009 at 11:08 AM, Nordlund, Dan (DSHS/RDA) <
NordlDJ@dshs.wa.gov> wrote:
> > -----Original Message-----
> > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Joe
> > Matise
> > Sent: Tuesday, June 30, 2009 8:50 PM
> > To: SAS-L@LISTSERV.UGA.EDU
> > Subject: Re: Using SURVEYSELECT for random assignment?
> >
> > If it were me, with my account groups, I'd not use PROC SURVEYSELECT
> > directly for this, because of rule 4. I am not a PROC SURVEYSELECT expert,
> > but reading over the sampling methods, none of them do what that requests.
> > Specifically, SYS takes every other observation, exactly; not one from each
> > pair randomly. It might work perfectly fine for what you want, but I don't
> > think it is precisely what 4. calls for.
>
> Couldn't one solve this problem by adding a uniform random number to the
> dataset, adding the random variable to the end of the BY statement in proc
> sort, and then using SURVEYSELECT?
>
> Hope this is helpful,
>
> Dan
>
>
>
> Daniel J. Nordlund
> Washington State Department of Social and Health Services
> Planning, Performance, and Accountability
> Research and Data Analysis Division
> Olympia, WA 98504-5204
>
> >
> > You could do this pretty trivially in a data step, of course. I'll actually
> > use PROC SURVEYSELECT at the end just for fun, though you could just as
> > easily (and probably easier) do this in a data step.
> >
> > data docs ;
> > input
> > @1 doc $char4.
> > @7 clinic $char9.
> > @17 pts_over_50
> > @23 category $char6.
> > ;
> > datalines ;
> > bob   central   398   small
> > mary  central   400   small
> > erin  central   505   small
> > john  central   1000  medium
> > lori  central   1400  medium
> > suzy  central   2000  large
> > raul  central   2500  large
> > jill  central   3100  large
> > roy   central   5000  large
> > joe   central   8000  large
> > jan   central   8000  large
> > jim   central   8000  large
> > stan  central   8500  large
> > jack  central   8500  large
> > carl  eastside  391   small
> > jane  eastside  4000  large
> > jess  eastside  3999  large
> > ;
> > run ;
> >
> >
> > proc sort data=docs;
> > by clinic category DESCENDING pts_over_50 ;
> > *taking advantage of large,medium,small being in correct order already -
> > if numeric, then DESCENDING category;
> > run ;
> >
> >
> > data docs_stratified / view=docs_stratified;
> > set docs;
> > by clinic category descending pts_over_50;
> > if first.category then do; *or possibly first.clinic?;
> > strata = 0;
> > rowcount = 0;
> > end;
> > rowcount+1;
> > strata + mod(rowcount,2);
> > run;
> >
> > proc surveyselect data=docs_stratified out=tx_docs rate=0.5 seed=987654;
> > strata clinic category strata;
> > run;
> >
> > *one way to do it using datastep - this does NOT guarantee the extra strata
> > have a Tx selection, like the PROC SURVEYSELECT does;
> >
> > data tx_docs cn_docs;
> > set docs_stratified;
> > retain selected;
> > by clinic category strata;
> > if first.strata then do;
> > if round(ranuni(987654),1) = 1 then selected = 1;
> > else selected = 0;
> > end;
> > else selected = 1 - selected;
> > if selected = 1 then output tx_docs;
> > else output cn_docs;
> > run;
> >
> > -Joe
> >
> >
> > On Tue, Jun 30, 2009 at 7:56 PM, Pardee, Roy <pardee.r@ghc.org> wrote:
> >
> > > Hey All,
> > >
> > > I've got a set of about 100 physicians I need to randomize into treatment &
> > > control conditions, blocking on clinic and the # of patients in their panel
> > > over age 50. The instructions I have are:
> > >
> > > 1. Assign each doc into a large/medium/small panel size
> > > category, on the basis of # of patients over age 50
> > > (pts_over_50).
> > >
> > > 2. Add some extra, pretend randomization slots to each
> > > clinic/category for future docs to slot into if/when new
> > > docs are added later.
> > >
> > > 3. Sort the docs by clinic & then descending pts_over_50.
> > >
> > > 4. Proceeding from the top of the list, randomize two
> > > docs at a time--so for example, flip a coin and if the
> > > result is heads, the first doc becomes tx & second is
> > > control, if result is tails, then first doc is control &
> > > second is tx.
> > >
> > > I'm wondering if SURVEYSELECT can do the basic randomization for me & save
> > > me the row-by-row programming? (I'm content to deal w/adding the extra slots
> > > in step 2).
> > >
> > > Looking at the sample in the help file entry for SURVEYSELECT, the below
> > > call seems promising.
> > >
> > > * =============================== ;
> > >
> > > data docs ;
> > > input
> > > @1 doc $char4.
> > > @7 clinic $char9.
> > > @17 pts_over_50
> > > @23 category $char6.
> > > ;
> > > datalines ;
> > > bob   central   398   small
> > > mary  central   400   small
> > > erin  central   505   small
> > > john  central   1000  medium
> > > lori  central   1400  medium
> > > suzy  central   2000  large
> > > roy   central   5000  large
> > > carl  eastside  391   small
> > > jane  eastside  4000  large
> > > jess  eastside  3999  large
> > > ;
> > > run ;
> > >
> > > proc sort ;
> > > by clinic DESCENDING pts_over_50 ;
> > > run ;
> > >
> > > proc print ;
> > > run ;
> > >
> > > proc surveyselect data = docs method = sys rate = 0.5 seed = 987654 out =
> > > tx_docs ;
> > > strata clinic category ;
> > > control pts_over_50 ;
> > > run ;
> > >
> > > proc print data = tx_docs ;
> > > run ;
> > >
> > > * =============================== ;
> > >
> > > (So--anybody in the tx_docs dataset gets assigned to the treatment
> > > condition, and the balance are controls.)
> > >
> > > Do any of you understand SURVEYSELECT well enough to say whether that call
> > > is equivalent to the instructions above? Is there a better way? Or should
> > > I just suck it up & try to literally carry out the instructions I have?
> > >
> > > I realize this is a long-shot, but figured I'd try...
> > >
> > > Many thanks!
> > >
> > > -Roy