LISTSERV at the University of Georgia
Date:         Sun, 16 Jan 2005 14:44:06 -0500
Reply-To:     "Zack, Matthew M." <MMZ1@CDC.GOV>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Zack, Matthew M." <MMZ1@CDC.GOV>
Subject:      Re: surveyselect question
Content-Type: text/plain; charset="us-ascii"

What if you randomly select patients and all their visits without PROC SURVEYSELECT?

* Sort patient visits by patient ID;

proc sort;
   by pt;
run;

* Generate a uniform random number for each patient;

data two(drop=rnseed);
   retain rnseed 6093141 rn;
   set;
   by pt;
   if (first.pt eq 1) then rn=uniform(rnseed);
   output two;
run;
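
UNIFORM is an older alias for RANUNI. Where the RAND function and CALL STREAMINIT are available, the same step could be sketched as below. This is only an illustration: VISITS and TWO_ALT are hypothetical data set names, and VISITS is assumed to be sorted by PT already.

```sas
* Assumption: VISITS is a hypothetical input data set, already sorted by PT;
data two_alt;
   set visits;
   by pt;
   retain rn;
   * Fix the seed once, before the first call to RAND;
   if _n_ eq 1 then call streaminit(6093141);
   * One draw per patient, carried to all of that patient's visits by RETAIN;
   if first.pt then rn = rand('uniform');
run;
```

The random numbers will differ from those produced by UNIFORM with the same seed, because the two functions use different generators.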

* Sort patients into random order (ascending uniform random number);
* and, within patient, by visit ID;

proc sort data=two;
   by rn pt visit;
run;

* Select about 50 [+- 2 visits so that range=48 to 52] total patient visits;
* Add five visits (possibly from different patients) after the 50 above are selected;
* where SBP=. or DBP=.;
* Assumes no patient has more than 4 visits, so that some patient ends within 48 to 52;

data visit50(drop=rn lstvisit nmissbp);
   retain lstvisit nmissbp 0;
   set two;
   by rn pt visit;
   select;
      when (lstvisit eq 0) do;
         if ((ABS(50-_n_) le 2) and (last.pt eq 1)) then lstvisit=1;
         output visit50;
      end;
      when (lstvisit eq 1) do;
         if ((sbp eq .) or (dbp eq .)) then do;
            nmissbp=nmissbp+1;
            if (nmissbp le 5) then output visit50;
            else lstvisit=2;
         end;
      end;
      otherwise stop;
   end;
run;

* Select about 20% of the patients in the input data set, with all their visits;
* Add five visits (possibly from different patients) after the above 20% are selected;
* where SBP=. or DBP=.;
* Patients with RN le 0.20 sort first, so STOP cannot cut the 20% sample short;

data visit20p(drop=rn nmissbp);
   retain nmissbp 0;
   set two;
   by rn pt visit;
   select;
      when (rn le 0.20) output visit20p;
      otherwise do;
         if ((sbp eq .) or (dbp eq .)) then do;
            nmissbp=nmissbp+1;
            if (nmissbp le 5) then output visit20p;
            else stop;
         end;
      end;
   end;
run;
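
If PROC SURVEYSELECT is preferred after all, one option is to sample the list of patients rather than the visits, and then merge the visits back in. The following is a minimal, untested sketch of that idea. VISITS, PATIENTS, PICKED, VISITS_S, and SAMPLE50 are hypothetical data set names, and N=13 is just the "around 50 observations" case at up to 4 visits per patient.

```sas
* Assumption: VISITS is a hypothetical name for the input data set;

* One row per patient;
proc sort data=visits(keep=pt) out=patients nodupkey;
   by pt;
run;

* Simple random sample of 13 patients (at most 52 visits at up to 4 visits each);
proc surveyselect data=patients out=picked method=srs n=13 seed=6093141;
run;

* Sort both data sets by patient ID before merging;
proc sort data=picked;
   by pt;
run;

proc sort data=visits out=visits_s;
   by pt visit;
run;

* Keep every visit that belongs to a selected patient;
data sample50;
   merge visits_s picked(in=insample);
   by pt;
   if insample;
run;
```

For the 20% sample, SAMPRATE=0.20 could replace N=13. Note that this samples 20% of the patients, which equals 20% of the observations only when every patient has the same number of visits.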

Matthew Zack

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Scott
Sent: Sunday, January 16, 2005 1:12 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: surveyselect question

Hi,

I've read various posts about SURVEYSELECT and random samples in the archives, but couldn't find the answer to my problem, thus this post...

Say I have a dataset:

PT VISIT SBP DBP, where

PT    = patient
VISIT = visit number, say 1 - 4, which may be incomplete for a given PT, e.g. could be 1; 1,2,4; 1,2,3; 1,3; etc.
SBP   = systolic blood pressure
DBP   = diastolic blood pressure
(both BPs could have missing values)

I'd like to sample this dataset as follows:

1. Sample has "around" say 50 observations in total.

2. Sample has say 20% of observations from input data set.

In both of these samples, *** ALL observations for a given PT are included ***, i.e. if PT 7 is one of the patients randomly selected, then all visits for that PT are included in the random sample.

3. #1 and #2 above, augmented by say 5 random observations where either SBP, DBP, or both have a missing value.

For #3, I don't care if I make two passes over the data, but one pass would be nice.

IOW, in "pseudocode":

1. If each PT had 4 visit records, I would have either 12 PTs (48 observations) or 13 PTs (52 observations) in the sample dataset, since I specified a sample size of around 50.

2. If each PT had 4 visit records, and the total input dataset is 1000 observations, I would have 200 observations in the sample dataset, comprised of 50 PTs with 4 visits each.

3.(1) 12 or 13 random patients, plus 5 observations where SBP, DBP, or both were missing.

3.(2) 50 random patients, plus 5 observations where SBP, DBP, or both were missing.

I've played with SURVEYSELECT, but can't figure out how to get all records for a given PT to be included in the output.

Note that this sampling is for QC tests of code algorithms, not for further statistical analyses of the resulting sample.

Thanks, Scott

