Date: Wed, 24 Jan 2007 16:35:35 -0500
Reply-To: "Lamias, Mark (CDC/CCID/OD) (CTR)" <bnz6@CDC.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Lamias, Mark (CDC/CCID/OD) (CTR)" <bnz6@CDC.GOV>
Subject: Re: random sampling by IDs (not observations) from a huge database
Content-Type: text/plain; charset="us-ascii"
Here's an idea, if I understand your question correctly. You could
just generate random uniform integers in the range of your IDs and
output these random numbers to a dataset. Next, select observations
from the dataset you are trying to sample, keeping only the
observations whose ID numbers appear in the random number dataset you
just generated.
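
A minimal sketch of those two steps, under assumptions not stated in
the thread: the large dataset is called work.big, id is numeric, and
the IDs run from 1 to 1000000 (all hypothetical names and values). The
hash lookup in step 2 is one possible way to do the selection in a
single pass without sorting; it is not necessarily what was meant
above. Note that if the IDs have gaps, some random integers will match
no ID and fewer than 10 IDs will come back.

```sas
/* Hypothetical settings: adjust to the real ID range and sample size. */
%let min_id = 1;
%let max_id = 1000000;
%let n_ids  = 10;

/* Step 1: draw &n_ids distinct random integers in [min_id, max_id].  */
/* Assumes the range holds at least &n_ids integers, otherwise the    */
/* loop never ends.                                                   */
data work.sample_ids (keep=id);
    call streaminit(20070124);            /* arbitrary seed */
    array picked[&n_ids] _temporary_;     /* IDs drawn so far */
    n = 0;
    do while (n < &n_ids);
        id = &min_id + floor(rand('uniform') * (&max_id - &min_id + 1));
        /* Keep the draw only if it has not been picked already. */
        if whichn(id, of picked[*]) = 0 then do;
            n = n + 1;
            picked[n] = id;
            output;
        end;
    end;
run;

/* Step 2: one pass over the big dataset, no sort. Keep every         */
/* observation whose id is in the sampled list.                       */
data work.sampled;
    if _n_ = 1 then do;
        declare hash h(dataset: 'work.sample_ids');
        h.defineKey('id');
        h.defineDone();
    end;
    set work.big;
    if h.check() = 0;                     /* 0 means the id was found */
run;
```

Rejecting repeated draws in step 1 keeps the sample to distinct IDs, so
each ID in the range has the same chance of being picked.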
Mark J. Lamias
SAIC Statistical Consultant
Office of Informatics
National Center for Preparedness, Detection, and Control of Infectious
Diseases
Coordinating Center for Infectious Diseases
US Centers for Disease Control and Prevention
w: (404) 639-0747
m: (404) 543-1394
f: (404) 639-1391
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Ross
Sent: Friday, January 19, 2007 6:41 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: random sampling by IDs (not observations) from a huge database
Hi,
Suppose you have a huge database where for the variable ID you have
multiple observations. You want to randomly sample 10 IDs from all the
IDs in the database (so that each ID has the same probability of being
selected), and create a file with the observations of these
sorted IDs.
I think proc surveyselect can help, but in order to do that it sorts the
database, and that will take too much time.
Thanks!
Ross