|Date: ||Mon, 3 Oct 2005 13:58:06 -0500|
|Reply-To: ||"Nick ." <ni14@MAIL.COM>|
|Sender: ||"SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>|
|From: ||"Nick ." <ni14@MAIL.COM>|
|Subject: ||SURVEYSELECT_how to select samples|
|Content-Type: ||text/plain; charset="iso-8859-1"|
I am familiar with PROC SURVEYSELECT, but here is a question I need help with:
I have a dataset of say 100,000 records.
I want to split it RANDOMLY, WITHOUT REPLACEMENT, into two groups (or, to be more general, >= 2 groups).
So I will have two groups: one group will be used to build the predictive model and the other will be used for model validation. I know how to do this with data steps, but I am almost certain there is a way in SURVEYSELECT to cut a data set into >= 2 randomly split (without replacement) groups. No duplicates here!
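A minimal sketch of one way such a split might be done, assuming a hypothetical input data set named WORK.MYDATA and a 50/50 split (the data set names, the rate, and the seed are placeholders, not from the question). METHOD=SRS is simple random sampling without replacement, and OUTALL keeps every input record and adds a Selected flag (1 = sampled, 0 = not sampled), so no record can land in both groups:

```sas
/* Hypothetical names: work.mydata, work.flagged, work.build, work.validate */
proc surveyselect data=work.mydata   /* assumed input, e.g. 100,000 records  */
                  out=work.flagged
                  method=srs          /* simple random sampling, no replacement */
                  samprate=0.5        /* assumed 50/50 split                   */
                  seed=12345          /* arbitrary seed, for reproducibility   */
                  outall;             /* keep all records, add Selected = 0/1  */
run;

/* Route each record to exactly one group based on the Selected flag */
data work.build work.validate;
   set work.flagged;
   if Selected = 1 then output work.build;
   else output work.validate;
run;
```

For more than two groups, the same step could be repeated on the Selected = 0 records, with the sampling rate adjusted each time so the groups come out the intended size.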
As a side note, what if I had the same question as above but wanted to do it WITH REPLACEMENT? I am just curious about this; it is not something I need right now. And what does "with replacement" mean? I tried some code from the archives that samples WITH REPLACEMENT, but I don't get any duplicates in my sample. I thought replacement meant that one record may be chosen more than once. How do you tell SURVEYSELECT to do this? I should get duplicates when I do this, right?
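A hedged sketch of sampling with replacement, again assuming the hypothetical WORK.MYDATA data set and an arbitrary sample size of 1,000. With METHOD=URS (unrestricted random sampling, i.e. with replacement), the output data set by default holds one row per distinct selected record plus a NumberHits variable counting how many times that record was drawn, which may be why no duplicate rows appear. The OUTHITS option asks for one output row per draw instead:

```sas
/* Hypothetical names: work.mydata, work.with_repl */
proc surveyselect data=work.mydata
                  out=work.with_repl
                  method=urs          /* unrestricted random sampling = with replacement */
                  sampsize=1000       /* assumed number of draws                         */
                  seed=12345          /* arbitrary seed, for reproducibility             */
                  outhits;            /* one row per draw, so repeated records show up   */
run;
```

Without OUTHITS, the same draws are still made; a record chosen three times simply appears once with NumberHits = 3.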