| Date: | Wed, 29 Apr 2009 04:48:41 -0700 |
| Reply-To: | valkrem@yahoo.com |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Val Krem <valkrem@YAHOO.COM> |
| Subject: | Re: Split data |
| In-Reply-To: | <b7a7fa630904280651g6ee374a2tbb0fb938cedbe6fb@mail.gmail.com> |
| Content-Type: | text/plain; charset=utf-8 |
Hi Joe and others,
I want to split one huge data set into two parts randomly by
a given variable, say by age, in the following data set.
Id  Age
 1   26
 2   35
 3   26
 4   27
 5   35
 6   24
 7   21
 8   20
 9   23
10   19
The two data sets should have approximately equal numbers of
observations. The output might look like the following.
Group A
Age  id
 26   1
 26   3
 27   4
 24   6
 19  10
Group B
Age  id
 35   2
 35   5
 21   7
 20   8
 23   9
As suggested, PROC SURVEYSELECT may be an appealing approach, but I do not know much about it. Could you help here?
Thanks in advance
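
One way to read the desired output above is that every record with the same age lands in the same group. A minimal SAS sketch under that assumption follows. The input data set name HAVE, the intermediate names (AGES, AGE_SPLIT, HAVE_SORTED), and the seed value are all hypothetical. It draws half of the distinct ages with PROC SURVEYSELECT and uses the Selected flag that OUTALL adds to route the records.

```sas
/* Assumption: WORK.HAVE holds the variables id and age. */

/* One row per distinct age. */
proc sort data=have(keep=age) out=ages nodupkey;
   by age;
run;

/* Randomly pick half of the distinct ages.
   OUTALL keeps every age and adds Selected (1 = picked, 0 = not picked). */
proc surveyselect data=ages out=age_split
                  method=srs samprate=0.5 outall seed=12345;
run;

/* Sort the full data so it can be merged with the per-age flag. */
proc sort data=have out=have_sorted;
   by age;
run;

/* Route each record by the flag of its age group. */
data groupA groupB;
   merge have_sorted age_split(keep=age Selected);
   by age;
   if Selected = 1 then output groupA;
   else output groupB;
run;
```

Because whole age groups move together, the split balances the number of distinct ages; the record counts in the two outputs match only roughly when some ages occur more often than others.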
--- On Tue, 4/28/09, Joe Matise <snoopy369@GMAIL.COM> wrote:
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: Split data
To: SAS-L@LISTSERV.UGA.EDU
Date: Tuesday, April 28, 2009, 6:51 AM
You could look at PROC SURVEYSELECT, and select out half of the records...
or sort by age and then have an alternating variable (say,
dataset=ifn(ranuni(7)<0.5,1,2); ) which will get you approximately even
distribution but not exact (perhaps off by 1).
-Joe
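
The DATA step idea above could be sketched as follows, assuming an input data set named HAVE (a hypothetical name, as are GROUPA and GROUPB). Each record is assigned independently of the others, so records sharing an age can end up in different groups, and the two group sizes are only approximately equal.

```sas
/* Assumption: WORK.HAVE holds the variables id and age. */
data groupA groupB;
   set have;
   /* ranuni(7) returns a uniform(0,1) draw; the fixed seed 7 makes the split repeatable. */
   dataset = ifn(ranuni(7) < 0.5, 1, 2);
   if dataset = 1 then output groupA;
   else output groupB;
run;
```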
On Tue, Apr 28, 2009 at 6:58 AM, Val Krem <valkrem@yahoo.com> wrote:
> Dear SAS users,
>
> I want to split one huge data set into two parts randomly by
> a given variable, say by age, in the following data set.
>
> Id  Age
>  1   26
>  2   35
>  3   26
>  4   27
>  5   35
>  6   24
>  7   21
>  8   20
>  9   23
> 10   19
>
> The two data sets should have approximately
> equal number of observations. Remember there are eight unique age groups.
>
> How do I do that?