Date: Fri, 9 Mar 2012 11:20:14 -0800
Reply-To: Fareeza Khurshed <fkhurshed@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Fareeza Khurshed <fkhurshed@GMAIL.COM>
Subject: Program Logic for subsetting dataset
Content-Type: text/plain; charset=ISO-8859-1
I'm doing some statistical analysis on data that requires a jackknife
procedure. Previously I used the 'Don't Be Loopy' method: create one very
large dataset, then use BY-group processing to get my results.
Unfortunately my dataset is now too big for the computers I have access to
(80GB HD). I've asked for a new computer, but I work for the government so
that'll take a while.
I need a solution for the first step of the process; the remaining
steps can be reused.
My data looks like:
Group ID random1 random2 random3
A 8934 ...
A 4567 ...
A 2343 ...
B 3452 ...
B 3467 ...
B 3342 ...
C 1234 ...
C 2232 ...
C 3234 ...
The following are my analysis steps:
1. For each group, get the current sample, which excludes one observation
(i.e., in the first sample Group A will contain IDs 4567 and 2343).
2. For each group, run my stats (PROC SQL code already written for this).
3. Append to result table
4. Go back to 1 and get the next sample and repeat until all are done.
In the second loop, the sample for Group A would be IDs 8934 and 2343.
...
And the last sample for Group C would be IDs 1234 and 2232.
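
The loop described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a tested solution: the dataset names HAVE, NUMBERED, SIZES, SAMPLE, STATS and RESULTS, the variable SEQ, and the macro name JACKKNIFE are all hypothetical, HAVE is assumed to be sorted by Group, and the PROC SQL that builds STATS is a stand-in for the real statistics code. Only one sample exists on disk at a time, so the very large stacked dataset is never created.

```sas
/* Assumption: HAVE is sorted by Group. SEQ = position within the group. */
data numbered;
  set have;
  by Group;
  if first.Group then seq = 0;
  seq + 1;                      /* sum statement: SEQ is retained */
run;

proc sql noprint;
  /* Size of each group, and the largest size = number of passes needed */
  create table sizes as
    select Group, count(*) as n
    from numbered
    group by Group;
  select max(n) into :maxn from sizes;
quit;
%let maxn = &maxn;              /* strips leading blanks from the value */

%macro jackknife;
  %do k = 1 %to &maxn;
    /* Pass k: drop the k-th observation of every group that has one.  */
    /* Groups with fewer than k observations are skipped in this pass. */
    proc sql;
      create table sample as
        select a.*
        from numbered as a inner join sizes as s
          on a.Group = s.Group
        where s.n >= &k and a.seq ne &k;

      /* Stand-in for the existing statistics code (hypothetical) */
      create table stats as
        select &k as dropped_seq, Group, count(*) as n_kept
        from sample
        group by Group;
    quit;

    /* RESULTS is created on the first pass and appended to afterwards */
    proc append base=results data=stats force;
    run;
  %end;
%mend jackknife;

%jackknife
```

Because the group sizes are recomputed on every run, unequal or changing group sizes need no code changes. If RESULTS is left over from an earlier run it would have to be deleted first, since PROC APPEND adds to whatever is already there.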
I need an efficient way to get my samples. I was thinking of something
using the POINT= option in a data step, but for some reason today my mind is
drawing a blank.
The group sizes will change over time, and the groups are not equally sized.
Any help is appreciated.
Fareeza