| Date: | Thu, 3 Feb 2011 12:22:35 -0800 |
| Reply-To: | Sterling Paramore <gnilrets@GMAIL.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Sterling Paramore <gnilrets@GMAIL.COM> |
| Subject: | Re: Out of memory? |
| In-Reply-To: | <8DE344F230A946409D021626F849A048219C84C7BB@WAFEDIXMCMS12.corp.weyer.pri> |
| Content-Type: | text/plain; charset=ISO-8859-1 |
|---|
Yikes! It's more like:
Claim_id = 10,000,000
Claim_Diag_primary = 20,000
Claim_Prim_Hosp_Proc_Cd = 15,000
Claim_Bill_Type = 500
That would end up being 1.5 * 10^18 possible combinations!
I think I'll go with the SQL.
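A minimal sketch of the PROC SQL equivalent, assuming the same input table and variable names as the PROC MEANS step quoted below. It is an illustration only, not necessarily the exact query that was run. GROUP BY yields one row per combination that actually occurs in the data, and PROC SQL treats missing values as a group of their own, which mirrors the MISSING option:

```sas
/* Sketch only: sums the six analysis variables for each observed
   combination of the four grouping columns. */
proc sql;
  create table WORK._BMClaims as
  select Claim_Id,
         Claim_Diag_Primary,
         Claim_Prim_Hosp_Proc_Cd,
         Claim_Bill_Type,
         sum(Claim_Counter)          as Claim_Counter,
         sum(ClaimLine_Counter)      as ClaimLine_Counter,
         sum(ClaimLine_Paid_Amt)     as ClaimLine_Paid_Amt,
         sum(ClaimLine_COB_Paid_Amt) as ClaimLine_COB_Paid_Amt,
         sum(Claim_Interest_Amt)     as Claim_Interest_Amt,
         sum(ClaimLine_Savings_Amt)  as ClaimLine_Savings_Amt
  from WORKERR._BMClaims_Concat
  group by Claim_Id, Claim_Diag_Primary, Claim_Prim_Hosp_Proc_Cd, Claim_Bill_Type;
quit;
```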
On Thu, Feb 3, 2011 at 12:11 PM, Jordan, Lewis <Lewis.Jordan@weyerhaeuser.com> wrote:
> One way would be to subset the data. Since the class statement essentially
> creates a unique classification for each combination of Claim_Id,
> Claim_Diag_Primary, Claim_Prim_Hosp_Proc_Cd, and Claim_Bill_Type, you could
> create a new variable called "my_id" and then process the data one subset at
> a time. For example, let's say you have the following numbers of categories
> for each variable on the "class" statement:
>
> Claim_id = 5
> Claim_Diag_primary = 10
> Claim_Prim_Hosp_Proc_Cd = 3
> Claim_Bill_Type = 4
>
> You would then have at most 5*10*3*4 = 600 unique classifications. Since you
> already sorted the data, you could do the following to create your new
> variable:
>
> /*
> I'm assuming your sort was of the form:
> proc sort; by Claim_Id Claim_Diag_Primary Claim_Prim_Hosp_Proc_Cd
> Claim_Bill_Type; run;
> */
>
> data new;
>   set old;
>   by Claim_Id Claim_Diag_Primary Claim_Prim_Hosp_Proc_Cd Claim_Bill_Type;
>   /* Sum statement: my_id starts at 0, is retained across rows, and is
>      incremented on the first row of each BY group. */
>   if first.Claim_Bill_Type then my_id + 1;
> run;
>
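> %macro sums();
>
> %do i=1 to 600;
>
> /* The BY statement keeps the four classification variables in the output
>    row; without it each appended row would contain only the sums. */
> proc means data=new(where=(my_id=&i)) nway noprint missing;
> by Claim_Id Claim_Diag_Primary Claim_Prim_Hosp_Proc_Cd Claim_Bill_Type;
> var Claim_Counter ClaimLine_Counter ClaimLine_Paid_Amt
> ClaimLine_COB_Paid_Amt
> Claim_Interest_Amt ClaimLine_Savings_Amt;
> output out = WORK._BMClaims (drop = _TYPE_ _FREQ_) sum()=;
> run;
>
> proc append data=work._BMClaims base=all_sums;
> run;
>
> %end;
>
> %mend sums;
>
> %sums()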
>
>
>
>
>
> *****************************
> Lewis Jordan
> Weyerhaeuser:
> Southern Timberlands R&D
> Cell (Primary): 662-889-4514
> Office: 662-245-5227
> lewis.jordan@weyerhaeuser.com
> *****************************
>
> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
> Sterling Paramore
> Sent: Thursday, February 03, 2011 1:55 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Out of memory?
>
> Dear SAS-L,
>
> I assume the following is failing because the system is running out of
> memory: if I run it with the OBS= option, it completes just fine. (Way to
> go, SAS: not only do you not tell me what the problem is, this doesn't even
> register as a real ERROR, only a WARNING.)
>
> To solve the problem, I've tried first sorting the dataset (6 minutes),
> then running proc means with a BY statement (7 minutes). I also tried using
> proc sql, and it required only 5 minutes.
>
> Any other ideas? I'd prefer not to have to rewrite a bunch of my proc
> means as sql.
>
> Thanks,
> Sterling
>
>
>
> proc means noprint missing nway data = WORKERR._BMClaims_Concat;
>   var Claim_Counter ClaimLine_Counter ClaimLine_Paid_Amt ClaimLine_COB_Paid_Amt
>       Claim_Interest_Amt ClaimLine_Savings_Amt;
>   class Claim_Id Claim_Diag_Primary Claim_Prim_Hosp_Proc_Cd Claim_Bill_Type;
>   output out = WORK._BMClaims (drop = _TYPE_ _FREQ_) sum()=;
> run;
>
> NOTE: The SAS System stopped processing this step because of errors.
> NOTE: There were 70619390 observations read from the data set
> WORKERR._BMCLAIMS_CONCAT.
> WARNING: The data set WORK._BMCLAIMS may be incomplete. When this step was
> stopped there were 0 observations and 10 variables.
> NOTE: PROCEDURE MEANS used (Total process time):
> real time 2:23.15
> cpu time 3:08.01
>
>
>