LISTSERV at the University of Georgia
Date:         Thu, 30 Oct 2008 10:38:15 -0500
Reply-To:     Mary <mlhoward@avalon.net>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Mary <mlhoward@AVALON.NET>
Subject:      Re: i/o errors on large data sets when merging
Comments: To: Sassy <AugustinaO@GMAIL.COM>
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
              reply-type=original

Yes, it IS something to worry about, because it says that both datasets have a variable with the same name. You should either rename that variable before you run proc sql, or specify the variables individually rather than using the * notation, such as

select r1.std_chg_code as patbill_std_chg_code, r2.std_chg_code as platelet_std_chg_code,
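Spelled out as a complete statement, the idea might look like the sketch below. It reuses the table and column names from the question; it assumes std_chg_code is the only name the two tables share (the log warning mentions no other), and any further columns wanted from r2 would have to be listed explicitly.

```sas
/* Sketch only: keep every column of the large table and give the    */
/* small table's join key its own name, so no output name repeats.   */
/* Assumes std_chg_code is the only variable common to both tables.  */
proc sql;
  create table pci_med.pat_anticoag_platelet_thrombo as
  select r1.*,
         r2.std_chg_code as platelet_std_chg_code
         /* list any other r2 columns you need here, one by one */
  from pci.pci_patbill_clean as r1
       inner join pci_med.anticoag_platelet_thrombo as r2
    on r1.std_chg_code = r2.std_chg_code;
quit;
```

Because the join is on equality, platelet_std_chg_code always matches r1's std_chg_code here, so it can also simply be left out of the select list.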

One thing our hospital does in its database tables is to always put a three-letter prefix on every variable to show which table it comes from, such as vst_ for all fields in the visit table, pdx_ for all fields in the prescriptions table, and so forth. I liked it very much after I got used to it, because it ensures that when you join two tables together, every variable has a unique name.
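When the tables do not already follow such a convention, a similar effect can be had at join time with the RENAME= data set option. The following is a minimal sketch, not a tested solution: it reuses the names from the question, borrows platelet_std_chg_code as the new name, and again assumes std_chg_code is the only shared variable.

```sas
/* Sketch only: rename the key on the small table as it is read, so  */
/* r1.* and r2.* can both be selected without a name collision.      */
proc sql;
  create table pci_med.pat_anticoag_platelet_thrombo as
  select r1.*, r2.*
  from pci.pci_patbill_clean as r1
       inner join
       pci_med.anticoag_platelet_thrombo
         (rename=(std_chg_code=platelet_std_chg_code)) as r2
    on r1.std_chg_code = r2.platelet_std_chg_code;
quit;
```

The rename applies only while the table is being read for this query; the stored dataset keeps its original variable name.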

-Mary

----- Original Message -----
From: Sassy
To: SAS-L@LISTSERV.UGA.EDU
Sent: Thursday, October 30, 2008 9:50 AM
Subject: Re: i/o errors on large data sets when merging

hello all,

I have one more question about the proc sql syntax.

proc sql;
  create table pci_med.pat_anticoag_platelet_thrombo AS
  SELECT r1.*, r2.*
  FROM pci.pci_patbill_clean as r1
       inner join pci_med.anticoag_platelet_thrombo as r2
    on r1.std_chg_code=r2.std_chg_code;
QUIT;

I ran this, and in the log I got a warning saying: "Variable std_chg_code already exists on file pci_med.pat_anticoag_platelet_thrombo." Is this something I should be worried about?

On 30 Oct, 15:10, Sassy <Augusti...@gmail.com> wrote:
> Thanks guys this has been really helpful. I will look into the harsh also as
> this is the first time i'm hearing of it.
>
> On Oct 29, 6:04 pm, snoopy...@GMAIL.COM (Joe Matise) wrote:
> > If you're on a FAT32 drive, 4GB is the file size limit ... if you're on
> > NTFS, it's 16 TB (well over your size). If the original file is 33GB then
> > that's probably not the problem, unless the merge would yield a 16 TB file
> > (unlikely), or your work directory is on a FAT32 partition (hopefully
> > not?). I suggest assuming the index is corrupt and using the SQL hash as
> > other posters have suggested.
> >
> > -Joe
> >
> > On Wed, Oct 29, 2008 at 4:06 PM, Augustina ogbonnaya
> > <augusti...@gmail.com> wrote:
> > > unfortunately i deleted the files. but the large data set i am merging is
> > > about 33,506,952 KB and the small one is 113KB. I just pasted the error
> > > message i was getting. the error message is below.
> > > thanks.
> > >
> > > proc datasets library=pci;
> > > NOTE: Writing HTML Body file: sashtml2.htm
> > >
> > > Directory
> > >
> > > Libref         PCI
> > > Engine         V9
> > > Physical Name  R:\The Med Co\Premier Database\Data cut
> > > File Name      R:\The Med Co\Premier Database\Data cut
> > >
> > > #   Name            Member Type  File Size  Last Modified
> > >
> > > 1   ALL_INDEX_DATA  DATA         580666368  17Oct08:01:00:51
> > > 68  PCI_RACE        DATA         5120       07Oct08:22:16:12
> > >
> > > 82   modify pci_patbill_cleaned;
> > > 83   index create std_chg_code;
> > > NOTE: Simple index std_chg_code has been defined.
> > > 84   run;
> > >
> > > NOTE: MODIFY was successful for PCI.PCI_PATBILL_CLEANED.DATA.
> > > 84 ! quit;
> > >
> > > NOTE: PROCEDURE DATASETS used (Total process time):
> > >       real time           1:22:26.32
> > >       cpu time            9:11.26
> > >
> > > 85   proc sort data=pci_med.anticoag_platelet_thrombo;
> > > 86   by std_chg_code;
> > > 87   run;
> > >
> > > NOTE: There were 147 observations read from the data set
> > >       PCI_MED.ANTICOAG_PLATELET_THROMBO.
> > > NOTE: The data set PCI_MED.ANTICOAG_PLATELET_THROMBO has 147
> > >       observations and 33 variables.
> > > NOTE: PROCEDURE SORT used (Total process time):
> > >       real time           0.09 seconds
> > >       cpu time            0.00 seconds
> > >
> > > 97   data pci_med.pat_anticoag_platelet_thrombo;
> > > 98   merge
> > > 99   pci.pci_patbill_cleaned (in=a)
> > > 100  pci_med.anticoag_platelet_thrombo (in=b);
> > > 101  by std_chg_code;
> > > 102  if a and b;
> > > 103  run;
> > >
> > > ERROR: An I/O error has occurred on file PCI.PCI_PATBILL_CLEANED.DATA.
> > > NOTE: The data step has been abnormally terminated.
> > > NOTE: The SAS System stopped processing this step because of errors.
> > > NOTE: There were 4367649 observations read from the data set
> > >       PCI.PCI_PATBILL_CLEANED.
> > > NOTE: There were 1 observations read from the data set
> > >       PCI_MED.ANTICOAG_PLATELET_THROMBO.
> > > WARNING: The data set PCI_MED.PAT_ANTICOAG_PLATELET_THROMBO may be
> > >          incomplete. When this step was stopped there were 0
> > >          observations and 42 variables.
> > > WARNING: Data set PCI_MED.PAT_ANTICOAG_PLATELET_THROMBO was not replaced
> > >          because this step was stopped.
> > > NOTE: DATA statement used (Total process time):
> > >       real time           16:28.09
> > >       cpu time            49.01 seconds
> > >
> > > On Wed, Oct 29, 2008 at 4:52 PM, Joe Matise <snoopy...@gmail.com> wrote:
> > >> How big is the file you're creating? Filesystem limits are very large,
> > >> but for some OS's it's in the gigabytes for a single file. Can you paste
> > >> the exact error message you get, and how big the .lck file is?
> > >>
> > >> Can you read in the original file (do some simple operation on it, like
> > >> data test; set (large dataset); by (byvar); x = 1; run; or something). Does
> > >> that still have an error?
> > >>
> > >> -Joe
> > >>
> > >> On Wed, Oct 29, 2008 at 1:26 PM, Sassy <Augusti...@gmail.com> wrote:
> > >>> Hi all,
> > >>>
> > >>> i have been getting i/o error when i try to merge one large data file
> > >>> with (about 4,000,000 claim lines) and another with 147 claim lines
> > >>> and i keep getting an error message. the large dataset was index and
> > >>> the small dataset was sorted. the error message says that the large
> > >>> dataset has an i/o error. I have enough space on my disk about 1.7TB.
> > >>> I'm pretty sure the original file was not corrupt before i started the
> > >>> merge. Does anyone have an idea what is going on?

