Date: Wed, 29 Oct 2008 17:04:06 -0500
Reply-To: Joe Matise <snoopy369@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: i/o errors on large data sets when merging
In-Reply-To: <166c7f070810291406n6ff71228q1e837275675c9e5b@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
If you're on a FAT32 drive, 4GB is the file size limit ... if you're on
NTFS, it's 16 TB (well over your size). If the original file is 33GB then
that's probably not the problem, unless the merge would yield a 16 TB file
(unlikely), or your work directory is on a FAT32 partition (hopefully
not?). I suggest assuming the index is corrupt and using the SQL hash as
other posters have suggested.
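
For reference, here is a rough sketch of what the hash route could look like,
using the data set and key names from your log. It assumes std_chg_code is
unique in the 147-row table (a hash keeps one row per key by default, so
duplicate keys would not behave like your MERGE) and that your SAS release
accepts the all: "yes" shortcut in defineData. It reads the big table
sequentially and never touches the index, so if the I/O error comes from the
data file itself rather than the index, this will fail the same way.

```sas
data pci_med.pat_anticoag_platelet_thrombo;
    /* Add the small table's variables to the PDV without reading any rows */
    if 0 then set pci_med.anticoag_platelet_thrombo;

    /* Load the 147-row lookup table into memory once */
    if _n_ = 1 then do;
        declare hash h(dataset: "pci_med.anticoag_platelet_thrombo");
        h.defineKey("std_chg_code");
        h.defineData(all: "yes");
        h.defineDone();
    end;

    /* Sequential pass over the large table: no BY statement, no index */
    set pci.pci_patbill_cleaned;

    /* Keep only rows whose key is found (same idea as "if a and b") */
    if h.find() = 0;
run;
```

Neither table needs to be sorted or indexed for this to work.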
-Joe
On Wed, Oct 29, 2008 at 4:06 PM, Augustina ogbonnaya <augustinao@gmail.com> wrote:
> unfortunately i deleted the files. but the large data set i am merging is
> about 33,506,952 KB and the small one is 113KB. I just pasted the error
> message i was getting. the error message is below.
> thanks.
>
> proc datasets library=pci;
> NOTE: Writing HTML Body file: sashtml2.htm
>
> Directory
>
> Libref         PCI
> Engine         V9
> Physical Name  R:\The Med Co\Premier Database\Data cut
> File Name      R:\The Med Co\Premier Database\Data cut
>
>                     Member
>  #  Name            Type    File Size  Last Modified
>
>  1  ALL_INDEX_DATA  DATA    580666368  17Oct08:01:00:51
> 68  PCI_RACE        DATA         5120  07Oct08:22:16:12
>
> 82   modify pci_patbill_cleaned;
> 83   index create std_chg_code;
> NOTE: Simple index std_chg_code has been defined.
> 84   run;
>
> NOTE: MODIFY was successful for PCI.PCI_PATBILL_CLEANED.DATA.
> 84 ! quit;
>
> NOTE: PROCEDURE DATASETS used (Total process time):
>       real time  1:22:26.32
>       cpu time   9:11.26
>
>
>
>
> 85   proc sort data=pci_med.anticoag_platelet_thrombo;
> 86   by std_chg_code;
> 87   run;
>
> NOTE: There were 147 observations read from the data set
>       PCI_MED.ANTICOAG_PLATELET_THROMBO.
> NOTE: The data set PCI_MED.ANTICOAG_PLATELET_THROMBO has 147
>       observations and 33 variables.
> NOTE: PROCEDURE SORT used (Total process time):
>       real time  0.09 seconds
>       cpu time   0.00 seconds
>
>
>
>
>
> 97   data pci_med.pat_anticoag_platelet_thrombo;
> 98   merge
> 99   pci.pci_patbill_cleaned (in=a)
> 100  pci_med.anticoag_platelet_thrombo (in=b);
> 101  by std_chg_code;
> 102  if a and b;
> 103  run;
>
> ERROR: An I/O error has occurred on file PCI.PCI_PATBILL_CLEANED.DATA.
> NOTE: The data step has been abnormally terminated.
> NOTE: The SAS System stopped processing this step because of errors.
> NOTE: There were 4367649 observations read from the data set
>       PCI.PCI_PATBILL_CLEANED.
> NOTE: There were 1 observations read from the data set
>       PCI_MED.ANTICOAG_PLATELET_THROMBO.
> WARNING: The data set PCI_MED.PAT_ANTICOAG_PLATELET_THROMBO may be
>          incomplete. When this step was stopped there were 0
>          observations and 42 variables.
> WARNING: Data set PCI_MED.PAT_ANTICOAG_PLATELET_THROMBO was not replaced
>          because this step was stopped.
> NOTE: DATA statement used (Total process time):
>       real time  16:28.09
>       cpu time   49.01 seconds
>
>
>
>
>
> On Wed, Oct 29, 2008 at 4:52 PM, Joe Matise <snoopy369@gmail.com> wrote:
>
>> How big is the file you're creating? Filesystem limits are very large,
>> but for some OS's it's in the gigabytes for a single file. Can you paste
>> the exact error message you get, and how big the .lck file is?
>>
>> Can you read in the original file (do some simple operation on it, like
>> data test; set (large dataset); by (byvar); x = 1; run; or something). Does
>> that still have an error?
>>
>> -Joe
>>
>>
>> On Wed, Oct 29, 2008 at 1:26 PM, Sassy <AugustinaO@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> i have been getting an i/o error when i try to merge one large data file
>>> (about 4,000,000 claim lines) with another that has 147 claim lines.
>>> the large dataset was indexed and the small dataset was sorted. the error
>>> message says that the large dataset has an i/o error. I have enough space
>>> on my disk, about 1.7TB.
>>> I'm pretty sure the original file was not corrupt before i started the
>>> merge. Does anyone have an idea what is going on?
>>>
>>
>>
>