LISTSERV at the University of Georgia
Date:   Wed, 29 Jan 2003 13:58:55 -0500
Reply-To:   Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:   Re: XMLMAP and large files
Comments:   To: Geoff <gcd_smith@HOTMAIL.COM>
Content-Type:   text/plain; charset="iso-8859-1"

I don't have any magic settings for transforming XML documents of some arbitrary form into SAS datasets of some arbitrary form. When you say

"The XMLMAP functionality is used to define the SAS datasets from the XML files as they have a complex structure."

do you mean that the SAS datasets have complex structures or that the XML files have complex structures? Sounds like a bit of both. We have seen many efficiency issues in situations where we are mapping files with repeating segments into SAS 'flat files' with repeating groups of variables. That means that in each case of repeating segments the number of repeating groups of variables has to equal the maximum number of repeating segments per entity. This transformation often creates tens of thousands of variables and leaves acres of space devoted to missing values.
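To make that arithmetic concrete, here is a minimal Python sketch (not SAS) of how the width and sparsity of such a flat file follow from the segment counts. The function name, the sample counts, and the figure of 10 variables per segment are made up for illustration; they do not come from Geoff's data.

```python
from typing import Dict, List


def wide_layout_cost(segments_per_entity: List[int], vars_per_segment: int) -> Dict[str, float]:
    """Estimate the size of a flat layout with one variable group per repeating segment.

    segments_per_entity: number of repeating segments observed for each entity.
    vars_per_segment: number of variables needed to hold one segment.
    """
    n_entities = len(segments_per_entity)
    # Every entity gets as many variable groups as the largest entity needs.
    max_segments = max(segments_per_entity, default=0)
    columns = max_segments * vars_per_segment
    total_cells = n_entities * columns
    filled_cells = sum(segments_per_entity) * vars_per_segment
    missing_fraction = 1 - filled_cells / total_cells if total_cells else 0.0
    return {
        "columns": columns,
        "total_cells": total_cells,
        "missing_fraction": missing_fraction,
        # Rows needed if each segment were stored as its own row instead.
        "long_rows": sum(segments_per_entity),
    }


# Hypothetical example: most entities have one or two segments, one has fifty.
result = wide_layout_cost([1, 2, 1, 50], vars_per_segment=10)
print(result)
```

In this made-up case a single entity with fifty segments forces 500 variables onto every row, and most of the resulting cells hold missing values.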

It should not take too much data modelling to estimate the size and scope of the result of a transformation. In some cases it makes sense to map XML to a relational database. SAS can then remap what information remains into any required data structure.
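As a rough sketch of the relational alternative, the Python example below loads a small XML document into two tables, one row per entity and one row per repeating segment, so that no space is reserved for segments that do not exist. The element names, attributes, and table layout are hypothetical, and SQLite stands in for whatever relational database is actually available.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document: each entity carries a variable number of segments.
SAMPLE_XML = """<entities>
  <entity id="1" name="A">
    <segment code="x" value="10"/>
    <segment code="y" value="20"/>
  </entity>
  <entity id="2" name="B">
    <segment code="x" value="30"/>
  </entity>
</entities>"""


def load_relational(xml_text: str, conn: sqlite3.Connection) -> None:
    """Store entities and their repeating segments in parent and child tables."""
    conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE segment ("
        "entity_id INTEGER REFERENCES entity(id), seq INTEGER, code TEXT, value REAL)"
    )
    root = ET.fromstring(xml_text)
    for entity in root.iter("entity"):
        entity_id = int(entity.get("id"))
        conn.execute("INSERT INTO entity VALUES (?, ?)", (entity_id, entity.get("name")))
        # One child row per repeating segment, numbered within its entity.
        for seq, seg in enumerate(entity.findall("segment"), start=1):
            conn.execute(
                "INSERT INTO segment VALUES (?, ?, ?, ?)",
                (entity_id, seq, seg.get("code"), float(seg.get("value"))),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_relational(SAMPLE_XML, conn)
rows = conn.execute(
    "SELECT entity_id, COUNT(*) FROM segment GROUP BY entity_id ORDER BY entity_id"
).fetchall()
print(rows)
```

From tables like these, a query or a later SAS step can build whatever flat structure is really required, using only the segments that are present.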

Vendors now provide specialized caching systems for XML parsing. That suggests to me that XML parsing has a long way to go before it matches the efficiency of other database transfer methods. XML standards will probably have to change to get around intrinsic inefficiencies in database transfer rates. All of those tags have to affect storage and processing efficiency. Better mapping technologies can do only so much.

Sig

-----Original Message-----
From: Geoff [mailto:gcd_smith@HOTMAIL.COM]
Sent: Wednesday, January 29, 2003 11:46 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: XMLMAP and large files

I am loading XML files into SAS v8.2 (Solaris) using the new XMLMAP technology (downloaded from www.sas.com). The XMLMAP functionality is used to define the SAS datasets from the XML files as they have a complex structure.

It is possible to load in an XML file containing 1000 'records' but a file containing 10000 records causes SAS to hang or to terminate abnormally (core dump). Ideally, I need to load about 300,000 records although 50,000 would be acceptable.

I am also performing an XSLT transform on the XML file so that I can get around a problem caused by the XMLMAP functionality (this problem has been verified by SAS). I am currently using a Xalan-Java XSLT processor and unfortunately it is taking a very long time to perform the operation, about 17 hours for a 200Mb file. Does anyone know how to make this quicker? (I am using the SAX option).

Any help would be greatly appreciated.

Many thanks, Geoff

