Date: Wed, 6 Oct 2010 09:43:46 -0600
Reply-To: Alan Churchill <alan.churchill@SAVIAN.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Alan Churchill <alan.churchill@SAVIAN.NET>
Subject: Re: Resubmit: Reading XML files via XML92 getting 0 observation
Content-Type: text/plain; charset="iso-8859-1"
Some SAX vs. DOM explanation:
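In short: a SAX parser streams through the file once and reports events
(element start, element end, text) in document order, while a DOM parser
loads the whole document into an in-memory tree that you can index into.
Here is a minimal sketch of the difference using Python's standard library
rather than SAS. The sample XML and the ClientCollector name are made up
for illustration, and I am not claiming this is how the XML92 engine is
implemented internally.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, made up for illustration only.
SAMPLE = "<clients><client id='1'/><client id='2'/><client id='3'/></clients>"


class ClientCollector(xml.sax.ContentHandler):
    """SAX style: the parser pushes events to us once, in document order."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        # We only see each element as it streams past; there is no way
        # to ask the parser for "the third client" directly.
        if name == "client":
            self.ids.append(attrs["id"])


# SAX: one sequential pass, low memory, no random access.
handler = ClientCollector()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
sax_ids = handler.ids

# DOM: the whole tree is held in memory, so nodes can be indexed.
dom = minidom.parseString(SAMPLE)
clients = dom.getElementsByTagName("client")
third_id = clients[2].getAttribute("id")

print(sax_ids, third_id)
```

With a streaming reader, getting to record N means reading everything
before it, which would be consistent with a note like "Access by
observation number not available".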
Do both XML files adhere to the same schema?
I have not worked with SAS XML Maps in a while, so I am just trying to
help with the XML side for now.
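One way to check whether two large files share the same structure without
eyeballing them is to collect the distinct element paths in each and diff
the two sets. An XMLMap selects data by path, so a path that is missing or
nested differently in the new file would be one possible explanation for
zero observations. Below is a rough sketch in Python (not SAS). The
element_paths helper and the two tiny in-memory documents are hypothetical
stand-ins for the real files, and namespaces, if present, will show up
inside the tag names.

```python
import io
import xml.etree.ElementTree as ET


def element_paths(source):
    """Return the set of distinct element paths (e.g. /a/b/c) in an XML source."""
    paths = set()
    stack = []
    # iterparse streams the file, so this also works for very large inputs.
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            paths.add("/" + "/".join(stack))
        else:
            stack.pop()
            elem.clear()  # drop content we have already walked past
    return paths


# Hypothetical stand-ins for the working file and the problem file.
good = io.BytesIO(b"<root><header><id>1</id></header></root>")
bad = io.BytesIO(b"<root><wrapper><header><id>1</id></header></wrapper></root>")

good_paths = element_paths(good)
bad_paths = element_paths(bad)

only_in_good = sorted(good_paths - bad_paths)
only_in_bad = sorted(bad_paths - good_paths)
print("only in working file:", only_in_good)
print("only in problem file:", only_in_bad)
```

For real files, pass the file names (or open file objects) to
element_paths instead of the BytesIO objects.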
From: Gerstle, John (CDC/OID/NCHHSTP) [mailto:email@example.com]
Sent: Wednesday, October 06, 2010 8:15 AM
To: Alan Churchill; SAS-L@LISTSERV.UGA.EDU
Subject: RE: Resubmit: Reading XML files via XML92 getting 0 observation
I have XMLSpy (and DiffDog) and have tried looking for XML code issues but
haven't found anything definitive. The problem file is over 280k lines, so
it is not easy to eyeball. I compared it with a smaller XML file that SAS
has no issue reading and haven't found anything apart from what looks like
some deeply nested (child-of-child-of-child) nodes that are not aligned,
but that could be data driven (some clients have the data and some do not).
SAX vs. DOM: could you define these terms?
Scientific Information Specialist
Centers for Disease Control and Prevention NCHHSTP\DHAP-SE\QSDMB\Data
Email: yzg9 at cdc dot gov
Socrates, proclaimed: "I came to know one thing; that I know nothing".
"Every question I answer will simply lead to another question."
>>[mailto:firstname.lastname@example.org] On Behalf Of Alan Churchill
>>Sent: Tuesday, October 05, 2010 6:13 PM
>>Subject: RE: Resubmit: Reading XML files via XML92 getting 0
>>Look at SAX vs. DOM on why access is limited. It depends on the engine
>>It is hard to guess as to what is happening w/o seeing the XML in
>>Have you opened up the files in something like XmlSpy to look for
>>From: Gerstle, John (CDC/OID/NCHHSTP) [mailto:yzg9@CDC.GOV]
>>Sent: Tuesday, October 05, 2010 9:26 AM
>>Subject: Resubmit: Reading XML files via XML92 getting 0 observation
>>SAS v9.22, WinXP, XML Mapper
>>I've manually created a map file from a complex schema and am using the
>>XML92 engine to read in the XML data files. I have successfully tested
>>this method on 3 XML files, 1 of which is close to 450MB in size.
>>Recently, I received a new sample file (only 14MB) and now it's
>>failing (well, it's failing in the sense that no data observations are
>>being read by SAS). Interestingly, within XML Mapper, I can use the
>>Table View tab to see the data, correctly mapped. But Base SAS is
>>unable to replicate this. Even SAS Explorer is unable to open any
>>'tables' to view.
>>libname incoming xml92 "&xml_file"
>>proc print data=incoming.x_headerinfo; run;
>>...where the x_headerinfo is the first node of data in the file.
>>NOTE: Processing XMLMap version 1.9.
>>NOTE: Libref INCOMING was successfully assigned as follows:
>> Engine: XML92
>> Physical Name: W:\Data_Management\test.xml
>>2111 proc print data=incoming.x_headerinfo; run;
>>NOTE: Access by observation number not available. Observation numbers
>>will be counted by PROC PRINT.
>>NOTE: No observations in data set INCOMING.x_headerinfo.
>>NOTE: There were 0 observations read from the data set
>>I've added an End Path for the table, which is the same as the Path,
>>set as End. And added an automatic enumerator to the table. No luck
>>on the Base SAS side but I see correct mapping in the Table View of XML
>>I've been researching this problem for the past 2 weeks and have read
>>several really good papers on the subject (Larry Hoyle's recent papers
>>and Lex Jensen's workshop at SGF2010), but haven't found reference to
>>this specific problem.
>>I feel that I've missed something in my map, though the map does work
>>for the other data files, so it's possible that the data file in
>>question is problematic.
>>1) What are the reasons why Base SAS is unable to achieve access by
>>observation number in an XML file? (something to do with Sequential
>>Reading of the file instead of Random reading?)
>>2) Any references to suggest?
>>3) Any suggestions for the above problem?
>>I'm considering having the sender re-create their XML file. The only
>>thing I can find in their file that might be problematic is that the
>>order of nodes is not the same as one of the other test files that does