|
All the XML files were created and validated against the same schema file.
Thanks Alan.
John Gerstle
Scientific Information Specialist
Centers for Disease Control and Prevention
NCHHSTP\DHAP-SE\QSDMB\Data Management Team
Phone: 404-639-3980
Fax: 404-639-8642
Email: yzg9 at cdc dot gov
Socrates, proclaimed: "I came to know one thing; that I know nothing".
"Every question I answer will simply lead to another question."
>>-----Original Message-----
>>From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On
>>Behalf Of Alan Churchill
>>Sent: Wednesday, October 06, 2010 11:44 AM
>>To: SAS-L@LISTSERV.UGA.EDU
>>Subject: RE: Resubmit: Reading XML files via XML92 getting 0 observation
>>datasets
>>
>>John,
>>
>>Some SAX vs. DOM explanation:
>>
>>http://www.cs.nmsu.edu/~epontell/courses/XML/material/xmlparsers.html#q1
>>
>>Do both XML files adhere to the same schema?
>>
>>I have not worked with SAS XML Maps in a while so just trying to help out on
>>the XML bit for now.
>>
>>Alan
>>
>>Alan Churchill
>>Savian
>>Work: 719-687-5954
>>Cell: 719-310-4870
>>
>>
>>-----Original Message-----
>>From: Gerstle, John (CDC/OID/NCHHSTP) [mailto:yzg9@cdc.gov]
>>Sent: Wednesday, October 06, 2010 8:15 AM
>>To: Alan Churchill; SAS-L@LISTSERV.UGA.EDU
>>Subject: RE: Resubmit: Reading XML files via XML92 getting 0 observation
>>datasets
>>
>>Alan,
>>I have XMLSpy (and DiffDog) and have looked for XML code issues but
>>haven't found anything definitive. The problem file is over 280k lines, so
>>it is not easy to eyeball. I compared it with a smaller XML file that SAS
>>has no issue reading and haven't found anything besides what looks like
>>some child-child-child nodes not aligned, but that could be data driven
>>(some clients have the data and some do not).
>>
>>SAX vs. DOM - could you define these terms?
>>
>>Thanks
>>
>>John Gerstle
>>
>>
>>>>-----Original Message-----
>>>>From: owner-sas-l@listserv.uga.edu
>>>>[mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Alan Churchill
>>>>Sent: Tuesday, October 05, 2010 6:13 PM
>>>>To: SAS-L@LISTSERV.UGA.EDU
>>>>Subject: RE: Resubmit: Reading XML files via XML92 getting 0
>>>>observation datasets
>>>>
>>>>John,
>>>>
>>>>Look at SAX vs. DOM for why access is limited. It depends on the engine
>>>>chosen.
>>>>
>>>>It is hard to guess what is happening w/o seeing the XML in question.
>>>>Have you opened up the files in something like XmlSpy to look for
>>>>differences?
>>>>
>>>>Alan
>>>>
>>>>
>>>>-----Original Message-----
>>>>From: Gerstle, John (CDC/OID/NCHHSTP) [mailto:yzg9@CDC.GOV]
>>>>Sent: Tuesday, October 05, 2010 9:26 AM
>>>>Subject: Resubmit: Reading XML files via XML92 getting 0 observation
>>>>datasets
>>>>
>>>>SAS v9.22, WinXP, XML Mapper
>>>>
>>>>I've manually created a map file from a complex schema and am using the
>>>>XML92 engine to read in the XML data files. I have successfully tested
>>>>this method on 3 XML files, 1 of which is close to 450MB in size.
>>>>Recently, I received a new sample file (only 14MB) and now it's
>>>>failing (well, it's failing in the sense that no data observations are
>>>>being read by SAS). Interestingly, within XML Mapper, I can use the
>>>>Table View tab to see the data, correctly mapped. But Base SAS is
>>>>unable to replicate this. Even SAS Explorer is unable to open any
>>>>'tables' to view.
>>>>
>>>>Code:
>>>>
>>>>libname incoming xml92 "&xml_file"
>>>> xmlmap="&xml_map"
>>>> xmlschema="&xml_schema"
>>>> xmltype=xmlmap
>>>> xmlmeta=schemadata;
>>>>proc print data=incoming.x_headerinfo; run;
>>>>
>>>>...where the x_headerinfo is the first node of data in the file.
>>>>
>>>>Log:
>>>>NOTE: Processing XMLMap version 1.9.
>>>>NOTE: Libref INCOMING was successfully assigned as follows:
>>>> Engine: XML92
>>>> Physical Name: W:\Data_Management\test.xml
>>>>2111 proc print data=incoming.x_headerinfo; run;
>>>>
>>>>NOTE: Access by observation number not available. Observation numbers
>>>>will be counted by PROC PRINT.
>>>>NOTE: No observations in data set INCOMING.x_headerinfo.
>>>>NOTE: There were 0 observations read from the data set
>>>>INCOMING.x_headerinfo.
>>>>
>>>>
>>>>I've added an End Path for the table, which is the same as the Path,
>>>>set as End, and added an automatic enumerator to the table. No luck
>>>>on the Base SAS side, but I see correct mapping in the Table View of
>>>>XML Mapper.
>>>>
>>>>I've been researching this problem for the past 2 weeks and have read
>>>>several really good papers on the subject (Larry Hoyle's recent papers
>>>>and Lex Jansen's workshop at SGF 2010), but haven't found reference to
>>>>this specific problem.
>>>>
>>>>I feel that I've missed something in my map, though the map does work
>>>>for the other data files, so it's possible that the data file in
>>>>question is problematic.
>>>>
>>>>3 Questions:
>>>>1) What are the reasons why Base SAS is unable to achieve access by
>>>>observation number in an XML file? (something to do with Sequential
>>>>Reading of the file instead of Random reading?)
>>>>2) Any references to suggest?
>>>>3) Any suggestions for the above problem?
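
On question 1, the note seems to follow from the XML engine reading the file in a single sequential pass, which is the SAX-style streaming mentioned elsewhere in this thread, so there is no way to jump straight to observation n. A minimal sketch of one way around that, assuming the INCOMING libref from the LIBNAME statement above is already assigned, is to copy the mapped table into a Base SAS data set, which does support direct access. WORK.HEADER_COPY is a hypothetical name used only for illustration, and the sketch does not address why zero observations are read.

```sas
/* Sketch only: assumes the INCOMING libref from the original post is  */
/* already assigned. WORK.HEADER_COPY is a hypothetical data set name. */

/* One sequential pass copies the mapped table into a Base SAS data set. */
data work.header_copy;
   set incoming.x_headerinfo;
run;

/* The Base engine supports direct access, so POINT= works on the copy. */
data _null_;
   if nobs > 0 then do;
      obsnum = nobs;   /* NOBS= is set at compile time, before SET runs */
      set work.header_copy point=obsnum nobs=nobs;
      put "Last observation read directly: " obsnum=;
   end;
   else put "The copy has no observations.";
   stop;               /* required with POINT= to end the step */
run;
```

If the copy also comes back empty, the direct-access note is only a side effect of the engine type, and the map paths or the data file itself remain the more likely suspects.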
>>>>
>>>>I'm considering having the sender re-create their XML file. The only
>>>>thing I can find in their file that might be problematic is that the
>>>>order of nodes is not the same as in one of the other test files that
>>>>does work.
>>>>
>>>>
>>>>John Gerstle
|