From: Nathaniel Wooding on 27 Feb 2010 10:25

Wendi

I once tried the mapper and did not get very far, but you may have better luck.

It looks like there are a number of potential observations here that start with <GRResolvedRequest ReqID= and continue through </GRResolvedRequest>. I have not visually parsed the string carefully enough, but if you have a specific set of items enclosed within each of these pairs, you may be able to write an input statement like

Input a b c @@;

and SAS will read the successive sets of these variables.

Please tell us a little more about what you are trying to accomplish.

Nat Wooding

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Wendi Wright
Sent: Saturday, February 27, 2010 9:05 AM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Reading an XML parsed file

I need to read in an XML file that comes all in one string (one line). The string I am currently using has length 968,273 and could easily be longer. We will be receiving these strings from an MQ server on our mainframe - we are fetching them to the PC and this is how they appear. The example below (with three records) is only length 3830.

I am wondering what would be the best way to read this in. I want to use only a single data step (at most) and have one record per item (there may be multiple items per GRResolvedRequest - see <ItemLst> that repeats). I have not used the XML Mapper before; is this a good option?
Just imagine the following all on one line: <GRResolvedRequest ReqID="00036" ReqEndpoint="http://mcsdoas13.mhe.mhc:22411/ESMWebService/GRResolvedService" FinalRC="0"><ReqGrpLst><ReqGrpDet ReqGrpID="PCTRITE0036010000000000009"><ItemLst><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044" DelElementID="1208711" ItemRawRsp="480" DelItemID="01104879" ItemParsed="480" ItemValue=" 480.0000" ItemRWO="R" ItemStatus="S" GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045" DelElementID="1208713" ItemRawRsp=" " DelItemID="01104975" ItemParsed="" ItemValue=" .0000" ItemRWO="O" ItemStatus=" " GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044" DelElementID="1209372" ItemRawRsp="8" DelItemID="01105810" ItemParsed="8" ItemValue=" 8.0000" ItemRWO="R" ItemStatus="1" GRRuleIDs="2,3"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045" DelElementID="1209374" ItemRawRsp=" " DelItemID="01105812" ItemParsed="" ItemValue=" .0000" ItemRWO="O" ItemStatus=" " GRRuleIDs="2,3"></ItemDet></ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest><GRResolvedRequest ReqID="00037" ReqEndpoint="http://mcsdoas13.mhe.mhc:22411/ESMWebService/GRResolvedService" FinalRC="0"><ReqGrpLst><ReqGrpDet ReqGrpID="PCTRITE0037010000000000009"><ItemLst><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044" DelElementID="1208711" ItemRawRsp="480" DelItemID="01104879" ItemParsed="480" ItemValue=" 480.0000" ItemRWO="R" ItemStatus="S" GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045" DelElementID="1208713" ItemRawRsp="225" DelItemID="01104975" ItemParsed="225" ItemValue=" 225.0000" ItemRWO="R" ItemStatus="S" 
GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044" DelElementID="1209372" ItemRawRsp="8" DelItemID="01105810" ItemParsed="8" ItemValue=" 8.0000" ItemRWO="R" ItemStatus="1" GRRuleIDs="2,3"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045" DelElementID="1209374" ItemRawRsp="15" DelItemID="01105812" ItemParsed="15" ItemValue=" 15.0000" ItemRWO="R" ItemStatus="1" GRRuleIDs="2,3"></ItemDet></ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest><GRResolvedRequest ReqID="00038" ReqEndpoint="http://mcsdoas13.mhe.mhc:22411/ESMWebService/GRResolvedService" FinalRC="0"><ReqGrpLst><ReqGrpDet ReqGrpID="PCTRITE0038010000000000009"><ItemLst><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044" DelElementID="1208711" ItemRawRsp="480" DelItemID="01104879" ItemParsed="480" ItemValue=" 480.0000" ItemRWO="R" ItemStatus="S" GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045" DelElementID="1208713" ItemRawRsp="225" DelItemID="01104975" ItemParsed="225" ItemValue=" 225.0000" ItemRWO="R" ItemStatus="S" GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044" DelElementID="1209372" ItemRawRsp="8" DelItemID="01105810" ItemParsed="8" ItemValue=" 8.0000" ItemRWO="R" ItemStatus="1" GRRuleIDs="2,3"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045" DelElementID="1209374" ItemRawRsp="15" DelItemID="01105812" ItemParsed="15" ItemValue=" 15.0000" ItemRWO="R" ItemStatus="1" GRRuleIDs="2,3"></ItemDet></ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest>
From: Alan Churchill on 27 Feb 2010 11:57

Wendi,

This is an XML fragment. I tossed in a new element at the top (Records) and pulled it into XMLSpy with no issue. You have 3 records (parent elements) below and things appear to be well-formed.

For SAS, I think the only feasible choice you have is the XML Mapper. Supposedly, it is much better in SAS 9.2, though I haven't had a chance to work with it.

This is a simple exercise in my main language, C#, since it has very robust XML support. The records would probably break down in 1-3 lines of code, for example. One option would be to use a language such as C# or Java to handle the XML portion and pump it into SAS, but you may not be comfortable with that route. In that case, work with the XML Mapper if you are using SAS 9.2 or higher.

The bottom line is that you need to convert the hierarchical XML into a row/column description.

Alan

Alan Churchill
Savian
www.savian.net
Office: (719) 687-5954
Cell: (719) 310-4870
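Alan's two steps (add a root element, then turn the hierarchy into rows and columns) can be sketched outside SAS. What follows is only a minimal Python sketch and was not posted in the thread: Python stands in for the C# or Java route he mentions, the `fragment` string is a shortened stand-in for the real data, and the `flatten` helper and the `Records` root name are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the one-line fragment in the original post
# (most attributes omitted to keep the example small).
fragment = (
    '<GRResolvedRequest ReqID="00036" FinalRC="0"><ReqGrpLst>'
    '<ReqGrpDet ReqGrpID="PCTRITE0036010000000000009"><ItemLst>'
    '<ItemDet ItemNum="044" ItemRawRsp="480" ItemStatus="S"></ItemDet>'
    '<ItemDet ItemNum="045" ItemRawRsp=" " ItemStatus=" "></ItemDet>'
    '</ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest>'
    '<GRResolvedRequest ReqID="00037" FinalRC="0"><ReqGrpLst>'
    '<ReqGrpDet ReqGrpID="PCTRITE0037010000000000009"><ItemLst>'
    '<ItemDet ItemNum="044" ItemRawRsp="8" ItemStatus="1"></ItemDet>'
    '</ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest>'
)


def flatten(fragment_text):
    """Return one dict per ItemDet, carrying the parent attributes along."""
    # Several top-level elements do not form a document, so wrap them
    # in a single root element first ("Records" is an arbitrary name).
    root = ET.fromstring("<Records>" + fragment_text + "</Records>")
    rows = []
    for request in root.iter("GRResolvedRequest"):
        for group in request.iter("ReqGrpDet"):
            for item in group.iter("ItemDet"):
                row = dict(request.attrib)   # ReqID, FinalRC, ...
                row.update(group.attrib)     # ReqGrpID
                row.update(item.attrib)      # item-level attributes
                rows.append(row)
    return rows


rows = flatten(fragment)
for row in rows:
    print(row)
```

Each resulting dict is one "record per item" in the sense of the original question; writing the rows out as a delimited file that SAS can read is a separate step not shown here.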
From: "Hoyle, Larry" on 1 Mar 2010 14:44

There are many tools for reading XML files, but don't sell the SAS XML Mapper short.

As Alan mentioned, to make your example valid XML you should have something like

<?xml version="1.0" encoding="UTF-8"?>
<records>
<GRResolvedRequest ReqID="00036" ReqEndpoint= ..... the rest of your example follows
</records>

If you open this XML file in XML Mapper, you can choose Tools... Automap using XML, and XML Mapper will create a complete relational structure from the hierarchical XML structure.

I'll be doing a paper on this at SAS Global Forum; for an early peek see http://www.ipsr.ku.edu/ksdata/sashttp/SGF2010/

Larry Hoyle
Associate Scientist
Institute for Policy & Social Research, University of Kansas
1541 Lilac Lane, Blake 607
Lawrence, KS 66045-3129
http://www.ipsr.ku.edu

> -----Original Message-----
>
> Date: Mon, 1 Mar 2010 09:25:03 -0700
> From: Alan Churchill <alan.churchill(a)SAVIAN.NET>
> Subject: Re: Reading an XML parsed file
>
> And, as another possibility, you can use C# to work with XML as well. C# has
> a wonderful feature called LINQ which allows you to use SQL-like constructs
> for your XML parsing.
>
> using System.Linq;
> using System.Xml.Linq;
>
> XDocument xd = XDocument.Load(@"c:\temp\mydoc.xml");
>
> // Descendants (rather than Elements) finds Element1 at any depth
> var recs = (from p in xd.Descendants("Element1")
>             select new
>             {
>                 myAttr1 = p.Attribute("attribute1").Value,
>                 myAttr2 = p.Attribute("attribute2").Value
>             });
>
> When run, recs would contain all elements of type Element1 and their 2
> attributes. It can get a lot more complex than the above, but it can reach as
> deep into the XML as desired.
>
> It depends on what you are comfortable with and the platform you are running
> on.
>
> Alan
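The wrapping step Larry describes (an XML declaration plus a single enclosing element) can be done mechanically before the file is opened in XML Mapper. The lines below are a rough Python sketch under stated assumptions, not part of his post: the `wrap_fragment` helper and the file names are hypothetical, and the fragment is assumed to be UTF-8 text with no declaration of its own.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def wrap_fragment(src_path, dst_path, root_tag="records"):
    """Copy src_path to dst_path, adding an XML declaration and a root element."""
    # Assumption: the fragment is UTF-8 text without its own declaration.
    with open(src_path, encoding="utf-8") as src:
        fragment = src.read().strip()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        dst.write("<%s>\n%s\n</%s>\n" % (root_tag, fragment, root_tag))


# Demonstration on a tiny stand-in fragment in a temporary directory.
workdir = tempfile.mkdtemp()
src_file = os.path.join(workdir, "fragment.xml")          # hypothetical name
dst_file = os.path.join(workdir, "fragment_wrapped.xml")  # hypothetical name
with open(src_file, "w", encoding="utf-8") as f:
    f.write('<GRResolvedRequest ReqID="00036"></GRResolvedRequest>'
            '<GRResolvedRequest ReqID="00037"></GRResolvedRequest>')

wrap_fragment(src_file, dst_file)

# Parsing the result confirms that it is now a well-formed document.
tree = ET.parse(dst_file)
req_ids = [element.get("ReqID") for element in tree.getroot()]
print(req_ids)
```

Reading the whole file into memory is a deliberate simplification; at roughly a million characters per string, as described in the original question, that should not be a concern.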
From: Alan Churchill on 1 Mar 2010 16:00

Larry,

See you at your presentation. I am curious about the changes in XML Mapper under 9.2 and have been awaiting it.

Alan
From: Chang Chung on 1 Mar 2010 17:13

Hi, Wendi:

It will depend on the platform, but at least on Windows this is very easy. You can specify a very large number for the lrecl= option of the infile statement (up to 1G). So why not just read the one-line input data and separate it into shorter lines using delimiters?

You say it is an XML document. If so, a good candidate for the line delimiter is the bracket characters ("<" or ">"). This is because a well-formed XML doc should have already quoted them to "&lt;" and "&gt;" except in the tags.

Suppose that our input file is wendi.xml; then, with appropriate lengths here and there, something like the following will do the basics. Unquoting and converting values are left as an exercise to the interested readers. The code below ran on SAS 9.2 (TS1M0) on Windows.

HTH.
Cheers,
Chang

%let pwd = %sysfunc(pathname(WORK));
%put pwd=&pwd.;
x cd &pwd.;

filename wendi "wendi.xml";
filename lined "lined.txt";

/* read the input line with a very big buffer and
   parse each node out to its own line using dlm=,
   assuming that each node is at most 500 chars. */
data _null_;
  infile wendi lrecl=1000000 dlm="<>";
  file lined lrecl=500;
  input c :$500. @@;
  put c;
run;

/* read back into a data set, assuming this
   regular hierarchical structure remains. */
data wendi;
  length ReqId ReqEndpoint FinalRC ReqGrpID
         DocComCD TstID TstFmCD TstLvlCD ItemNum DelElementID
         ItemRawRsp DelItemID ItemParsed ItemValue ItemRWO
         ItemStatus GRRuleIDs $200;
  retain _all_;
  infile lined lrecl=500;
  drop node;
  input node :$50. @;
  select(node);
    when("GRResolvedRequest") input (ReqId--FinalRC)(=);
    when("ReqGrpDet") input ReqGrpID=;
    when("ItemDet") do;
      input (DocComCD--GRRuleIDs) (=);
      output;
    end;
    otherwise;
  end;
run;

/* clean up */
filename wendi clear;
filename lined clear;

/* check */
proc print data=wendi;
run;
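For readers who want to see the same idea outside SAS, here is a rough Python sketch of Chang's technique, including the "unquoting" he leaves as an exercise. It is an illustration under stated assumptions rather than a translation of his program: the `rows_from_line` helper, the `ATTR` pattern, and the sample line are made up for the example, attribute values are assumed to be double-quoted, and, as in the SAS version, no attribute value may contain a raw "<" or ">".

```python
import html
import re

# name="value" pairs; assumes double-quoted attribute values.
ATTR = re.compile(r'([\w:.-]+)\s*=\s*"([^"]*)"')


def rows_from_line(line):
    """Split on '<' and '>' and emit one dict per ItemDet node."""
    retained = {}   # plays the role of RETAIN in the DATA step
    rows = []
    for node in re.split(r"[<>]", line):
        node = node.strip()
        if not node or node.startswith("/"):
            continue    # skip empty pieces and closing tags
        name = node.split(None, 1)[0]
        # html.unescape turns entities such as &amp; back into characters.
        attrs = {key: html.unescape(value) for key, value in ATTR.findall(node)}
        if name in ("GRResolvedRequest", "ReqGrpDet"):
            retained.update(attrs)
        elif name == "ItemDet":
            rows.append({**retained, **attrs})
    return rows


# Made-up sample line with the same nesting as the data in the thread.
line = (
    '<GRResolvedRequest ReqID="00036" FinalRC="0"><ReqGrpLst>'
    '<ReqGrpDet ReqGrpID="PCTRITE0036010000000000009"><ItemLst>'
    '<ItemDet ItemNum="044" TstID="A &amp; B"></ItemDet>'
    '<ItemDet ItemNum="045" TstID="C"></ItemDet>'
    '</ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest>'
)
rows = rows_from_line(line)
for row in rows:
    print(row)
```

One caveat applies to both versions: XML forbids a raw "<" inside attribute values but does allow a raw ">", so splitting on both brackets relies on the sender escaping ">" as well.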