From: Nathaniel Wooding on
Wendi

I once tried the mapper and did not get very far, but you may have better luck. It looks like there are a number of potential observations here that start with <GRResolvedRequest ReqID= and continue through </GRResolvedRequest>.

I have not visually parsed the string carefully enough, but if you have a specific set of items that are enclosed within each of these pairs, you may be able to write an INPUT statement like

Input a b c @@;

and SAS will read the successive sets of these variables, because the trailing @@ holds the input line across iterations of the DATA step.

Please tell us a little more about what you are trying to accomplish.

Nat Wooding

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Wendi Wright
Sent: Saturday, February 27, 2010 9:05 AM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Reading an XML parsed file

I need to read in an XML file that comes all in one string (one line). The
string I am currently using has length 968,273 and could easily be
longer. We will be receiving these strings from an MQ server on our
mainframe - we are fetching them to the PC and this is how they appear. The
example below (with three records) is only length 3830. I am wondering what
would be the best way to read this in. I want to use only a single data
step (at most) and have one record per item (there may be multiple items per
GRResolvedRequest - see <ItemLst> that repeats). I have not used the XML
Mapper before. Is this a good option?

Just imagine the following all on one line:
<GRResolvedRequest ReqID="00036"
ReqEndpoint="http://mcsdoas13.mhe.mhc:22411/ESMWebService/GRResolvedService"
FinalRC="0"><ReqGrpLst><ReqGrpDet
ReqGrpID="PCTRITE0036010000000000009"><ItemLst><ItemDet DocComCD="2152701"
TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044"
DelElementID="1208711" ItemRawRsp="480" DelItemID="01104879"
ItemParsed="480" ItemValue=" 480.0000" ItemRWO="R" ItemStatus="S"
GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701"
TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045"
DelElementID="1208713" ItemRawRsp=" " DelItemID="01104975" ItemParsed=""
ItemValue=" .0000" ItemRWO="O" ItemStatus=" "
GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701"
TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044"
DelElementID="1209372" ItemRawRsp="8" DelItemID="01105810" ItemParsed="8"
ItemValue=" 8.0000" ItemRWO="R" ItemStatus="1"
GRRuleIDs="2,3"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West
Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045" DelElementID="1209374"
ItemRawRsp=" " DelItemID="01105812" ItemParsed="" ItemValue=" .0000"
ItemRWO="O" ItemStatus=" "
GRRuleIDs="2,3"></ItemDet></ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest><GRResolvedRequest
ReqID="00037"
ReqEndpoint="http://mcsdoas13.mhe.mhc:22411/ESMWebService/GRResolvedService"
FinalRC="0"><ReqGrpLst><ReqGrpDet
ReqGrpID="PCTRITE0037010000000000009"><ItemLst><ItemDet DocComCD="2152701"
TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044"
DelElementID="1208711" ItemRawRsp="480" DelItemID="01104879"
ItemParsed="480" ItemValue=" 480.0000" ItemRWO="R" ItemStatus="S"
GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701"
TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045"
DelElementID="1208713" ItemRawRsp="225" DelItemID="01104975"
ItemParsed="225" ItemValue=" 225.0000" ItemRWO="R" ItemStatus="S"
GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701"
TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044"
DelElementID="1209372" ItemRawRsp="8" DelItemID="01105810" ItemParsed="8"
ItemValue=" 8.0000" ItemRWO="R" ItemStatus="1"
GRRuleIDs="2,3"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West
Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045" DelElementID="1209374"
ItemRawRsp="15" DelItemID="01105812" ItemParsed="15" ItemValue=" 15.0000"
ItemRWO="R" ItemStatus="1"
GRRuleIDs="2,3"></ItemDet></ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest><GRResolvedRequest
ReqID="00038"
ReqEndpoint="http://mcsdoas13.mhe.mhc:22411/ESMWebService/GRResolvedService"
FinalRC="0"><ReqGrpLst><ReqGrpDet
ReqGrpID="PCTRITE0038010000000000009"><ItemLst><ItemDet DocComCD="2152701"
TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044"
DelElementID="1208711" ItemRawRsp="480" DelItemID="01104879"
ItemParsed="480" ItemValue=" 480.0000" ItemRWO="R" ItemStatus="S"
GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701"
TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045"
DelElementID="1208713" ItemRawRsp="225" DelItemID="01104975"
ItemParsed="225" ItemValue=" 225.0000" ItemRWO="R" ItemStatus="S"
GRRuleIDs="2,3,4,5,6,7,8,9,10,13"></ItemDet><ItemDet DocComCD="2152701"
TstID="CUSTOM-West Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="044"
DelElementID="1209372" ItemRawRsp="8" DelItemID="01105810" ItemParsed="8"
ItemValue=" 8.0000" ItemRWO="R" ItemStatus="1"
GRRuleIDs="2,3"></ItemDet><ItemDet DocComCD="2152701" TstID="CUSTOM-West
Virginia" TstFmCD="A" TstLvlCD="13" ItemNum="045" DelElementID="1209374"
ItemRawRsp="15" DelItemID="01105812" ItemParsed="15" ItemValue=" 15.0000"
ItemRWO="R" ItemStatus="1"
GRRuleIDs="2,3"></ItemDet></ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest>
From: Alan Churchill on
Wendi,

This is an XML fragment. I tossed in a new element at the top (Records) and
pulled it into XMLSpy with no issue. You have 3 records (parent elements)
below and things appear to be well-formed.

For SAS, I think the only feasible choice you have is the XML Mapper.
Supposedly, it is much better in SAS 9.2 though I haven't had a chance to
work with it.

This is a simple exercise in my main language, C#, since it has very robust
XML support. The records would probably break down in 1-3 lines of code, for
example. One option would be to use a language such as C# or Java to handle
the XML portion and pump it into SAS but you may not be comfortable with
that route. Work with the XML Mapper, in that case, if you are using SAS 9.2
or higher.

The bottom line is that you need to convert the hierarchical XML into a
row/column description.
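
To make that conversion concrete, here is a minimal sketch in Python (not SAS, and not something anyone in this thread ran). It assumes the fragment is wrapped in an invented root element called Records, as described above, and it uses the standard-library xml.etree.ElementTree parser. The function name flatten and the shortened sample string are illustrative only; the real data has many more attributes per element.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample: two top-level records, no single root.
FRAGMENT = (
    '<GRResolvedRequest ReqID="00036" FinalRC="0"><ReqGrpLst>'
    '<ReqGrpDet ReqGrpID="PCTRITE0036010000000000009"><ItemLst>'
    '<ItemDet ItemNum="044" ItemValue=" 480.0000"></ItemDet>'
    '<ItemDet ItemNum="045" ItemValue=" .0000"></ItemDet>'
    '</ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest>'
    '<GRResolvedRequest ReqID="00037" FinalRC="0"><ReqGrpLst>'
    '<ReqGrpDet ReqGrpID="PCTRITE0037010000000000009"><ItemLst>'
    '<ItemDet ItemNum="044" ItemValue=" 480.0000"></ItemDet>'
    '</ItemLst></ReqGrpDet></ReqGrpLst></GRResolvedRequest>'
)


def flatten(fragment):
    """Return one dict per ItemDet, carrying the parent attributes along."""
    # Wrap the fragment in an invented root so the parser accepts it.
    root = ET.fromstring("<Records>" + fragment + "</Records>")
    rows = []
    for req in root.iter("GRResolvedRequest"):
        for grp in req.iter("ReqGrpDet"):
            for item in grp.iter("ItemDet"):
                row = {}
                row.update(req.attrib)   # request-level columns
                row.update(grp.attrib)   # group-level columns
                row.update(item.attrib)  # item-level columns
                rows.append(row)
    return rows


rows = flatten(FRAGMENT)
for row in rows:
    print(row["ReqID"], row["ReqGrpID"], row["ItemNum"], row["ItemValue"])
```

Each dict is one row with the request and group attributes repeated on every item, which is the one-record-per-item layout the original question asks for. Writing the rows out as delimited text for SAS to read would be a separate step.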

Alan

Alan Churchill
Savian
www.savian.net
Office: (719) 687-5954
Cell: (719) 310-4870


From: "Hoyle, Larry" on
There are many tools for reading XML files, but don't sell the SAS XML
Mapper short.

As Alan mentioned, to make your example well-formed XML you should have
something like

<?xml version="1.0" encoding="UTF-8"?>
<records>
<GRResolvedRequest ReqID="00036" ReqEndpoint= .....
the rest of your example follows

</records>


If you open this XML file in XML Mapper you can choose Tools... Automap
using XML and XML Mapper will create a complete relational structure
from the hierarchical XML structure.
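
If adding the declaration and root element by hand is awkward for a file this long, the wrapping step could be scripted. The following is only a sketch in Python under several assumptions: the raw file is UTF-8 text (data fetched from a mainframe may well use another encoding), the root tag name records is arbitrary, and the file names are made up for the demonstration. The function wrap_fragment is a hypothetical helper, not part of SAS or XML Mapper.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def wrap_fragment(src_path, dst_path, root_tag="records"):
    """Copy src_path to dst_path with an XML declaration and a root element."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        dst.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        dst.write("<%s>\n" % root_tag)
        # Copy in 1 MB chunks so a very long single line never has to
        # sit in memory all at once.
        for chunk in iter(lambda: src.read(1 << 20), ""):
            dst.write(chunk)
        dst.write("\n</%s>\n" % root_tag)


# Tiny demonstration with throwaway files (the names are hypothetical).
workdir = tempfile.mkdtemp()
raw = os.path.join(workdir, "wendi_raw.xml")
wrapped = os.path.join(workdir, "wendi_wrapped.xml")
with open(raw, "w", encoding="utf-8") as f:
    f.write('<GRResolvedRequest ReqID="00036"></GRResolvedRequest>'
            '<GRResolvedRequest ReqID="00037"></GRResolvedRequest>')

wrap_fragment(raw, wrapped)
root = ET.parse(wrapped).getroot()  # raises ParseError if not well-formed
print(root.tag, len(root))
```

Parsing the result once, as the last lines do, is a cheap check that the wrapped file is well-formed before it is opened in XML Mapper.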


I'll be doing a paper on this at SAS Global Forum, for an early peek see
http://www.ipsr.ku.edu/ksdata/sashttp/SGF2010/



Larry Hoyle
Associate Scientist
Institute for Policy & Social Research, University of Kansas
1541 Lilac Lane, Blake 607
Lawrence, KS 66045-3129

http://www.ipsr.ku.edu

> -----Original Message-----
>
> Date: Mon, 1 Mar 2010 09:25:03 -0700
> From: Alan Churchill <alan.churchill(a)SAVIAN.NET>
> Subject: Re: Reading an XML parsed file
>
> And, as another possibility, you can use C# to work with XML as well. C# has
> a wonderful feature called LINQ which allows you to use SQL-like constructs
> for your XML parsing.
>
> // requires: using System.Linq; using System.Xml.Linq;
> XDocument xd = XDocument.Load(@"c:\temp\mydoc.xml");
>
> // Descendants() finds Element1 at any depth; Elements() would only
> // match direct children of the document (that is, the root element).
> var recs = (from p in xd.Descendants("Element1")
>             select new
>             {
>                 myAttr1 = p.Attribute("attribute1").Value,
>                 myAttr2 = p.Attribute("attribute2").Value
>             });
>
> When run, recs would contain all elements of type Element1 and their 2
> attributes. It can get a lot more complex than the above but it can reach as
> deep into the XML as desired.
>
> It depends on what you are comfortable with and the platform you are running
> on.
>
> Alan
>
> Alan Churchill
**********************************
From: Alan Churchill on
Larry,

See you at your presentation. I am curious about the changes in XML Mapper
under 9.2 and have been awaiting it.

Alan

Alan Churchill
Savian
www.savian.net
Office: (719) 687-5954
Cell: (719) 310-4870


**********************************
From: Chang Chung on

Hi, Wendi:

It will depend on the platform, but at least on Windows this is very
easy.

You can specify a very large number for the lrecl= option of the infile
statement (up to 1G). So why not just read the one-line input and
separate it into shorter lines using delimiters?

You say it is an XML document. If so, then good candidates for the
line delimiter are the bracket characters ("<" and ">"). This is because a
well-formed XML doc must escape "<" as "&lt;" everywhere except in the tags
themselves, and writers normally escape ">" as "&gt;" too (a literal ">" is
technically allowed in attribute values, so check your data).
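
As a quick illustration of why the brackets are usually safe delimiters, here is a small Python sketch. It is not part of the SAS solution. The attribute name Note and its value are invented, and quoteattr() from the standard library stands in for whatever program produced the real file, which is an assumption about how that sender escapes text.

```python
import re
from xml.sax.saxutils import quoteattr

# quoteattr() escapes &, < and > the way an XML writer typically would.
value = quoteattr("score < 5 & level > 2")
line = "<ItemLst><ItemDet Note=%s></ItemDet></ItemLst>" % value

# Split on runs of "<" or ">" and drop the empty pieces at either end.
nodes = [piece for piece in re.split(r"[<>]+", line) if piece]
for node in nodes:
    print(node)
```

Because the brackets inside the value were escaped, the split yields exactly one piece per tag. If the sender left a literal ">" inside an attribute value, that node would be cut in two, so it is worth scanning a real file for that case first.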

Suppose that our input file is wendi.xml; then, with appropriate lengths here
and there, something like the following will do the basics. Unquoting and
converting values are left as an exercise for the interested reader.

The code below ran on SAS 9.2 (TS1M0) on Windows. HTH.

Cheers,
Chang

/* work in the WORK directory; wendi.xml is expected there */
%let pwd = %sysfunc(pathname(WORK));
%put pwd=&pwd.;
x cd &pwd.;

filename wendi "wendi.xml";
filename lined "lined.txt";

/* read the input line with a very big buffer and
   parse each node out to its own line using dlm=,
   assuming that each node is at most 500 chars.
   lrecl= must be at least as long as the input line;
   raise it if the string grows past 1,000,000 chars. */
data _null_;
  infile wendi lrecl=1000000 dlm="<>";
  file lined lrecl=500;
  input c :$500. @@;
  put c;
run;

/* read back into a data set, assuming this
   regular hierarchical structure remains. */
data wendi;
  length ReqId ReqEndpoint FinalRC
         ReqGrpID
         DocComCD TstID TstFmCD TstLvlCD ItemNum
         DelElementID ItemRawRsp DelItemID ItemParsed
         ItemValue ItemRWO ItemStatus GRRuleIDs $200;
  /* parent attributes are retained and repeated on every item row */
  retain _all_;
  infile lined lrecl=500;
  drop node;
  /* read the node name, then hold the line for the named input below */
  input node :$50. @;
  select(node);
    when("GRResolvedRequest")
      input (ReqId--FinalRC)(=);
    when("ReqGrpDet")
      input ReqGrpID=;
    when("ItemDet") do;
      input (DocComCD--GRRuleIDs) (=);
      output;  /* one observation per item */
    end;
    otherwise;  /* closing tags and list wrappers carry no data */
  end;
run;

/* clean up */
filename wendi clear;
filename lined clear;

/* check */
proc print data=wendi;
run;
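
For the unquoting and converting step that was left as an exercise, the idea can be sketched outside SAS. The following is a rough Python illustration, not a translation of the code above. It assumes every attribute value is double-quoted with no embedded double quotes, as in the sample data. The helper names parse_node and to_number are invented, and treating a blank value as missing is a choice made here, not something the thread specifies.

```python
import re
from xml.sax.saxutils import unescape

# name="value" pairs; values are assumed to be double-quoted with no
# embedded double quotes, as in the sample data.
ATTR = re.compile(r'(\w+)="([^"]*)"')


def parse_node(node):
    """Split one node line into (tag, {attribute: unquoted value})."""
    tag, _, rest = node.partition(" ")
    attrs = {name: unescape(value, {"&quot;": '"', "&apos;": "'"})
             for name, value in ATTR.findall(rest)}
    return tag, attrs


def to_number(text):
    """Convert text such as ' 480.0000' to float; blank becomes None."""
    text = text.strip()
    return float(text) if text else None


node = ('ItemDet DocComCD="2152701" TstID="CUSTOM-West Virginia" '
        'ItemRawRsp=" " ItemValue=" 480.0000" GRRuleIDs="2,3"')
tag, attrs = parse_node(node)
print(tag, attrs["TstID"], to_number(attrs["ItemValue"]),
      to_number(attrs["ItemRawRsp"]))
```

The regular expression keeps values that contain blanks or commas (TstID, GRRuleIDs) in one piece, and the entity replacement runs only after the quotes have been stripped, so an escaped quote inside a value cannot end it early.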