Prev: interface name file
Next: Setting up a XDebug debugging environment for PHP / WAMP / Eclipse PDT
From: Marc Guay on 8 Jul 2010 11:06

> I wonder what the difference is between doing "new
> SimpleXMLElement" and calling simplexml_load_string which results in the
> libxml_use_internal_errors call being ineffective. Odd.

The documentation for "Dealing with XML errors" only mentions simplexml_load_string(), and this comment

http://ca3.php.net/manual/en/simplexml.examples-basic.php#93263

shows that you're not the first person to run into this.

Marc
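For anyone comparing the two entry points discussed above, here is a minimal sketch of how each one reports a parse failure once libxml_use_internal_errors(true) is set. The helper names collect_xml_errors() and try_constructor() are made up for illustration; the underlying calls are standard PHP. simplexml_load_string() returns false on failure, whereas the SimpleXMLElement constructor throws an Exception, so the two need different handling.

```php
<?php
// Hypothetical helper: parse with simplexml_load_string() and return the
// libxml errors as strings instead of letting PHP emit warnings.
function collect_xml_errors(string $source): array
{
    $previous = libxml_use_internal_errors(true); // buffer errors internally
    libxml_clear_errors();
    $messages = [];
    if (simplexml_load_string($source) === false) {
        foreach (libxml_get_errors() as $error) {
            $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
        }
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting
    return $messages;
}

// Hypothetical helper: the constructor signals failure with an exception
// rather than a false return value, so it needs a try/catch.
function try_constructor(string $source): ?SimpleXMLElement
{
    $previous = libxml_use_internal_errors(true);
    try {
        return new SimpleXMLElement($source);
    } catch (Exception $e) {
        return null;
    } finally {
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
}

// The unquoted attribute from this thread is not well-formed XML.
foreach (collect_xml_errors('<BODY vlink=#ffffff></BODY>') as $message) {
    echo $message, "\n";
}
var_dump(try_constructor('<BODY vlink=#ffffff></BODY>')); // NULL
```

The exact wording of the collected messages depends on the libxml version, so the sketch prints them rather than matching on them.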
From: "Gary ." on 8 Jul 2010 11:15

Okay. At least one of the problems with this so called HTML seems to be that the body tag looks like

<BODY vlink=#ffffff ...>

and xml_parse complains that "> required" on that line (i.e. it is claiming it can't find the end of the tag!).

I'm guessing that those attributes "must" be quoted in XML and "should" be in HTML (but patently aren't)? Is there any way to get xml_parse to ignore that? My element_handler functions never even get a chance to see that line.

Regex to insert quotes or remove the attributes entirely, perhaps? *gulp* I hope there's a better way than that.
From: Richard Quadling on 8 Jul 2010 11:44

On 8 July 2010 16:15, Gary . <php-general(a)garydjones.name> wrote:
> Okay. At least one of the problems with this so called HTML seems to
> be that the body tag looks like
> <BODY vlink=#ffffff ...>
> and xml_parse complains that "> required" on that line (i.e. it is
> claiming it can't find the end of the tag!).
>
> I'm guessing that those attributes "must" be quoted in XML and
> "should" be in HTML (but patently aren't)? Is there any way to get
> xml_parse to ignore that? My element_handler functions never even get
> a chance to see that line.
>
> Regex to insert quotes or remove the attributes entirely, perhaps?
> *gulp* I hope there's a better way than that.

So. Essentially, you want to parse some plain text which may or may not be well formed XML.

In short ... good luck.

How badly formed is the file going to be?

If it is things like missing ", then this could be managed with regex.

Essentially you are going to have to do the clean up that Tidy could do for you.
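As a rough illustration of the regex clean-up mentioned above, the following sketch wraps unquoted attribute values in double quotes before the text is handed to an XML parser. The function name quote_bare_attributes() is invented for this example, and the approach is a heuristic: it assumes no attribute value contains "<" or ">", and it fixes only this one kind of malformation.

```php
<?php
// Hypothetical helper, not from the thread: quote attribute values that
// were written without quotes, e.g. vlink=#ffffff -> vlink="#ffffff".
function quote_bare_attributes(string $html): string
{
    // Outer pass: visit each start tag (end tags begin with "</" and are skipped).
    return preg_replace_callback('/<[A-Za-z][^<>]*>/', function (array $tag): string {
        // Inner pass: visit each name=value pair. Already-quoted values are
        // matched whole, so text inside them is never rewritten.
        return preg_replace_callback(
            '/(\s[\w:-]+\s*=\s*)("[^"]*"|\'[^\']*\'|[^\s"\'>]+)/',
            function (array $attr): string {
                $value = $attr[2];
                $quoted = $value[0] === '"' || $value[0] === "'";
                return $attr[1] . ($quoted ? $value : '"' . $value . '"');
            },
            $tag[0]
        );
    }, $html);
}

echo quote_bare_attributes('<BODY vlink=#ffffff text="#000000">'), "\n";
// prints: <BODY vlink="#ffffff" text="#000000">
```

This only addresses missing quotes. Unclosed tags, stray ampersands, and similar problems would still stop xml_parse, which is why a dedicated repair tool such as Tidy is usually the more dependable route when it is available.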
From: Marc Guay on 8 Jul 2010 11:56

> And yes, I'd rather use DOM, but I can't.

Could you use this: http://simplehtmldom.sourceforge.net/?
From: Nisse Engström on 8 Jul 2010 12:50

On Thu, 8 Jul 2010 17:15:02 +0200, "Gary ." wrote:
> Okay. At least one of the problems with this so called HTML seems to
> be that the body tag looks like
> <BODY vlink=#ffffff ...>
> and xml_parse complains that "> required" on that line (i.e. it is
> claiming it can't find the end of the tag!).
>
> I'm guessing that those attributes "must" be quoted in XML and
> "should" be in HTML (but patently aren't)?

For that attribute value, it's a "must" in both cases. And for strict versions of (X)HTML, the attribute does not exist at all.

/Nisse