From: Mike Fowler on 7 Jul 2010 11:37

Peter Eisentraut wrote:
> On lör, 2010-07-03 at 09:26 +0100, Mike Fowler wrote:
>
>> What I will do instead is implement the xml_is_well_formed function
>> and get a patch out in the next day or two.
>
> That sounds very useful.

Here's the patch to add the 'xml_is_well_formed' function. Paraphrasing the SGML, the syntax is:

    xml_is_well_formed(text)

The function xml_is_well_formed evaluates whether the text is well-formed XML content, returning a boolean. I've done some tests (included in the patch) with tables containing a mixture of well-formed documents and content, and the function returns the expected result in each case. Combining it with IS (NOT) DOCUMENT works nicely for pulling out content or documents from a table of text.

Unless I missed something in the original correspondence, I think this patch will solve the issue.

Regards,

--
Mike Fowler
Registered Linux user: 379787
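The combination with IS (NOT) DOCUMENT mentioned above might look roughly like the following. This is only a sketch: the table notes and its columns are hypothetical, and it assumes a server built with libxml support that has the xml_is_well_formed(text) function from the patch. The CASE guard is there so that XMLPARSE only sees text that has already passed the check, since the evaluation order of plain AND conditions is not guaranteed.

```sql
-- Hypothetical table of free-form text; the names are illustrative only.
CREATE TEMP TABLE notes (id int, body text);

INSERT INTO notes VALUES
    (1, '<doc><title>one</title></doc>'),   -- a complete document
    (2, 'plain text <b>with</b> markup'),   -- content, but not a document
    (3, '<doc><title>broken</doc>');        -- not well formed

-- Keep only the rows whose text is well formed and is a full document.
SELECT id, body
FROM notes
WHERE CASE
          WHEN xml_is_well_formed(body)
          THEN (XMLPARSE(CONTENT body)) IS DOCUMENT
          ELSE false
      END;
```

Swapping IS DOCUMENT for IS NOT DOCUMENT in the same query would select the well-formed content fragments instead.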
From: Mike Fowler on 12 Jul 2010 08:07

Thom Brown wrote:
> Would a test for mismatched or undefined namespaces be necessary?
>
> For example:
>
> Mismatched namespace:
> <pg:foo xmlns:pg="http://postgresql.org/stuff">bar</my:foo>
>
> Undefined namespace when used in conjunction with IS DOCUMENT:
> <pg:foo xmlns:my="http://postgresql.org/stuff">bar</pg:foo>

Thanks for looking at my patch, Thom. I hadn't thought of that particular scenario, and even though I didn't specifically code for it, the underlying libxml call does correctly reject the mismatched namespace:

    template1=# SELECT xml_is_well_formed('<pg:foo xmlns:pg="http://postgresql.org/stuff">bar</my:foo>');
     xml_is_well_formed
    --------------------
     f
    (1 row)

In the attached patch I've added the example to the SGML documentation and the regression tests.

> Also, having a look at the following example from the patch:
> SELECT xml_is_well_formed('<local:data
> xmlns:local="http://127.0.0.1";><local:piece id="1">number
> one</local:piece><local:piece id="2" /></local:data>');
>  xml_is_well_formed
> --------------------
>  t
> (1 row)
>
> Just wondering about that semi-colon after the namespace definition.
>
> Thom

The semi-colon is not supposed to be there, and I'm not sure where it has come from. In Thunderbird I see the email with my patch as an attachment, and when I download and view the file there are no instances of a " followed by a ;. However, if I look at the message in the archive at http://archives.postgresql.org/message-id/4C3871C2.8000605(a)mlfowler.com I can see that every URL ending with a " has a ; following it. Should I be escaping the " in the patch file in some way, or is this just an artifact of HTML parsing a patch?

Regards,

--
Mike Fowler
Registered Linux user: 379787
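For anyone wanting to try both of Thom's namespace cases side by side, the queries could be run as below. This is a sketch that assumes the patched xml_is_well_formed(text) function is installed. Only the first result is reported in this thread; the third query is a control case added here for comparison, and no output is claimed for the second or third.

```sql
-- Mismatched prefix on the closing tag; reported above as returning f.
SELECT xml_is_well_formed('<pg:foo xmlns:pg="http://postgresql.org/stuff">bar</my:foo>');

-- Prefix pg is used but only my is declared; result not stated in the thread.
SELECT xml_is_well_formed('<pg:foo xmlns:my="http://postgresql.org/stuff">bar</pg:foo>');

-- Control case with a declared, matching prefix (not from the thread).
SELECT xml_is_well_formed('<pg:foo xmlns:pg="http://postgresql.org/stuff">bar</pg:foo>');
```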