From: Eric Bednarz on 20 Feb 2010 19:08

David Mark <dmark.cinsoft(a)gmail.com> writes:

> Stevo wrote:
>> David Mark wrote:
>>> Stevo wrote:
>>>> document.write("<h1>" + x + "</h1>");
>>>
>>> But if it is inline script, you need to escape the slash so the second
>>> string value isn't mistaken for a closing H1 tag.
>>>
>>> document.write("<h1>" + x + "<\/h1>");
>>
>> But it *is* a closing H1 tag ;-) No mistaking about it.
>
> No it most assuredly is not a closing H1 tag.

In SGMLese, which is what you seem to be worried about, it most
assuredly is (unless you just take issue with the nomenclature),
otherwise there would be no need to hide ETAGO.

minimal example document:

| <!DOCTYPE script PUBLIC "-//W3C//DTD HTML 4.01//EN">
| <script type='text/javascript'>
| document.write("<h1>" + x + "</h1>");
| </script>

parsing:

| -*- mode: compilation; default-directory: "~/Sites/sandbox/html/" -*-
| Compilation started at Sun Feb 21 00:56:29
|
| onsgmls -c /Volumes/Data/Emacs/SGML/catalog -s etago.html
| onsgmls:etago.html:3:33:E: end tag for element "H1" which is not open
                             ^^^^^^^^^^^^^^^^^^^^^^^^

There you go.
From: David Mark on 20 Feb 2010 19:49

Eric Bednarz wrote:
> David Mark <dmark.cinsoft(a)gmail.com> writes:
>
>> Stevo wrote:
>>> David Mark wrote:
>>>> Stevo wrote:
>>>>> document.write("<h1>" + x + "</h1>");
>>>> But if it is inline script, you need to escape the slash so the second
>>>> string value isn't mistaken for a closing H1 tag.
>>>>
>>>> document.write("<h1>" + x + "<\/h1>");
>>> But it *is* a closing H1 tag ;-) No mistaking about it.
>> No it most assuredly is not a closing H1 tag.
>
> In SGMLese, which is what you seem to be worried about, it most
> assuredly is (unless you just take issue with the nomenclature),
> otherwise there would be no need to hide ETAGO.

But that's my point. It will be _mistaken_ for a "real" closing tag (as
the validation services point out).

>
> minimal example document:
>
> | <!DOCTYPE script PUBLIC "-//W3C//DTD HTML 4.01//EN">
> | <script type='text/javascript'>
> | document.write("<h1>" + x + "</h1>");
> | </script>
>
> parsing:
>
> | -*- mode: compilation; default-directory: "~/Sites/sandbox/html/" -*-
> | Compilation started at Sun Feb 21 00:56:29
> |
> | onsgmls -c /Volumes/Data/Emacs/SGML/catalog -s etago.html
> | onsgmls:etago.html:3:33:E: end tag for element "H1" which is not open
>                              ^^^^^^^^^^^^^^^^^^^^^^^^
>
> There you go.
>

That's what I was saying. It's no good. Even if (most) browsers will
deal with it, I don't like assuming things about browsers and I like to
keep the validation clean so the more important issues are not obscured.
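The escaping discussed in the posts above can be sketched in JavaScript. This is only a minimal illustration: `hideEtago` is a hypothetical helper name that does not appear in the thread, and it assumes the script text is being prepared for inline embedding in an HTML 4.01 document. It relies on the fact that "\/" inside a JavaScript string literal is just "/", so the escaped form produces the same string at run time while no longer containing the "</" sequence in the markup.

```javascript
// Hypothetical helper (not from the thread): hide ETAGO ("</") in script
// text that will be placed inline between <script> and </script>.
function hideEtago(scriptSource) {
  // Replace every "</" with "<\/". Inside a JavaScript string literal,
  // "\/" is an escaped "/", so the program's behaviour does not change.
  return scriptSource.replace(/<\//g, "<\\/");
}

var unsafe = 'document.write("<h1>" + x + "</h1>");';
var safe = hideEtago(unsafe);

console.log(safe);                    // document.write("<h1>" + x + "<\/h1>");
console.log("<\/h1>" === "</h1>");    // true: both literals are the same string
```

The helper is a blunt text replacement, so it is only appropriate where every "</" in the source sits inside a string or regular expression literal, as in the `document.write` call quoted above.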
From: kangax on 20 Feb 2010 19:56

On 2/20/10 7:06 PM, Thomas 'PointedEars' Lahn wrote:
> kangax wrote:
>
>> On 2/20/10 4:49 PM, David Mark wrote:
>>> But if it is inline script, you need to escape the slash so the second
>>> string value isn't mistaken for a closing H1 tag.
>>>
>>> document.write("<h1>" + x +"<\/h1>");
>>
>> Yeah, one of those things that standard and de-facto standard disagree
>> on.
>
> How did you get the idea that invalid markup would be a de-facto standard?

The same way anything else would be considered a de-facto standard.

> A million flies can't be wrong?
>
>> I've tested the whole slew of browsers (ancient, mobile, desktop,
>> etc.) and none would close the script tag on discovery of ETAGO.
>
> You need to test more, and refine your tests.  The issue is known to occur
> with "<script ...>...</script>" in particular, but it has been observed on
> other occasions as well.

<script>...</script> is exactly what I've been testing. What more is
there to test if the purpose was to check if SCRIPT content is parsed
properly?

And which other occasions are you talking about? I haven't seen a client
that respects HTML 4.01 in this regard and closes SCRIPT element on
first occurrence of "</". Have you?

[...]

--
kangax
From: Thomas 'PointedEars' Lahn on 20 Feb 2010 21:33

kangax wrote:

> Thomas 'PointedEars' Lahn wrote:
>> kangax wrote:
>>> David Mark wrote:
>>>> But if it is inline script, you need to escape the slash so the second
>>>> string value isn't mistaken for a closing H1 tag.
>>>>
>>>> document.write("<h1>" + x +"<\/h1>");
>>> Yeah, one of those things that standard and de-facto standard disagree
>>> on.
>> How did you get the idea that invalid markup would be a de-facto
>> standard?
>
> The same way anything else would be considered a de-facto standard.

Then you are mistaken, because a de facto standard is something that is not
(yet) standardized, although regarded so common *and useful* that it is
widely accepted by the public, in particular by the professional community
that it concerns.  ("De facto" being Latin for "concerning the fact" or "in
practice".)  A synonym is "best current practice" (BCP).

But invalid markup does _not_ appear to be widely accepted, nor does it
appear to be considered best current practice.  In fact, there is the
strong recommendation to use Valid markup even though there is built-in
error correction (because while that feature has some informal
recommendations regarding it, there are no must-haves, and therefore it
cannot be relied on).

Do we not see this confirmed every time someone reports problems with a Web
site using invalid markup, and is being told by several rather knowledgeable
people to fix their markup first as the problem is likely going to go away
then?

>> A million flies can't be wrong?

You have evaded that part of the question as well.  A great number of
amateurs misusing the feature of built-in error correction, most of the
time without knowing it, does not make their doing any more a de facto
standard than any other of their mistakes.

>>> I've tested the whole slew of browsers (ancient, mobile, desktop,
>>> etc.) and none would close the script tag on discovery of ETAGO.
>> You need to test more, and refine your tests.  The issue is known to
>> occur with "<script ...>...</script>" in particular, but it has been
>> observed on other occasions as well.
>
> <script>...</script> is exactly what I've been testing. What more is
> there to test if the purpose was to check if SCRIPT content is parsed
> properly?

The SCRIPT element with a `type' attribute, and perhaps "nested" SCRIPT
elements.

> And which other occasions are you talking about? I haven't seen a client
> that respects HTML 4.01 in this regard and closes SCRIPT element on
> first occurrence of "</". Have you?

I am sure that the W3C Validator does, IOW not fixing this error makes
further validation of the document using this tool next to impossible.

I do not remember which browsers did this, but there must have been at
least one popular one among them or it would not have become such an issue
in the first place.  Probably the list would include W3C Amaya.  Lynx,
which is sometimes used by server administrators and as input for
screenreaders, is at least known to report invalid markup visibly (in the
status line), which would not look too good.  Very likely further
information can be found in the archives in postings containing the term
"ETAGO" or "End Tag Open delimiter".

By the way, that reminds me of a similar misconception I had found on your
Web site that I did not find time to mail you about yet (so I am doing it
here and now, lest I forget again):  You stated there something along the
lines that it would not matter that in XHTML the content of `script'
elements was not, where necessary, properly escaped or declared CDATA,
because the Content-Type `text/html' would not trigger an X(HT)ML parser
anyway.  However, first of all you cannot know for sure which parser is
being used, and second it matters for the W3C Validator and any other
markup validator because they MUST NOT care for the Content-Type of the
markup resource with regard to syntax except for the `charset' parameter.
IOW, the markup is still _not_ Valid then.  So by _not_ using Valid markup
there, you are shooting yourself in the foot there, too.

HTH

PointedEars
--
Danny Goodman's books are out of date and teach practices that are
positively harmful for cross-browser scripting.
  -- Richard Cornford, cljs, <cife6q$253$1$8300dec7(a)news.demon.co.uk> (2004)
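The "declared CDATA" option mentioned in the post above is commonly written with the CDATA markers hidden behind JavaScript line comments, so that an XML parser sees a CDATA section while an HTML parser sees ordinary script text. The following is a rough sketch only: `wrapScriptInCdata` is a hypothetical helper name, not something from the thread, and it assumes the wrapped text is JavaScript destined for a `script` element in an XHTML document.

```javascript
// Hypothetical helper (not from the thread): wrap inline script text in a
// CDATA section whose markers are commented out for the JavaScript engine.
function wrapScriptInCdata(scriptSource) {
  // A CDATA section ends at the first "]]>", so the content itself must
  // not contain that sequence.
  if (scriptSource.indexOf("]]>") !== -1) {
    throw new Error('script text contains "]]>" and cannot go in one CDATA section');
  }
  return "//<![CDATA[\n" + scriptSource + "\n//]]>";
}

console.log(wrapScriptInCdata('if (a < b && c) { run(); }'));
```

The wrapped text would then go between the `<script type="text/javascript">` start tag and its end tag. This addresses only XML well-formedness; it does nothing about "</" in a document parsed under HTML 4.01 rules.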
From: kangax on 21 Feb 2010 10:39

On 2/20/10 9:33 PM, Thomas 'PointedEars' Lahn wrote:
> kangax wrote:
>
>> Thomas 'PointedEars' Lahn wrote:
>>> kangax wrote:
>>>> David Mark wrote:
>>>>> But if it is inline script, you need to escape the slash so the second
>>>>> string value isn't mistaken for a closing H1 tag.
>>>>>
>>>>> document.write("<h1>" + x +"<\/h1>");
>>>> Yeah, one of those things that standard and de-facto standard disagree
>>>> on.
>>> How did you get the idea that invalid markup would be a de-facto
>>> standard?
>>
>> The same way anything else would be considered a de-facto standard.
>
> Then you are mistaken, because a de facto standard is something that is not
> (yet) standardized, although regarded so common *and useful* that it is
> widely accepted by the public, in particular by the professional community
> that it concerns.  ("De facto" being Latin for "concerning the fact" or "in
> practice".)  A synonym is "best current practice" (BCP).
>
> But invalid markup does _not_ appear to be widely accepted, nor does it
> appear to be considered best current practice.  In fact, there is the
> strong recommendation to use Valid markup even though there is built-in
> error correction (because while that feature has some informal
> recommendations regarding it, there are no must-haves, and therefore it
> cannot be relied on).
>
> Do we not see this confirmed every time someone reports problems with a Web
> site using invalid markup, and is being told by several rather knowledgeable
> people to fix their markup first as the problem is likely going to go away
> then?

Sorry, I must have expressed myself poorly. The de-facto standard I'm
talking about is the way *browsers* treat contents of SCRIPT elements. I
am certainly not talking about invalid mark-up here, neither am I
suggesting that using it is the best practice.

>
>>> A million flies can't be wrong?
>
> You have evaded that part of the question as well. [...]

Wasn't that a rhetorical question? ;)

> amateurs misusing the feature of built-in error correction, most of the
> time without knowing it, does not make their doing any more a de facto
> standard than any other of their mistakes.

Absolutely not. See above.

>
>>>> I've tested the whole slew of browsers (ancient, mobile, desktop,
>>>> etc.) and none would close the script tag on discovery of ETAGO.
>>> You need to test more, and refine your tests.  The issue is known to
>>> occur with "<script ...>...</script>" in particular, but it has been
>>> observed on other occasions as well.
>>
>> <script>...</script> is exactly what I've been testing. What more is
>> there to test if the purpose was to check if SCRIPT content is parsed
>> properly?
>
> The SCRIPT element with a `type' attribute, and perhaps "nested" SCRIPT
> elements.

The SCRIPT element did have a type attribute. What's the point of testing
nested SCRIPT elements if we are interested only in the "</" part?

To eliminate any confusion, the relevant part of the test was:

....
<script type="text/javascript">
  document.write('<textarea>test</textarea>');
</script>
....

<http://kangax.github.com/jstests/etago_delimiter_test/>

None of the browsers I tested would close the SCRIPT element on "</".

>
>> And which other occasions are you talking about? I haven't seen a client
>> that respects HTML 4.01 in this regard and closes SCRIPT element on
>> first occurrence of "</". Have you?
>
> I am sure that the W3C Validator does, IOW not fixing this error makes
> further validation of the document using this tool next to impossible.

Of course.

>
> I do not remember which browsers did this, but there must have been at
> least one popular one among them or it would not have become such an issue
> in the first place.  Probably the list would include W3C Amaya.  Lynx,

Hmm, I see that Lynx (2.8.7 on Mac OS X) doesn't follow the standard
either. "</" doesn't terminate the SCRIPT element.

Surprisingly, Amaya, with its draconian rules, ignores '</' as well. When
inspecting the DOM tree, I see that the SCRIPT element has the entire
"document.write('<textarea>test</textarea>')" string.

> which is sometimes used by server administrators and as input for
> screenreaders, is at least known to report invalid markup visibly (in the
> status line), which would not look too good.  Very likely further
> information can be found in the archives in postings containing the term
> "ETAGO" or "End Tag Open delimiter".

Thanks. I'll take a look when I have time. c.i.w.a.html might have
something in its archives too.

>
> By the way, that reminds me of a similar misconception I had found on your
> Web site that I did not find time to mail you about yet (so I am doing it
> here and now, lest I forget again):  You stated there something along the
> lines that it would not matter that in XHTML the content of `script'
> elements was not, where necessary, properly escaped or declared CDATA,
> because the Content-Type `text/html' would not trigger an X(HT)ML parser
> anyway.  However, first of all you cannot know for sure which parser is
> being used [...],

FTA: "Unless you're serving documents as “application/xhtml+xml” [...]"

> and second it matters for the W3C Validator and any other
> markup validator because they MUST NOT care for the Content-Type of the
> markup resource with regard to syntax except for the `charset' parameter.
> IOW, the markup is still _not_ Valid then.  So by _not_ using Valid markup
> there, you are shooting yourself in the foot there, too.

This is true. I should have mentioned validation.

But serving XHTML-like tag soup as HTML, with CDATA sections (present to
satisfy validation only), is already shooting yourself in the foot, and is
rather wasteful [1]. You're validating the document as XHTML, and the
browser ends up parsing it as HTML.

[1] The content model of SCRIPT in HTML 4.01 is CDATA (with a few
additional rules, such as the aforementioned "</" termination), not PCDATA
as it is in, say, XHTML 1.1.

--
kangax
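The disagreement between the HTML 4.01 rule and the browser behaviour reported in this thread can be made concrete with a small sketch. The two function names below are hypothetical and the code is not taken from the linked test page. It assumes the HTML 4.01 reading that SCRIPT content ends at the first "</" followed by a name start character ([a-zA-Z]), and models the tested browsers as looking only for "</script".

```javascript
// Where a strict HTML 4.01 parser would end SCRIPT content: at the first
// "</" followed by a name start character.
function strictScriptEnd(html, contentStart) {
  var match = /<\/[a-zA-Z]/.exec(html.slice(contentStart));
  return match ? contentStart + match.index : -1;
}

// Where the browsers tested in the thread end it: at the first "</script".
function lenientScriptEnd(html, contentStart) {
  var match = /<\/script/i.exec(html.slice(contentStart));
  return match ? contentStart + match.index : -1;
}

var openTag = '<script type="text/javascript">';
var html = openTag +
  "document.write('<textarea>test</textarea>');" +
  "<\/script>";
var start = openTag.length;

console.log(html.slice(start, strictScriptEnd(html, start)));
// document.write('<textarea>test
console.log(html.slice(start, lenientScriptEnd(html, start)));
// document.write('<textarea>test</textarea>');
```

Under the strict rule the script text is cut off before "</textarea>", which is what a validator complains about; under the lenient rule the whole `document.write` call survives, matching what was observed in the DOM inspection described above.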