From: Dr J R Stockton on 31 Oct 2009 19:24

In comp.lang.javascript message <53e60b5e-ddec-4b97-aa14-64c31f883159(a)j1
9g2000yqk.googlegroups.com>, Sat, 31 Oct 2009 10:08:13, VK
<schools_ring(a)yahoo.com> posted:

>Dr J R Stockton wrote:
>> >> >> In effect, I want to read the file, HTML or TXT, as it exists on disc.
>
>VK wrote:
>> >> >You cannot do it for the reason explained at
>> >> >http://groups.google.com/group/comp.lang.javascript/msg/d9f3f6724bada573
>
>Dr J R Stockton wrote:
>> >> Unconvincing, because I *am* doing it,
>
>VK wrote:
>> >You don't, it is your delusion.
>> >I don't know how and why are you doing it, but it was stated that "I
>> >want to read the file, HTML or TXT, as it exists on disc." As long as
>> >you are not using AJAX calls - and you don't - you are not able and
>> >you are not reading any files "HTML or TXT, as it exists on disc" -
>> >however wide the definition "as it exists on disc" would be taken.
>
>Dr J R Stockton wrote:
>> Give or take irrelevant questions of character coding and newline
>> representation, I have been getting, by using innerHTML and by using
>> innerText, a string which agrees visually with the content of a TXT file
>> on disc, as would be shown by Notepad.
>
>It should be expected in many (but not all) situations.
>Contrary to the popular belief, browsers are *not* able to open text
>or graphics files. What they are able to - as part of their extended
>functionality - is to recognize some file types other than HTML and to
>wrap them on the fly into predefined HTML templates so to display them
>in the browser window. In the particular for text/plain files they are
>using template
> <HTML>
>  <HEAD></HEAD>
>  <BODY>
>   <PRE> text file content goes here </PRE>
>  </BODY>
> </HTML>
>with the exact tags' case (upper or lower) being browser dependent.

I wrote "getting, ..., a string", not "saw in a window".
Fram being a reference to an iframe recently loaded from a simple *.txt
file, the code

    DIR = Fram.contentDocument.body
    DIR = DIR.textContent || DIR.innerText // is latter needed?  Yes, IE8
    alert(DIR) // for VK

directly shows in the alert window plain text, not preceded by anything
using angle-brackets, for MS IE 8, Firefox 3.0.15, Opera 10.01, Safari
4.0.3, and Chrome 3.0.  The <localhost> shown by Opera, and the JavaScript
shown by Safari, are parts of the alerts, not of their contents.

>This way the text you "see" is in effect the content of a single <pre>
>element necessarily altered from the "as it is on disc" to be placed
>into this tag. For instance all less-than and greater-than signs will
>be converted to the corresponding named HTML entities. The fact that
>you were getting so far "by using innerHTML ..., a string which agrees
>visually with the content of a TXT file" suggests that so far you were
>lucky but not having any problematic characters in your .txt files,

"getting by using innerHTML" is not the same as "getting directly as
innerHTML".  IIRC, most browsers wrapped with <pre> and one put rather
more at the top.  When I was using innerHTML, I easily removed those by
RegExp.

>> Associated query : I have read a TXT file from disc, getting a matching
>> string.  It consists of many lines containing words separated by
>> punctuation.  They all start with the same sequence of words and
>> punctuation (improbably, zero length), but after that there is always
>> non-zero length.  No two lines completely agree.  What is the nicest way
>> of determining the common part AND obtaining in sequence strings for the
>> varying parts?  Think of it as like a representation of a directory
>> tree.
>
>This is OT to the discussed FAQ topic but an interesting problem per
>se. I am thinking to move it into separate thread or you may do it
>yourself.
>I have a rather close request for ggNoSpam, in order to give
>users an ability to adjust the regexp spam filter even with zero
>knowledge of regular expressions. The abstract task description would
>be:
>"Given an array of strings with the minimum 2 and the maximum 10
>elements, find the shortest common word in these strings. If no such
>common character sequence found, then try to find the biggest subset
>of strings having a common word".
>
>"word" is understood in regexp terms. To avoid "rush answers" with
>common words like "a" or "the" articles let's define that the shortest
>common word must be no shorter than 4 characters.

I've changed my mind about whether, for the present, I want to do that.
It would certainly increase efficiency, though perhaps not noticeably.
But doing that and the changes which would necessarily be associated
with it would be an impediment to extending capability in a direction
which may be possible and useful.

If such a thread is started, I'll participate, if anything worth writing
occurs to me.

    var AoS = ["aaa bbb ccc ddd ccc bbb ddd", "bbb zzz ggg", "banana"]

    var J, A, K, T, Word, Obj = {}, Z = 0
    J = AoS.length
    while (J--) { T = {}                          // for each string
      A = AoS[J].split(/\W+/)                     // make array of words
      K = A.length ; while (K--) T[A[K]] = 1      // no internal dupes
      for (K in T) Obj[K] ? Obj[K]++ : Obj[K] = 1 // ...
      // ... if entry exists, increment, else create entry value 1
      }
    for (K in Obj) if (Obj[K]>Z) { Z = Obj[K] ; Word = [K, Z] }

Then Word[0] appears in Word[1] of the strings, and no word appears in
more than Word[1] of them.  The last line can be amended by making the
test >= and complicating what follows, to list all words of the highest
popularity and not just the first one found.  By using another variable

VERY SLIGHTLY TESTED (uses technique of LINXCHEK).

-- 
 (c) John Stockton, nr London UK.  ?@merlyn.demon.co.uk  DOS 3.3, 6.20; WinXP.
 Web <URL:http://www.merlyn.demon.co.uk/> - FAQqish topics, acronyms & links.
PAS EXE TXT ZIP via <URL:http://www.merlyn.demon.co.uk/programs/00index.htm> My DOS <URL:http://www.merlyn.demon.co.uk/batfiles.htm> - also batprogs.htm.
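
The ">=" amendment described in the post above (listing every word of the
highest popularity rather than only the first one found) might be
sketched as follows.  This is an illustration under assumptions, not the
poster's code: the function name mostPopularWords and the returned object
shape are invented here, and Object.create(null) is used so that words
such as "constructor" do not collide with inherited properties.

```javascript
// Sketch only: list every word that appears in the greatest number of strings.
// mostPopularWords is a hypothetical name; it is not part of the posted code.
function mostPopularWords(strings) {
  var counts = Object.create(null);   // word -> number of strings containing it
  var i, k, seen, words, word;
  for (i = 0; i < strings.length; i++) {
    seen = Object.create(null);       // words already met in this string
    words = strings[i].split(/\W+/);  // same word-splitting rule as the post
    for (k = 0; k < words.length; k++) {
      word = words[k];
      if (word === "" || seen[word]) continue; // skip empties and internal dupes
      seen[word] = true;
      counts[word] = (counts[word] || 0) + 1;
    }
  }
  var best = 0, winners = [];
  for (word in counts) {
    if (counts[word] > best) {        // new maximum: start the list again
      best = counts[word];
      winners = [word];
    } else if (counts[word] === best) { // tie with the maximum: add to the list
      winners.push(word);
    }
  }
  return { words: winners, count: best };
}
```

Called with the AoS array from the post, this should report "bbb" with a
count of 2; with ties, all tied words are returned together.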
From: Garrett Smith on 31 Oct 2009 23:42

Dr J R Stockton wrote:
> In comp.lang.javascript message <hcdshu$ds2$1(a)news.eternal-
> september.org>, Thu, 29 Oct 2009 22:11:19, Garrett Smith
> <dhtmlkitchen(a)gmail.com> posted:
>> As long as I have something to say about it, the entry will correctly
>> explain how to access the window object of the IFRAME.
>>
>
> You are supposed, as FAQ maintainer, to be sustaining something useful
> to the ordinary questioners, especially those who are not full-time
> professional JavaScript programmers.
>
> However, you appear entirely unable to understand their positions and
> points of view.  FAQ maintaining is a task for the sympathetic
> communicator; not for the nerd.
>

A lot of the complaints about the FAQ are that it is too verbose, too long.

The FAQ should not be too much of a chore to read. It should be easy to
understand.

Once the document is found, the next step is to do something with that,
right? That is what the DOM and Forms section is for.

Things about the document seem more appropriate for "DOM and Forms", not
"window and frames".

Perhaps worth mentioning:-

| The frame must be fully loaded before its content can be accessed.
|
| fwin.document;// the document
| fwin.document.documentElement; // root element.
|
| See the section on DOM and Forms: #domRef

Perhaps worth another entry:-
How can I know when an iframe has loaded?

First, I'd rather edit/shorten the entry on #getWindowSize. It is a long
entry and explains a workaround for older versions of Opera. It seems
worth removing that workaround and its explanation. That should shorten
the entry considerably. Less is more, here.
-- 
Garrett
comp.lang.javascript FAQ: http://jibbering.com/faq/
From: Garrett Smith on 1 Nov 2009 02:36

Dr J R Stockton wrote:
> In comp.lang.javascript message <53e60b5e-ddec-4b97-aa14-64c31f883159(a)j1
> 9g2000yqk.googlegroups.com>, Sat, 31 Oct 2009 10:08:13, VK
> <schools_ring(a)yahoo.com> posted:
>> Dr J R Stockton wrote:
>>>>>>> In effect, I want to read the file, HTML or TXT, as it exists on disc.
>> VK wrote:
>>>>>> You cannot do it for the reason explained at
>>>>>> http://groups.google.com/group/comp.lang.javascript/msg/d9f3f6724bada573
>> Dr J R Stockton wrote:
>>>>> Unconvincing, because I *am* doing it,
>> VK wrote:

[a lot of context]

>
>> This way the text you "see" is in effect the content of a single <pre>
>> element necessarily altered from the "as it is on disc" to be placed
>> into this tag. For instance all less-than and greater-than signs will
>> be converted to the corresponding named HTML entities. The fact that
>> you were getting so far "by using innerHTML ..., a string which agrees
>> visually with the content of a TXT file" suggests that so far you were
>> lucky but not having any problematic characters in your .txt files,
>
> "getting by using innerHTML" is not the same as "getting directly as
> innerHTML".  IIRC, most browsers wrapped with <pre> and one put rather
> more at the top.  When I was using innerHTML, I easily removed those by
> RegExp.
>
>

You might try reading from the PRE element:

var fdoc = frames[0].document;
var pre = fdoc.getElementsByTagName("pre")[0];
var htmlString = pre.innerHTML;
var textString = (typeof pre.textContent == "string" ?
    pre.textContent : pre.innerText);
-- 
Garrett
comp.lang.javascript FAQ: http://jibbering.com/faq/
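
For the innerHTML route discussed in the quoted text (removing the
wrapper by RegExp, and VK's point that less-than and greater-than signs
come back as entities), a minimal sketch could look like this.  The name
preHtmlToText, the regular expression, and the choice to handle only
&lt;, &gt; and &amp; are assumptions made for illustration; the wrapper
markup is browser dependent, and reading textContent or innerText as in
the post above avoids the decoding step altogether.

```javascript
// Sketch only: recover plain text from the innerHTML of a text/plain document.
// preHtmlToText is a hypothetical helper; the wrapper markup is browser dependent.
function preHtmlToText(html) {
  // Keep only what lies between the first <pre ...> and the last </pre>.
  // If no <pre> wrapper is found, the input is used unchanged.
  var m = /<pre\b[^>]*>([\s\S]*)<\/pre>/i.exec(html);
  var inner = m ? m[1] : html;
  // Decode the three entities handled here.  &amp; goes last so that
  // "&amp;lt;" becomes the literal text "&lt;" rather than "<".
  // Other entities (for example &nbsp;) are not handled by this sketch.
  return inner
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&");
}
```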
From: Dr J R Stockton on 1 Nov 2009 16:17

In comp.lang.javascript message <hcjdqb$joa$1(a)news.eternal-
september.org>, Sat, 31 Oct 2009 23:36:41, Garrett Smith
<dhtmlkitchen(a)gmail.com> posted:

>You might try reading from the PRE element:
>
>var fdoc = frames[0].document;
>var pre = fdoc.getElementsByTagName("pre")[0];
>var htmlString = pre.innerHTML;
>var textString = (typeof pre.textContent == "string" ?
>    pre.textContent : pre.innerText);

As regards what I wanted to do, success is a matter of history.  Fram is
the frame :

    DIR = Fram.contentDocument.body
    DIR = DIR.textContent || DIR.innerText // is latter needed?  Yes, IE8

gives the content of the disc file; viewing alert(DIR) shows an exact
match to viewing the file in Notepad.  That alert, commented out and
annotated VK, is in <URL:http://www.merlyn.demon.co.uk/linxchek.htm>.

OTOH, I cannot say what happens with non-Windows (indeed, with
non-XPsp3) systems - UNIX or Mac, for example ; and it would be helpful
to be able to give the necessary UNIX or Mac input to generate a suitable
file.  Anyone like to try it in Mac or UNIX?

-- 
 (c) John Stockton, nr London UK.   ?@merlyn.demon.co.uk   BP7, Delphi 3 & 2006.
 <URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/&c., FAQqy topics & links;
 <URL:http://www.bancoems.com/CompLangPascalDelphiMisc-MiniFAQ.htm> clpdmFAQ;
 NOT <URL:http://support.codegear.com/newsgroups/>: news:borland.* Guidelines
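
The two ways of choosing between textContent and innerText seen in this
thread (the || form and the typeof test) differ when the text is an empty
string, since "" is falsy.  A minimal sketch combining the frame lookup
with the typeof test is given below.  The helper name frameBodyText, the
contentWindow.document fallback, and the null return value are
assumptions made for illustration; they are not taken from either post.

```javascript
// Sketch only: get the text of a frame's body without treating "" as missing.
// frameBodyText is a hypothetical helper; "frame" is an iframe element reference.
function frameBodyText(frame) {
  // contentDocument is tried first; contentWindow.document is a fallback
  // (an assumption about which properties the target browsers expose).
  var doc = frame.contentDocument ||
            (frame.contentWindow && frame.contentWindow.document);
  if (!doc || !doc.body) return null;  // document or body not available
  var body = doc.body;
  // typeof test rather than ||, so an empty textContent is returned as ""
  // instead of falling through to innerText.
  return typeof body.textContent == "string" ? body.textContent
                                             : body.innerText;
}
```

In a page this would be called as frameBodyText(Fram) once the frame has
loaded; it can also be exercised with plain objects standing in for the
frame, which is how the empty-string case is easiest to check.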
From: Dr J R Stockton on 1 Nov 2009 17:00
In comp.lang.javascript message <hcj02m$62n$1(a)news.eternal-
september.org>, Sat, 31 Oct 2009 19:42:12, Garrett Smith
<dhtmlkitchen(a)gmail.com> posted:

>Dr J R Stockton wrote:
>> In comp.lang.javascript message <hcdshu$ds2$1(a)news.eternal-
>> september.org>, Thu, 29 Oct 2009 22:11:19, Garrett Smith
>> <dhtmlkitchen(a)gmail.com> posted:
>>> As long as I have something to say about it, the entry will correctly
>>> explain how to access the window object of the IFRAME.
>>>
>>  You are supposed, as FAQ maintainer, to be sustaining something useful
>> to the ordinary questioners, especially those who are not full-time
>> professional JavaScript programmers.
>>  However, you appear entirely unable to understand their positions and
>> points of view.  FAQ maintaining is a task for the sympathetic
>> communicator; not for the nerd.
>>
>
>A lot of the complaints about the FAQ are that it is too verbose, too long.
>
>The FAQ should not be too much of a chore to read. It should be easy
>to understand.
>
>Once the document is found, the next step is to do something with that,
>right? That is what the DOM and Forms section is for.
>
>Things about the document seem more appropriate for "DOM and Forms", not
>"window and frames".

For the FAQ to be useful to its intended readership, its Subjects (as
seen at the beginning) must be structured ENTIRELY from the point of view
of the questioner, without any consideration of the structure of the
answers.  Otherwise, you're writing a nerdy document much like the
majority of the big Flamingo book.

You MUST learn how the ordinary FAQ reader will think, when seeking an
answer.

>Perhaps worth mentioning:-
>| The frame must be fully loaded before its content can be accessed.

I am incompletely convinced of that.  When reading by timeout, I thought
I saw signs of gaining access to an only partially-filled links array.
Certainly they might have been deceiving signs; but the point should be
checked in several actual browsers.
If your sentence were strictly true, one could issue a frame load
directly followed by a frame read, and the system would wait until loaded
before executing the read.  But it would be safe to write :-
   Access to frame content should not be attempted before the frame is
   fully loaded.

>How can I know when an iframe has loaded?

Yes.  I have success with the Fram.onLoad event in FF Op Sf Cr, but not
IE, where I seem to need to use a timeout.  OTOH I suspect IE8 of lying
to me, or of being confused [*].  Flamingo wrote of a readyState
property, which might be pollable.

[*] I get reports of many unused anchors in a file not containing those
anchors, in IE.

-- 
 (c) John Stockton, Surrey, UK.  ?@merlyn.demon.co.uk  Turnpike v6.05  MIME.
 Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
 Proper <= 4-line sig. separator as above, a line exactly "-- " (SonOfRFC1036)
 Do not Mail News to me.  Before a reply, quote with ">" or "> " (SonOfRFC1036)
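
The two approaches mentioned in the post above, a load event handler and
polling a readyState property by timeout, could be combined roughly as in
the following sketch.  Everything named here is hypothetical
(whenFrameLoaded, the options object, the 50 ms default); it has not been
checked against the browsers listed in this thread.  It also assumes that
polling starts after the new load has begun, since a document already in
the frame may itself report "complete".  Note that the DOM property is
spelled onload, all lower case.

```javascript
// Sketch only: call back once a frame has loaded, via onload or by polling.
// whenFrameLoaded and its options are hypothetical, not an existing API.
function whenFrameLoaded(frame, callback, options) {
  options = options || {};
  var interval = options.interval || 50;   // ms between polls (arbitrary)
  // The timer function is injectable so the logic can be tested without a browser.
  var schedule = options.schedule ||
                 function (fn, ms) { return setTimeout(fn, ms); };
  var done = false;

  function finish() {
    if (done) return;      // onload and the poll may both fire; report once
    done = true;
    callback(frame);
  }

  function poll() {
    if (done) return;      // already reported through onload
    var doc = frame.contentDocument ||
              (frame.contentWindow && frame.contentWindow.document);
    if (doc && doc.readyState === "complete") finish();
    else schedule(poll, interval);   // not ready, or no readyState: try again
  }

  frame.onload = finish;   // event route
  schedule(poll, interval); // timeout route
}
```

Where readyState is not supported the poll simply keeps rescheduling
itself until the onload handler fires, so the event route remains the
primary one and the polling acts only as a fallback.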