From: Peter Olcott on 20 May 2010 10:47

On 5/20/2010 1:54 AM, Hector Santos wrote:
> Good points P. Delgado. Let's also make a few major notes:
>
> - He has no products,

None that have been completed yet. A working prototype has been complete for many years.

> - He has no customers,

One customer/reseller as soon as OCR4Screen is available.

> - He has no competition!

Wrong. I have four competitors that cannot get into the ballpark of my speed or accuracy.

> - He has been at this for past 9-10 years.

Since 1998.

> - He has the IQ of a Pre-Med student, hence he is smarter than us,

I have the IQ of a medical doctor, which means that I am about average for these groups. I am probably not as smart as Joe. I certainly have very much less knowledge than Joe.

> - His process once required 5 gb of resident PURE memory, then 3gb
> then 1.5gb.
> - He wasn't familiar with memory virtualization, fragmentation

I have known about fragmentation since the beginning of my career. Joe taught me a few nuances about virtual memory that I was unaware of. Most of this entire issue was that I continued to talk about virtual memory at a higher level of abstraction than the one that Joe was using.

> Remember that classic thread? When he was finally proven wrong, he
> admitted all his knowledge about an OS comes from a 25 year old OS
> class book and he forgot to read the 2nd half because the final exam
> was cancelled, hence he didn't know about memory virtualization ideas.
> But he was going to catch up now. :)
>
> - He has no concept of threads even when provided thread based code,

This is not true.

> - He believes Multiple Queue/Multiple Servant FIFO is superior

MQMS will provide better performance in my case depending upon how this term is applied. I think that the whole issue here is that the term MQMS was incorrectly applied to my case. You pointed this out once, and I tend to agree. My point was that on a single-CPU machine that has only one core and no hyper-threading, adding threads will make each thread run more slowly.

> - He invented the fastest string class in the world

And the developers of the Microsoft string class were able to match my speed (from what I recall, not beat it) by studying my code and implementing it in their code. They also provided good reasons why their original design was so much slower. At least one of these reasons no longer applied at the time that I wrote FastString.

> - He wanted to make SQLITE3 behave like MYSQL, SEQUEL

I analyzed many alternatives for implementing fast fault-tolerant transactions, including doing everything from scratch.

> - He wanted to use ISAM offset ideas for SQL records

I want to make transactions as fast as possible and thus referred back to the fundamental architecture of database technology.

> - He wants to do all this in a single cpu computer.

From a business point of view this is a good idea. I am growing my own capital from the ground up. Too many businesses sink far too much money into unproven ventures. Many of these ventures might otherwise have become successful if only they hadn't spent so much money so quickly.

> - He wants fault tolerance without DISK I/O.

I never said this; you misunderstood me here. The fundamental basis for all fault tolerance is disk I/O.

> Did I miss anything? I'm pretty sure I did. Oh yeah..
>
> - He wants a secured computer at customer sites that no one can touch
> because they might steal his software.
>
> Did I mention he has no products? no customers?
> and no competitor? <g>
>
> Since 2006, his products would be available in the FALL and will be
> available as ActiveX, but oh yeah
>
> - He wants to use Linux with no GUI and in REAL TIME.
>
> But the Linux people don't seem to be too helpful and he needs to come to
> the MFC forum because "this is where people answer his patent claim
> questions."

Joe is clearly brilliant, and you were more helpful than anyone else, anywhere, pertaining to the design of the fundamental architecture of my web application.

> Go Figure.
>
> --
> HLS
>
> On May 20, 12:13 am, "Pete Delgado" <Peter.Delg...(a)NoSpam.com> wrote:
>> "Peter Olcott" <NoS...(a)OCR4Screen.com> wrote in message
>> news:5O2dnS2UptANt2nWnZ2dnUVZ_rqdnZ2d(a)giganews.com...
>>
>>> Here are the actual results from the working prototype of my original DFA
>>> based glyph recognition engine.
>>> http://www.ocr4screen.com/Unique.html
>>> The new algorithm is much better than this.
>>
>> The salient point that you fail to mention is that the alternative
>> solutions can perform OCR on *any* font, while your implementation requires
>> the customer to tell the OCR system which font (including all specifics such
>> as point size) is being used. In addition, the other systems can perform
>> when the font is not consistent in the document or if different font weights
>> are used; your implementation cannot and will fail miserably.
>>
>> All in all, very misleading.
>>
>> PS: The information used in my critique of your OCR system was obtained by
>> looking at your prior posts as well as your patent and is not merely
>> conjecture.
>>
>> -Pete
From: Peter Olcott on 20 May 2010 10:56

On 5/20/2010 4:49 AM, Oliver Regenfelder wrote:
> Hello,
>
> Peter Olcott wrote:
>> Yes and quite often with zero percent accuracy at screen resolutions.
>> The most accurate alternative system scored about 25% accuracy on the
>> sample image and was 872-fold slower.
>
> You're sure it wasn't 872.3-fold slower?
>
> Best regards,
>
> Oliver

Mine took 0.047 seconds using clock() **; theirs took 41 seconds, plus or minus 1 second, using my wrist watch.

** I think that the resolution might be to the millisecond on Intel architecture. I do remember that it used to be 1/18 of a second long ago.
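For reference, here is a minimal sketch of this kind of clock()-based timing; the workload function is a hypothetical placeholder standing in for the recognition run being measured:

#include <cstdio>
#include <ctime>

// Hypothetical placeholder for the work being timed.
static void do_work()
{
    volatile long sum = 0;
    for (long i = 0; i < 10000000L; ++i)
        sum += i;
}

int main()
{
    std::clock_t start = std::clock();
    do_work();
    std::clock_t end = std::clock();

    // clock() returns processor time in ticks; CLOCKS_PER_SEC converts
    // ticks to seconds. The actual tick resolution is implementation-
    // defined, which is why the footnote above hedges about it.
    double elapsed = double(end - start) / CLOCKS_PER_SEC;
    std::printf("Elapsed: %.3f seconds\n", elapsed);
    return 0;
}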
From: Joseph M. Newcomer on 20 May 2010 11:54

One of my standard presentation lines is "What is the difference between a computer scientist, a newbie, and a software engineer?" Sounds like a setup for a joke, but it isn't.

Some years ago, I had a project where I calculated the big-O complexity to be O(n^3) (actually it was O(n^2) * O(m), but for most purposes m==n was the expected value).

Now, a computer scientist (me) looks at this and says "Wow! O(n^3). Bad! I need to rethink this and design a new algorithm!" So I did, and it required having a pointer in each node that allowed me to thread through the tree. Result: O(n). BIG improvement.

A newbie would not know about big-O performance.

An engineer (also me) said "Great. But that increases the size of each node, and the pointer validity must be maintained under tree transformations which precede this semantic check, and that's hard." We were running on a very small computer, right at the margins of storage availability, and the project was a couple weeks behind schedule (I was also, at that time, sysadmin for our site, and had just poured days down an administrative rathole, with no end in sight). Maintaining the validity of the pointer (easy to construct during the parse) during the tree transformations (hard) was going to add weeks to the schedule. Not Acceptable.

So I decided to see what the values of n and m were. It took under fifteen minutes to add code to compute these values and display them. There were hundreds of calls on this function for processing large grammars. The result: n==m==1 for almost all cases; for a few cases, ==2; for a couple cases, ==3; and with our largest formal grammar, ONE instance of ==4! So O(n^3) doesn't matter when n is very small. That's engineering!

So design, even if big-O IS the same, does not prove anything until you know the parameters of the O(f(n)) computation, as realized by ACTUAL DATA. One design where n > 100 can be quite different from another design where n==1, even though O(n) is identical for both designs! Clearly, in the above example, I had two designs with quite different f(n) for O(f(n)), but the results were essentially identical if n==1. By analogy, if I had two designs with O(n^3), one could be excellent and the other a total flaming disaster, if they had values of n==1 and n > 100. So while big-O is an important concept, it must be applied with judgment.

Also, remember that O(f(n)) means the real equation is k + C * f(n) + t, where k is the setup time for the computation, C is the constant of proportionality, and t is the teardown time for the computation (often t==0, so we ignore it). In some cases, k and C dominate performance, e.g., the string-compare example I've cited many times before, where I found I was spending all my time in the equivalent of strcmp (C was HUGE), and when I reduced C to 1 clock cycle, I got excellent performance of an f(n) = n log2 n algorithm.

I have a multithreading example where k dominates for small values of n > 1 (the curve of performance is interesting: it goes UP for a while as the number of threads increases, then goes DOWN until the number of threads == number of CPU cores, then starts slowly going UP again as the number of threads increases beyond the number of cores).

[Note that algorithm is not to be confused with "Think Green! Think Green! Think Green!", which is the Al Gore Rhythm]

joe

On Thu, 20 May 2010 00:03:30 -0700, "Mihai N." <nmihai_year_2000(a)yahoo.com> wrote:
>
>> Design never proves anything with regards to speed, at least as
>> long as the big-O is the same.
>
> Not to mention that big-O tells you something only for relatively
> big values of 'n' (how big is 'big' depends on the algorithm).

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
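A minimal sketch of the kind of fifteen-minute instrumentation described in the post above; the function name and the simulated workload are hypothetical stand-ins, and the point is simply to histogram the values of n that actually occur in real data before redesigning for asymptotic complexity:

#include <cstdio>
#include <map>

// Hypothetical stand-in for the semantic check whose cost is O(n^2 * m).
// The instrumentation records every parameter size actually seen so a
// histogram can be dumped after processing a large grammar.
static std::map<int, long> g_histogram;   // n -> number of occurrences

static void semantic_check(int n, int m)
{
    ++g_histogram[n > m ? n : m];   // record the larger parameter
    // ... the original O(n^2 * m) work would go here ...
}

static void dump_histogram()
{
    for (std::map<int, long>::const_iterator it = g_histogram.begin();
         it != g_histogram.end(); ++it)
        std::printf("n == %d : %ld calls\n", it->first, it->second);
}

int main()
{
    // Simulated workload: almost all calls have n == 1 and only a
    // handful are larger, mirroring the distribution reported above.
    for (int i = 0; i < 500; ++i) semantic_check(1, 1);
    for (int i = 0; i < 5; ++i)   semantic_check(2, 1);
    semantic_check(3, 2);
    semantic_check(4, 3);

    dump_histogram();   // If n is almost always 1, O(n^3) is harmless.
    return 0;
}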
From: Joseph M. Newcomer on 20 May 2010 12:12

See below...

On Wed, 19 May 2010 23:54:00 -0700, "Mihai N." <nmihai_year_2000(a)yahoo.com> wrote:
>
>> If your compiler defines wchar_t as 16 bits, then
>> it implies UTF-16 encoding
>
> Nope. wchar_t does not imply Unicode.

****
True. wchar_t can encode according to arbitrary implementation-specified standards. I stand corrected. The Microsoft C compiler, however, will interpret wchar_t literals as Unicode. And since the discussion was about UTF encodings, wchar_t implies Unicode's UTF-16. That's where the confusion was; the statement was made in the context of a Unicode encoding discussion. But you're right; in the formal definition of wchar_t, all it claims is that this is an implementation-defined width.
****

> I think this is caused by the great reluctance of the C/C++ standards
> to refer to other standards. They try to be self-sufficient.

****
Which is probably a Good Thing.
****

> Happily enough, this seems to be changing lately (still too slow).

****
Note that they do refer to ISO 10646 (see the footnote on page 19 of the draft standard (30) which has already been cited in this thread).
****

>> Well, the locale names are supposed to be the ISO standard
>> string designators
>
> From what I know, that is not specified anywhere in the C/C++ standard.
> A locale can be anything you want it to be.
> POSIX added something, but it is quite outdated.

****
Perhaps I'm remembering some other proposal; I just checked the C standard, and the only locale actually supported by Standard C is "C". All other names are unspecified. So it is up to the implementor to decide what is going on, thus impacting portability.
****

> UTS-35 (Unicode Technical Standard #35, http://unicode.org/reports/tr35/)
> is the best thing right now. And you can use it with ICU (again the best
> platform-independent solution for locale-aware support (ICU has its own
> problems though))

****
The key document is referenced as http://www.rfc-editor.org/rfc/bcp/bcp47.txt which is actually RFC 5646. This is a lengthy document but worth reading. This is a case where the C language should reference other standards.
joe
****

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
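A minimal sketch illustrating the two points above: the width of wchar_t is implementation-defined (16 bits under MSVC, typically 32 bits under gcc/glibc), and "C" is the only locale name the C standard guarantees; the non-"C" locale name used here is just an example and may not exist on a given system:

#include <clocale>
#include <cstdio>

int main()
{
    // Implementation-defined: prints 2 under MSVC (UTF-16 code units),
    // typically 4 under gcc/glibc (UTF-32).
    std::printf("sizeof(wchar_t) == %u bytes\n",
                (unsigned)sizeof(wchar_t));

    // "C" is the only locale name Standard C requires; anything else
    // ("en_US.UTF-8", "English_United States.1252", ...) is an
    // implementation-defined extension and may fail.
    if (std::setlocale(LC_ALL, "C"))
        std::printf("locale \"C\" accepted (always works)\n");

    const char *r = std::setlocale(LC_ALL, "en_US.UTF-8");
    std::printf("locale \"en_US.UTF-8\": %s\n",
                r ? r : "not supported on this implementation");
    return 0;
}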
From: James Kanze on 20 May 2010 13:52
On May 19, 6:45 pm, Peter Olcott <NoS...(a)OCR4Screen.com> wrote:
> On 5/19/2010 12:23 PM, James Kanze wrote:

    [...]

> > And how do you act on an ActionCode? Switch statements and
> > indirect jumps are very, very slow on some machines (but not on
> > others).

> I could not imagine why they would ever be very slow. I know
> that they are much slower than an ordinary jump because of the
> infrastructure overhead. This is only about one order of
> magnitude or less.

On some machines (HP PA architecture, for example), any indirect jump (at the assembler level) will purge the pipeline, resulting in a considerable slowdown. And the classical implementation of a dense switch uses a jump table, i.e. an indirect jump. (The alternative involves a number of jumps, so may not be that fast either.)

This is not universal, of course---I've not noticed it on a Sparc, for example.

--
James Kanze
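For illustration, a minimal sketch (the ActionCode values and dispatch function are hypothetical) of the kind of dense switch a compiler will typically lower to a jump table, i.e. a single indirect jump, which is exactly the construct that can stall the pipeline on machines like the HP PA:

#include <cstdio>

// Hypothetical action codes; dense consecutive values encourage the
// compiler to emit a jump table (one indirect jump) rather than a
// chain of compare-and-branch instructions.
enum ActionCode { APPEND = 0, ADVANCE = 1, EMIT = 2, HALT = 3 };

static int dispatch(ActionCode code, int state)
{
    switch (code) {          // typically compiled as: jmp table[code]
    case APPEND:  return state + 1;
    case ADVANCE: return state + 2;
    case EMIT:    std::printf("emit at state %d\n", state); return state;
    case HALT:    return -1;
    }
    return state;            // unreachable for valid codes
}

int main()
{
    int state = 0;
    const ActionCode script[] = { APPEND, ADVANCE, EMIT, HALT };
    for (int i = 0; i < 4 && state >= 0; ++i)
        state = dispatch(script[i], state);
    return 0;
}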