From: Peter Olcott on 29 Mar 2010 22:39

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
news:gbs1r5tr89f31ut0jvnovu8nvu2i7qpaph(a)4ax.com...
> See below...
> On Mon, 29 Mar 2010 10:58:56 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:
>
>>> Why do you suddenly add "fault tolerance", then add "fast interprocess
>>> communication that is also fault tolerant"? I think you are totally and
>>> utterly clueless here. You do not understand ANY of the concepts involved.
>>>
>>> If you have fault tolerance, why does it have to be in the IPC mechanism?
>>> In fact, it would probably be a Really Bad Idea to try that.
>>
>> Although it may be temporary ignorance on my part that suggested such a
>> thing, I was thinking that it might be simpler to do it this way because
>> every client request will be associated with a financial transaction. Each
>> financial transaction depends upon its corresponding client request and
>> each client request depends upon its corresponding financial transaction.
>> With such mutual dependency it only seemed natural for the underlying
>> representation to be singular.
> ****
> There are several possible approaches here, with different tradeoffs.
> (a) Get PayPal acknowledgement before starting the transaction.
> (b) Because the amount is so small, extend credit, and do the PayPal
> processing "offline", out of the main processing thread; in fact, don't
> even request the PayPal debiting until the transaction has completed, and
> if it is refused, put the PayPal processing FIRST for their next request
> (thus penalizing those who have had a refused transaction); you might lose
> a dime here and there, but you have high performance for everyone except
> those who haven't cleared.
> (c) If the transaction fails, and you have already debited the account,
> have a background process credit the account for the failed transaction.

None of the above, although most like (b). They pay me in advance in at
least $1 increments and this amount is placed in their local server account
file. The real-time transaction goes against this local file.

> You are confusing IPC with robustness mechanisms. IPC is purely and simply
> a transport mechanism; anything about robustness has to be implemented
> external to the IPC.
> joe
>
>>> I guess I'm biased again, having built several fault-tolerant systems.
>>> joe
>>> ****
>>> Joseph M. Newcomer [MVP]
>>> email: newcomer(a)flounder.com
>>> Web: http://www.flounder.com
>>> MVP Tips: http://www.flounder.com/mvp_tips.htm
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm
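A minimal sketch of that prepaid-account model, folding in Joe's option (c)
of crediting back a failed transaction. The file format, the dime cost, and
the function names are illustrative assumptions rather than anything
specified in the thread; a production version would also need file locking
and atomic replacement of the balance file.

    // Prepaid local account: clients pay in advance in >= $1 increments,
    // and each real-time request debits 10 cents from a per-client file.
    #include <cstdio>

    // Read the balance (in cents) from the client's account file;
    // returns -1 if the account does not exist or is unreadable.
    long ReadBalanceCents(const char* path)
    {
        long cents = -1;
        if (FILE* f = std::fopen(path, "r")) {
            if (std::fscanf(f, "%ld", &cents) != 1)
                cents = -1;
            std::fclose(f);
        }
        return cents;
    }

    bool WriteBalanceCents(const char* path, long cents)
    {
        FILE* f = std::fopen(path, "w");
        if (!f) return false;
        std::fprintf(f, "%ld\n", cents);
        return std::fclose(f) == 0;
    }

    // Debit one request; refuse it if the prepaid balance cannot cover it.
    bool DebitRequest(const char* path, long costCents = 10)
    {
        long balance = ReadBalanceCents(path);
        if (balance < costCents) return false;
        return WriteBalanceCents(path, balance - costCents);
    }

    // Joe's option (c): a background task credits the dime back
    // for any request that later fails.
    bool CreditFailedRequest(const char* path, long costCents = 10)
    {
        long balance = ReadBalanceCents(path);
        if (balance < 0) return false;
        return WriteBalanceCents(path, balance + costCents);
    }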
From: Joseph M. Newcomer on 29 Mar 2010 22:38

See below...
On Mon, 29 Mar 2010 16:35:54 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:
>
> It is not that I am stupid, it is that for some reason (I am thinking
> intentionally) you fail to get what I am saying. I needed something that
> PREVENTS page faults; MMF does not do that, VirtualLock() does.
****
I never promised that memory-mapped files would give you ZERO page faults; I
only pointed out that they can reduce the total number of page faults, and
distribute the cost of them differently than your simplistic model that
takes several thousand page faults to load the data. And I said that in any
real-world experiment, it is essential to gather the data to show the exact
impact of the architecture.
****
>
> A thing can ONLY be said to be a thing that PREVENTS page faults if that
> thing makes page faults impossible to occur by whatever means.
****
That's assuming you establish that "zero page faults" is essential to
meeting your high-level requirement. You have only said that if you have
tens of thousands of page faults, you cannot meet that requirement, and if
you have zero, you have no problem. You have not established whether the
limit of page faults is zero, one hundred, two hundred and thirty-seven, or
three thousand. All you know is that your initial startup is massively slow
and you attribute this to the large number of page faults you see. You may
be correct. But without detailed analysis, you have no basis for making the
correlation.
****
>
> It is like I am saying I need some medicine to save my life, and you are
> saying here is Billy Bob, he is exactly what you need because Billy Bob
> does not kill people.
*****
No, but if you want "medicine to save your life", do you take 5mg once a day
or 100mg ten times a day? We are not talking alternatives here, but dosages.
And do you take it with some sort of buffering drug to suppress side
effects, or take it straight (try sitting with a friend undergoing chemo,
for three hours each trip, and drive him home, and you will appreciate what
buffering means)? You have only established two extreme endpoints without
trying to understand what is really going on.

Do you have detailed performance measurements of the internals of your
program? (If not, why not?) Or do you optimize using the
by-guess-and-by-golly method that people love to use (remember my basic
principle, derived over 15 years of performance measurement: "Ask a
programmer where the performance bottleneck is and you will get a wrong
answer"? That principle NEVER failed me in 15 years of doing performance
optimization). You actually DON'T have any detailed performance numbers;
only some guesses, where you have established two samples and done nothing
to understand the values between the endpoints! This isn't science; this is
as scientific as tossing darts over your head at the listing behind you and
optimizing whatever subroutine the dart lands on.
****
>
> Everyone else here (whether they admit it or not) can also see your
> communication errors. I don't see how anyone as obviously profoundly
> brilliant as you could be making this degree of communication error other
> than intentionally.
****
When I tell you "you are using the language incorrectly" and explain what is
going on, and give you citations to the downloadable Intel manual, I expect
that you will stop using the language incorrectly, and not continue to
insist that your incorrect usage is the correct usage.
You foolishly think that "virtual memory" necessarily means "paging
activity", and in spite of several attempts by Hector and me to explain why
you are wrong, you still insist on using "virtual memory" in an incorrect
fashion. Where is the communication failure here? Not on my side, not on
Hector's side (he pointed you to the Russinovich article). And when you come
back, days later, and STILL insist that "virtual memory" == "paging
activity", it is really hard to believe we are talking to an intelligent
human being.

And you still don't have any data to prove that paging is your bottleneck,
or to what degree it is a problem. Instead, you fall back on somebody's
four-color marketing brochure and equate "meeting a realtime window" (and a
HUGE one) with "absolute determinism", which sounds more like a
philosophical principle, and insist that without absolute determinism you
cannot meet a realtime window, which I tried to explain is nonsense. Paging
is only ONE possible factor in performance, and you have not even
demonstrated that it matters (you did demonstrate that running two massive
processes on a single core slows things down, which surprises no one).
****
>
> Notice that I am not resorting to ad hominem attacks.
>
****
I've given up trying to be polite. It didn't work. If I explain something
ONCE and you insist on coming back and saying I'm wrong, and persist in
using technical language incorrectly, try to justify your decisions by
citing scientifically unsupportable evidence, and tell us we don't know what
we're talking about when you have expended zero effort to read what we've
told you, you are not behaving rationally.

Learn how to do science. Learn what "valid experiment" means. Learn that
"engineering" means, quite often, deriving your information by performing
valid experiments, not thinking that real systems are perfect reflections of
oversimplified models described in textbooks, and that you can infer
behavior by just "thinking" about how these systems work. This ignores ALL
good principles of engineering, particularly of software engineering: build
it, measure it, improve it. And by MEASURE I mean "RUN VALID EXPERIMENTS!"
You have run two that do not give any guidance for optimization; they just
prove that certain extreme points work or don't work.

Guy L. Steele, Jr., decided that he needed to produce a theoretical upper
bound on sorting (we know the theoretical lower bound is O(n log n)). He
invented "bogo-sort", which is essentially "52-pickup". What you do is
randomly exchange elements of the array, then look at it and see if it is in
order. If it is in order, you are done; otherwise, try the random
rearrangement again until the vector is in sorted order. So you have done
the equivalent of running qsort (n log n) and bogo-sort, and this tells you
nothing about how bubble sort is going to perform.

You ran an experiment that overloaded your machine, and one which had zero
page faults, and from this you infer that ANY paging activity is
unacceptable. This is poor science, and anyone who understands science KNOWS
it is poor science. Until you have determined where the "performance knee"
is, you have NO data, nor do you know where your problems are, nor do you
know where to optimize. So you take a simplified model, run a single test,
and you have STILL not derived anything useful; in fact, your current model
is subject to priority inversion and does not guarantee maximum throughput,
even in the ABSENCE of page faults.
For those of us who spent years optimizing code, this is obvious, and I've
tried to tell you that your data is bad; instead of listening, you insist
that your two extreme points are the only points you need to understand what
is going on. Not true, not true at all.
joe
****
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
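For reference, here is roughly what "something that PREVENTS page faults"
looks like on Win32: grow the working-set quota, then lock the data region
so the pager cannot evict it. A sketch only; the slack sizes are arbitrary
assumptions, and VirtualLock guarantees residency only while the process has
running threads.

    // Pin a data region so that touching it cannot page-fault
    // (while the process is running).  Win32 sketch.
    #include <windows.h>
    #include <cstdio>

    bool PinData(void* data, SIZE_T bytes)
    {
        // VirtualLock fails unless the working-set minimum can hold the
        // locked region, so raise the quota first (the added slack sizes
        // are illustrative values, not tuned numbers).
        if (!SetProcessWorkingSetSize(GetCurrentProcess(),
                                      bytes + (16u << 20),   // new minimum
                                      bytes + (64u << 20)))  // new maximum
        {
            std::fprintf(stderr, "SetProcessWorkingSetSize: %lu\n", GetLastError());
            return false;
        }
        if (!VirtualLock(data, bytes))
        {
            std::fprintf(stderr, "VirtualLock: %lu\n", GetLastError());
            return false;
        }
        return true;  // the pages of [data, data+bytes) are now resident
    }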
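Steele's bogo-sort, as described above, is short enough to show in full; a
C++ sketch, useful only as the pathological endpoint of an experiment, which
is exactly the role Joe assigns it.

    #include <algorithm>
    #include <random>
    #include <vector>

    // "52-pickup": randomly rearrange, check for order, repeat.
    // Expected running time is O(n * n!), the theoretical upper bound.
    void bogo_sort(std::vector<int>& v)
    {
        std::mt19937 gen(std::random_device{}());
        while (!std::is_sorted(v.begin(), v.end()))
            std::shuffle(v.begin(), v.end(), gen);
    }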
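And the measurement being demanded here is cheap to get. A Win32 sketch that
brackets the suspect startup phase with the process's own page-fault
counter; LoadOcrData is a hypothetical stand-in for the real work, and note
that PageFaultCount lumps soft and hard faults together, so hard faults
would still need to be separated out with a profiler.

    #include <windows.h>
    #include <psapi.h>   // GetProcessMemoryInfo; link with psapi.lib
    #include <cstdio>

    static DWORD PageFaultsSoFar()
    {
        PROCESS_MEMORY_COUNTERS pmc = { sizeof(pmc) };
        GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc));
        return pmc.PageFaultCount;   // soft + hard faults since start
    }

    void LoadOcrData() { /* hypothetical stand-in for the startup work */ }

    int main()
    {
        DWORD before = PageFaultsSoFar();
        LoadOcrData();
        DWORD after = PageFaultsSoFar();
        std::printf("page faults during load: %lu\n", after - before);
        return 0;
    }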
From: Joseph M. Newcomer on 29 Mar 2010 22:53

See below...
On Mon, 29 Mar 2010 12:12:13 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:
>
> "Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
> news:mcl1r597pthv9priqa6vla6np19l6p0ic1(a)4ax.com...
>> See below...
>> On Mon, 29 Mar 2010 09:57:59 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:
>>>
>>> "Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
>>> news:oesvq55safqsrg8jih8peiaah4uiqt0qi3(a)4ax.com...
>>>> Well, I know the answer, and I think you are behaving in yet another
>>>> clueless fashion. And in my earlier reply I told you why. You want
>>>> "fault tolerance" without even understanding what that means, and
>>>> choosing an implementation whose fundamental approach to fault
>>>
>>> The only fault tolerance that I want or need can be provided very
>>> simply. The original specification of fault tolerance that I provided
>>> was much more fault tolerance than would be cost-effective. If I really
>>> still wanted this level of fault tolerance then many of your comments on
>>> this subject would not be moot. Since this degree of fault tolerance has
>>> been determined to never be cost-effective, any details of providing
>>> this level of fault tolerance become moot.
>>>
>>> The other Pete had greater insights into my own needs than I did myself.
>>> I will paraphrase what he said. I only need to avoid losing
>>> transactions. When a client makes a request, I only need to avoid losing
>>> this request until it is completed. Any faults in-between can be
>>> restarted from the beginning.
>> ****
>> Your total cluelessness about TCP/IP comes to the fore again. Suppose you
>> have established a connection to the machine. The machine reboots. What
>> happened to that connection? Well, IT NO LONGER EXISTS! So you can't
>> reply over it! Even if you have retained the information about the data
>> to be processed, YOU HAVE NO WAY TO COMMUNICATE TO THE CLIENT MACHINE!
>
> False assumption. A correct statement would be: I have no way to
> communicate with the client that you are aware of (see below).
****
Email is not the same thing as getting the result back from the server. And
users will not expect to get email if they get a "connection broken" message
unless you tell them, and this requires a timeout a LOT larger than your
500ms magic number.
****
>
>> In what fantasy world does the psychic plane allow you to magically
>> re-establish communication with the client machine?
>
> That one is easy. All users of my system must provide a verifiably valid
> email address. If at any point after the client request is fully received
> the connection is lost, the output is sent to the email address.
****
Which violates the 500ms rule by several orders of magnitude. I'm curious
how you get a "verifiably valid" email address. You might get AN email
address, but "verifiably valid" is a LOT more challenging. There are some
hacks that increase the probability that the email address is valid, but
none which meet the "verifiably valid" criterion.
***
>
>> And don't tell me you can use the IP address to re-establish
>> connectivity. If you don't understand how NAT works, both at the local
>> level and at the ISP level, you cannot tell me that retaining the IP
>> address can work, because I would immediately know you were wrong.
>> ****
>>>
>>> The key (not all the details, just the essential basis for making it
>>> work) to providing this level of fault tolerance is to have the
>>> webserver only acknowledge web requests after the web requests have been
>>> committed to persistent storage.
>> ****
>> Your spec of dealing with someone pulling the plug, as I explained, is a
>> pointless concern.
>
> And I have already said this preliminary spec has been rewritten.
****
So what is it? How can we give any advice on how to meet a spec when we
don't even know what it is any longer?
****
>
>> So why are you worrying about something that has a large negative
>> exponent in its probability (10**-n for n something between 6 and 15)?
>> There are higher-probability events you MIGHT want to worry about.
>> ****
>>>
>>> The only remaining essential element (not every little detail, just the
>>> essence) is providing a way to keep track of web requests to make sure
>>> that they make it to completed status in a reasonable amount of time. A
>>> timeout threshold and a generated exception report can provide feedback
>>> here.
>> ****
>> But if you have a client timeout, the client can resubmit the request, so
>> there is no need to retain it on the server. So why are you desirous of
>> expending effort to deal with an unlikely event? And implementing complex
>> mechanisms to solve problems that do not require
>
> Every request costs a dime. If the client re-submits the same request it
> costs another dime. Once a request is explicitly acknowledged as received,
> the acknowledgement response will also inform them that resubmitting will
> incur an additional charge.
****
Oh, I get it: "we couldn't deliver, but we are going to charge you anyway".
Not a good business model. You have to make sure that email was received
before you charge. Not easy. We got a lot of flak at the banking system when
we truncated instead of rounding, which the old system did, and people would
complain that they only got $137.07 in interest when they expected to get
$137.08. And you would not BELIEVE the flak we got when we had to implement
new Federal tax laws on paychecks, and there were additional "deductions"
(the pay was increased by $0.50/payroll to cover the $0.50 additional charge
the government required, but again the roundoff meant we were getting
complaints from people who got $602.37 under the new system when under the
old hand-written checks they got $602.38). So you had better be prepared,
under failure scenarios, to PROVE you delivered the result they paid for,
even for $0.10, because SOMEBODY is going to be tracking it! It will be LOTS
of fun!
*****
>
>> solution on the server side? And at no point did you talk about how you
>> do the PayPal credit, and if you are concerned with ANY robustness,
>> THAT's the place you have to worry about it!
>>
>> And how do PayPal and committed transactions sit with your magical 500ms
>> limit and the "no paging, no disk access, ever" requirements?
>> ****
>>>
>>> Please make any responses to the above statement within the context of
>>> the newly defined, much narrower scope of fault tolerance.
>> ****
>> If by "fault tolerance" you mean "recovering from pulling the plug from
>> the wall", my
>
> No, not anymore. Now that I have had some time to think about fault
> tolerance (for the first time in my life) it becomes obvious that this
> will not be the benchmark, except for the initial request / request
> acknowledgement part of the process.
***
So what IS your requirements document?
SHOW US!
****
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
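On the "verifiably valid" point: the usual hack Joe alludes to is a
confirmation round-trip; mail the address a one-time token and treat the
address as confirmed only when the token comes back. A sketch of the token
half only; the mailing and storage layers are assumed to exist elsewhere,
and a real system would want a cryptographic RNG rather than
std::random_device.

    #include <cstdio>
    #include <random>
    #include <string>

    // 128 bits of randomness, hex-encoded, to embed in a confirmation
    // link mailed to the claimed address.
    std::string MakeVerificationToken()
    {
        std::mt19937_64 gen(std::random_device{}());
        char buf[33];
        std::snprintf(buf, sizeof(buf), "%016llx%016llx",
                      (unsigned long long)gen(), (unsigned long long)gen());
        return buf;
    }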
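The commit-before-acknowledge rule Peter describes is itself only a few
lines on a POSIX target: append the request to a journal and fsync it, and
send the TCP acknowledgement only on success. The names are illustrative; a
production version would also handle partial writes and journal rotation.

    #include <cstddef>
    #include <fcntl.h>
    #include <unistd.h>

    // Returns true only once the request bytes are on disk; the caller
    // acknowledges the client only on a true return.
    bool PersistRequest(const char* journalPath, const char* req, size_t len)
    {
        int fd = open(journalPath, O_WRONLY | O_CREAT | O_APPEND, 0600);
        if (fd < 0) return false;
        bool ok = write(fd, req, len) == (ssize_t)len
               && fsync(fd) == 0;   // force it through the OS cache
        close(fd);
        return ok;
    }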
From: Peter Olcott on 29 Mar 2010 22:58

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
news:hrs1r5l7j283dhcha8oa9n96u1fag61jdu(a)4ax.com...
> See below...
> On Mon, 29 Mar 2010 09:27:15 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:
>
>> I know the difference between threads and processes, and see no reason
>> why threads would not work if processes do work, the converse not
>> necessarily being true.
> ****
> Memory interference patterns will be completely different in a
> multithreaded single process.
> ****
>>
>> What difference is there between threads and processes that is the basis
>> for even the possibility that threads may not work whereas processes do
>> work?
> ****
> It is sad you have to ask this question. See above comment. There are
> substantial differences in how the memory system is handled in
> inter-process context switching versus intra-process context switching.
> And the consequences are quite different. So MAYBE you will get comparable
> performance, MAYBE it will be a lot better, and MAYBE it will not be as
> good. I have no idea. But *I* would have run the experiment so I would
> KNOW, and not just be guessing based on the wishful-thinking approach
> based on one irrelevant experiment.
> ****

Basically faster access because of less overhead, or the same access because
of the same overhead. Yeah, so basically you are saying that you are not
sure and I should test it. Of course I will when the time comes. If the time
ever comes where I really need to upgrade to more than a single-core server,
at my dime-a-transaction prices I could hire a team to solve this problem
for me.

>> Please do not cite a laundry list of the differences between threads and
>> processes; please cite at least one difference between threads and
>> processes along with reasoning to explain why threads might not work
>> whereas processes work correctly.
> ****
> See above.
> ****
>>
>> Here are two crucial assumptions behind why I think that threads must
>> work if processes do work:
>> (1) Threads can share memory with each other with most likely less
>> overhead than processes.
> ****
> And higher potential interference. You don't really know.

I would think that the issue would be cache or bus contention, and I see no
specific differences in the way that threads access memory and the way that
processes access memory that could account for differences between these two
types of contention.

> ****
>> (2) Threads can be scheduled on multiple processor cores, just like
>> processes.
> ****
> See previous comment about memory interference patterns. The point is, I
> DON'T KNOW, but I don't believe in extrapolating from unrelated
> experiments. GET THE &%$#ING DATA!
> ****

It sure is close enough for now, and of course I would test before rolling
out another production server.

>>
>>> You are going to have to come up with a better proposal than one that
>>> uses the words "some sort" in it.
>>> joe
>>
>> All that I was saying is that my mind is still open to alternatives other
>> than the ones that I originally suggested.
> ****
> It sounded to me like you have not been open to ANY alternatives we have
> suggested, and you don't write specifications using phrases like "some
> sort" in them. You write VERY specific requirements, and from those
> requirements, generate VERY specific implementation

That is not the way that good architecture is developed. Back in the early
days of structured systems analysis they called this getting prematurely
physical.
If you get too specific too early you lose most any chance of a nearly
optimal design; you have committed the garden-path error.

> strategies. Then, those strategies are reviewed and might be rejected or
> accepted, based on both technical feasibility and whether or not they
> satisfy the requirements. And you have demonstrated that technical
> feasibility is not one of your strongest points.

Some of these things are brand new to me, and this is the very first time
that I have even thought about them. Categorically exhaustive reasoning can
derive a near-optimal solution to most any problem, but this takes time.
Keeping the focus on categories of ideas instead of detailed specifics is a
much more efficient way to interpolate toward the near-optimal solution.
These categories are gradually made increasingly more specific.

>
> First, though, you need a very precise requirement, one which incorporates
> everything that is essential and eliminates anything that is now "moot",
> so we have something to work with that is actually self-consistent, and
> not morphing on every restatement. Then a detailed implementation
> specification which tells exactly how you plan to meet that requirement.
> joe
> ****

See what I just said about categorically exhaustive reasoning. This same
reasoning between you, me, and Hector narrowed down the focus to using the
HTTP protocol as the best choice for turning my OCR engine into a web
application. I don't think that a better category could have possibly been
chosen. Categorically exhaustive reasoning pares down the decision tree most
efficiently.

>>
>>> Joseph M. Newcomer [MVP]
>>> email: newcomer(a)flounder.com
>>> Web: http://www.flounder.com
>>> MVP Tips: http://www.flounder.com/mvp_tips.htm
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm
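The thread-versus-process experiment argued about above is also cheap to
run. A minimal sketch of the thread half: time the same cache-hostile kernel
across N threads, then compare against the same kernel run as N separate
processes. N, the buffer size, and the stride are illustrative assumptions.

    #include <chrono>
    #include <cstdio>
    #include <functional>
    #include <thread>
    #include <vector>

    static void TouchMemory(std::vector<unsigned char>& buf)
    {
        // Walk the buffer with a page-sized stride to defeat the cache
        // and expose paging/bus behavior rather than arithmetic speed.
        for (std::size_t stride = 0; stride < 4096; ++stride)
            for (std::size_t i = stride; i < buf.size(); i += 4096)
                buf[i]++;
    }

    int main()
    {
        const int N = 4;   // assumed to match the core count
        std::vector<std::vector<unsigned char>> bufs(
            N, std::vector<unsigned char>(256u << 20));  // 256 MB each

        auto t0 = std::chrono::steady_clock::now();
        std::vector<std::thread> threads;
        for (int i = 0; i < N; ++i)
            threads.emplace_back(TouchMemory, std::ref(bufs[i]));
        for (auto& t : threads) t.join();
        auto t1 = std::chrono::steady_clock::now();

        std::printf("%d threads: %lld ms\n", N,
            (long long)std::chrono::duration_cast<
                std::chrono::milliseconds>(t1 - t0).count());
        return 0;
    }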
From: Peter Olcott on 29 Mar 2010 23:06
"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message news:55u1r5pogli39eiiumdckpbk8fvm0jfnh3(a)4ax.com... > On Sun, 28 Mar 2010 23:23:09 -0500, "Peter Olcott" > <NoSpam(a)OCR4Screen.com> wrote: > >> >>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in >>message news:ehcvq55h4l9lrobkpabb2m31ve22nvffd4(a)4ax.com... >>>I had forgotten that! And I DID tell him about >>>VirtualLock! >>> >>> Denser than depleted uranium. >>> >>> What is amazing is his persistence in using technically >>> incorrect terminology even after >>> it has been carefully explained to him what is going on. >>> For example, a desire to >>> allocate contiguous physical memory should have been the >>> first clue that he had no idea >>> what virtual memory was. And his persistence in >>> equating >>> virtual memory (which, at its >> >>If all that other stuff that is going on under the covers >>is >>sufficiently analogous to the simpler case, then it is a >>communication error to specify these extraneous and >>essentially irrelevant details. These extra details impede >>the communication process. > *** > ANd you accuse ME of having communication problems! But > you have serious communication > problems in that once you are told you are using the > technical language incorrectly, you > PERSIST in using it incorrectly, thus impeding any form of > effective communication. > > It is NOT "analogous". You asked a VERY SPECIFIC and > carefully-worded question, and were > told that this was not possible. At which point you began > insisting that it HAD to work > as you described. These details are NOT "irrelevant" but > in fact change the nature of the > problem considerably. A fact which we keep trying to > explain to you! > *** >> >>In other words the simpler less exactly precise terms are >>more correct (within the goal of effective communication) >>than the complex precisely correct terms because the >>precisely correct terms impede the clarity and conciseness >>of communication. > **** > But the precisely correct terms described what is REALLY > going on and the slovenly use of > terms impedes the communication because you are, for > example, asking for the impossible. I am primarily an abstract thinker, I almost exclusively think in terms of abstractions. This has proven to be both the most efficient and most effective way to process complex subjects mentally. I scored extremely high on an IQ test in this specific mode of thinking, something like at least 1/1000. My overall IQ is only as high as the average MD (much lower than 1/1000). I would not be surprised if your overall IQ is higher than mine, actually I would expect it. Your mode of thinking and my mode of thinking seem to be at opposite end of the spectrum of abstract versus concrete thinking. > If you insist on asking for something that is impossible, > you aren't going to get it, no > matter how much you want to. And if you insist on the > impossible, you demonstrate that > you are clueless, particularly after it has been explained > to you that you have asked for > the impossible.If you say "I must have X" and someone says > "that is never going to happen" > then you have to accept that you are not going to get "X" > and stop insisting that you must > have "X". 
> Those of us who make our livings satisfying the needs of customers have,
> as part of our responsibility, making sure that the customer doesn't have
> unrealistic expectations, and therefore, when we fail to deliver "X", they
> will say "I wanted X, you did not give me X, and therefore I am not going
> to pay you". Been there, done that, thirty years ago, and I don't make the
> same mistake twice. Instead, I simply say "You are never going to get X
> because that is not how the system works. Now here's what you ARE going to
> get, that meets your requirements..." Which is why a coherent set of
> requirements is mandatory. (A friend of mine has an interesting approach
> to pricing. He asks for the requirements document first. If it is
> well-written, his rate is $N/hr. If it is badly written, or nonexistent,
> his rate is $1.5*N/hr. He is very upfront about this. Based on your
> morphing spec, I'd suspect he'd charge $k*N/hr, for k some multiple >= 2,
> if he had to implement something for you.)
> joe
> ****
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm