From: Joseph M. Newcomer on 10 Apr 2010 17:28

See below...
On Fri, 9 Apr 2010 20:19:46 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>message news:ialur5h9urt8rlg8h6k1i7a3lpsusojieo(a)4ax.com...
>> See below...
>> On Thu, 8 Apr 2010 19:58:33 -0500, "Peter Olcott"
>> <NoSpam(a)OCR4Screen.com> wrote:
>>
>>>> All sorts of methods, beginning with a simple straight
>>>> shared file.
>>>
>>>I am beginning with a simple shared file. The only purpose
>>>of the other IPC is to inform the process of the event that
>>>the file has been updated at file offset X, without the need
>>>for the process to poll for updates. File offset X will
>>>directly pertain to a specific process queue.
>> ****
>> Ohh, so you NO LONGER CARE about either power failure or
>> operating system crash?
>
>Hector thinks it's a good idea and you don't think that it is
>a good idea, so I can't go by credibility because although
>you have lots of experience and a PhD, Hector has more
>experience and more recent experience in this area. Because
>of this I weight your credibility equal with his, thus I
>need a tie breaker.
****
I fail to see why either one matters if you have a fully-transacted file system.
****
>
>The obvious tie breaker (the one that I always count on) is
>complete and sound reasoning. From the sound reasoning that
>you have provided, this would go against your point of view.
>I know from the SQLite design pattern how to make disk writes
>100% reliable. I also know all about transactions. By
>knowing these two things a simple binary file can be made to
>protect against power failures.
>
>I have brought up another issue several times that you have
>not yet addressed. Nothing that you have said would protect
>against an OS crash that overwrites the wonderfully
>transactional fully flushed buffers transaction database
>with garbage data.
*****
I have no idea how this could happen. What could write bad data in a file, other than a
complete and total failure in the OS? If you have this level of failure, there can be no
recovery because the OS is itself erroneous. But there are very few "crash" scenarios
that could ever make this happen, so they are not worth worrying about.

Note that if you try to implement a transacted file system yourself, the more likely cause
is an error in how you coded it.
*****
>
>>
>> A "simple shared file" raises so many red flags that I cannot begin to say that "this is
>> going to be a complete disaster if there is any failure of the app or the operating
>> system"
>
>With fully flushed buffers and a journal file this cannot
>be a problem because that is all that can be done.
****
Sorry, that is not a "simple shared file". That is a file with journal logging and
transaction support, which is really far from anything I could characterize with the
adjective "simple".
*****
>
>>
>> But hey, if it gives the illusion of working under ideal conditions, it MUST be robust
>> and reliable under all known failure modes, right?
>
>Overwriting the file or database or whatever with garbage
>data because of an OS crash continues to not be addressed.
*****
That is because this never happens in practice. I have not seen any situation in which a
file contained garbage in any modern operating system. Only in FAT file systems and under
MS-DOS could this have been an issue.

OTOH, I have, in some systems that failed to understand how to do transacted files, had
files completely disappear on me. This does not happen in NTFS (you will have some
intact, albeit not most recent, version of the file). I do not know how linux handles
this.
*****
>
>> Well, if you are comparing apples and chocolate cupcakes, they are pretty much the same.
>> Any comparison of linux "named pipes" to Windows "named pipes" has to take into
>> consideration that they are TOTALLY DIFFERENT mechanisms. Shall I tell my set of Unix
>> security jokes, or just say "Unix security", which is a joke all by itself? So I tend to
>> not find ANY comparisons valid. They are two completely different systems, which look
>> alike only if you stand back a few hundred feet and squint. (Windows has a file system;
>> linux has a file system; MS-DOS had a file system. They are identical only insofar as
>> they allow a program to name sequences of bytes stored on a disk. But I've NEVER lost a
>> file in a Windows crash, and it was common to lose a file, and EVERY TRACE of the file,
>> on a Unix crash, to the point where I always kept a separate directory of files in the
>> hopes it would survive the crash. I lost far too many hours due to the unreliability of
>> the Unix "file system" (if one can dignify anything so unreliable with that name). But
>> since you know that the file system is utterly reliable, good luck.
>> joe
>
>Mission critical apps at AFWA trust Unix. Windows always
>takes ten times as long to learn anything because they had a
>team of a dozen experts assigned to making the design as
>convoluted as possible. I need to add Unix to my skill set
>to remain employable.
*****
I deleted Unix from my skill set years ago. Given a choice between Unix work and no work,
I would choose unemployment. I never want to see that OS again, in any guise.
****
>
>>>It's looking more like four processes with one having much
>>>more priority than the others, each reading from one of four
>>>FIFO queues.
>>>(1) Paying customer small job (one page of data) This is the 10 ms job
>>>(2) Paying customer large job (more than one page of data)
>>>(3) Building a new recognizer
>>>(4) Free trial customer
>> ****
>> As I pointed out earlier, mucking around with thread priorities is very, very dangerous
>
>You did not even pay attention to what I just said. The
>Unix/Linux people said the same thing about thread
>priorities, that is why I switched to independent processes
>that have zero dependency upon each other.
****
Putting threads in separate processes does not change the problem! And I fail to see how
threads in the same process would, for this kind of processing, EVER have a "dependency
upon each other", or, alternatively, how moving the threads to separate processes solves
the fundamental thread scheduler issues that arise. The only thing that changes in the
thread scheduler is that it now has to load the GDT or LDT with the correct pointers to
the page tables of the process that is being switched to.

For some reason, you seem to think that "processes" somehow magically possess properties
that "threads" do not. Well, sadly, threads are threads and they are the only schedulable
entity. Which processes they live in is largely irrelevant. You have completely failed
to understand how schedulers work. Thread priorities introduce the same problems no
matter which processes they are running in. I'm surprised you missed something this
obvious.
****
>
>> and should NOT be used as a method to handle load balancing. I would use a different
>> approach, such as a single queue kept in sorted order, and because the free trial jobs
>> are small (rejecting any larger jobs) there should not be a problem with priority
>> inversion.
>
>Four processes, not threads. Since there is zero dependency
>upon each other there is no chance of priority inversion.
****
Four threads, each running in a private process, do not look substantially different from
four threads, all running in one process, at least as far as the scheduler is concerned.
Any issue dealing with thread priority issues deals SOLELY with the concept of threads;
packaging them into various kinds of processes DOES NOT AFFECT THIS!

So you have missed the obvious, because you love buzzword-fixation; if someone refers to
thread priorities, you seem to think these work differently than "processes", for reasons
that are not only not apparent to anyone else, but actually are completely wrong-headed.

You lose. Please stop pretending you understand what you are talking about while accusing
the rest of us of failing to understand your nonsensical statements because we disagree
with them. Or ignore them, which is pretty much what they deserve.
joe
****
>
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
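[Editor's note] What the two sides are actually arguing about here, the durable append, is small enough to show in code. Below is a minimal sketch assuming Linux/POSIX and a single append-only journal; the record layout, paths, and names are illustrative assumptions, not either poster's actual design. The caveat raised later in the thread, that the drive's own write cache can defeat fsync(), applies to this sketch as well.

```cpp
// Minimal sketch of a durable append to a journal file, in the spirit of the
// "SQLite design pattern" discussed above. The checksum exists so a torn
// (partial) record can be recognized and discarded during recovery.
#include <cerrno>
#include <cstdint>
#include <string>
#include <fcntl.h>
#include <unistd.h>

static bool WriteAll(int fd, const void* buf, size_t n) {
    const char* p = static_cast<const char*>(buf);
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w <= 0) { if (w < 0 && errno == EINTR) continue; return false; }
        p += w;
        n -= static_cast<size_t>(w);
    }
    return true;
}

static uint32_t Checksum(const std::string& s) {
    uint32_t sum = 0;
    for (unsigned char c : s) sum = sum * 31 + c;
    return sum;
}

// Returns only after the record and the file's new length have been pushed
// through the OS cache with fsync(). fsync() of the containing directory
// matters mainly when the journal file was just created or renamed.
bool AppendDurably(int journal_fd, int dir_fd, const std::string& payload) {
    uint32_t len = static_cast<uint32_t>(payload.size());
    uint32_t sum = Checksum(payload);
    if (!WriteAll(journal_fd, &len, sizeof len)) return false;
    if (!WriteAll(journal_fd, &sum, sizeof sum)) return false;
    if (!WriteAll(journal_fd, payload.data(), payload.size())) return false;
    if (fsync(journal_fd) != 0) return false;
    if (dir_fd >= 0 && fsync(dir_fd) != 0) return false;
    return true;  // only now is it safe to tell the client "I heard you"
}

int main() {
    int dir_fd = open("/var/ocr", O_RDONLY);                                   // hypothetical directory
    int fd = open("/var/ocr/journal.dat", O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd < 0 || dir_fd < 0) return 1;
    bool ok = AppendDurably(fd, dir_fd, "job=1234 state=RECEIVED");            // illustrative record
    close(fd);
    close(dir_fd);
    return ok ? 0 : 1;
}
```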
From: Joseph M. Newcomer on 10 Apr 2010 18:26

See below...
On Fri, 9 Apr 2010 21:03:43 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>message news:a3our5t87p15br54emp6mnuo2eg3pudcb8(a)4ax.com...
>> On Thu, 8 Apr 2010 21:10:25 -0500, "Peter Olcott"
>> <NoSpam(a)OCR4Screen.com> wrote:
>>
>>>
>>>Yes that is it. I don't even acknowledge receipt of the
>>>request until it is committed to the transaction log.
>>>Anything at all that prevents this write also prevents the
>>>acknowledgement of receipt. So basically I never say "I
>>>heard you" until the point where nothing can prevent
>>>completing the transaction.
>> ****
>> OK, this is a good specification. I'm not sure how the current proposal, which doesn't
>> have anything resembling a reliable log, accomplishes it.
>> ****
>
>OK so I have to specify every single minuscule little detail
>step by step, item by item, to let you know that I know how to
>make a reliable log file? I will simply use what I have
>referred to as the SQLite design pattern.
****
No, but when you get down to the implementation, you had better know how to do all of this
in absolutely the correct sequence. Of course, I've done this, so I have some idea of how
hard it is even when you have complete control of the disk (as you point out below, some
of the issues may be outside your control, and thus make it impossible to get right). But
I don't believe you can take a high-level description of SQLite and magically re-implement
it to the level of detail required to make it work perfectly. I think you are being
terribly optimistic here; I had to work carefully to make ISAM overflow records work
right, and I've watched others build fully-transacted systems and I know that the
simplistic model is overly simplified.
****
>
>It seems that the most difficult aspect of this is to make
>sure that each and every kind of buffer is completely
>flushed to the disk platters. The difficult part of this is
>actually turning off the drive's write cache, when there may
>not be any software (hard disk driver) that actually does
>this, and making sure the fsync() is not broken as it often
>is, and making sure that fsync() is applied everywhere that
>it needs to be applied, which is at least the file and the
>directory.
****
These are all serious and important problems which must be solved or understood before
undertaking any project that depends on them.
****
>
>If I can make sure of these things and follow the SQLite
>journaling design pattern then I can make reliable
>transactions. If fsync() is broken and/or the drive's write
>cache can't be turned off, then all of the great safe
>transaction advice that you have provided becomes moot. If
>you can't flush the buffers then safe transactions can't be
>made.
****
Yep, that's a potential problem.
****
>
>
>>>Then in the event that I do not receive the HTTP
>>>acknowledgement of final receipt of the output data, I roll
>>>the whole transaction back. If the reason for the failure is
>>>anything at all on my end I roll the charges back, but let
>>>the customer keep the output data for free. If the
>>>connection was lost, then this data is waiting for them the
>>>next time they log in.
>> ****
>> And you guarantee this exactly HOW? Oh yes, with the transacted database (I thought this
>
>No, there is more to it than that. I have to have explicit
>acknowledgement from the client.
***** This is just one more workflow state; and I've seen nothing describing how you recover a partial workflow state from the records you are maintaining, or even an acknowledgement that this is a difficult problem because of the large number of states. joe **** Joseph M. Newcomer [MVP] email: newcomer(a)flounder.com Web: http://www.flounder.com MVP Tips: http://www.flounder.com/mvp_tips.htm
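[Editor's note] The "large number of states" point can be made concrete. The sketch below assumes the journal from the earlier sketch records one state transition per job per line, and shows how a restart scans it to decide what each in-flight job needs next; the state names and record format are invented for illustration only.

```cpp
// Minimal sketch of crash recovery by replaying a state-transition journal.
// The states and the "job_id STATE" record format are illustrative assumptions.
#include <cstdint>
#include <iostream>
#include <map>
#include <sstream>
#include <string>

enum class JobState { Received, Queued, Processing, ResultReady, Delivered, Charged };

JobState ParseState(const std::string& s) {
    if (s == "RECEIVED")     return JobState::Received;
    if (s == "QUEUED")       return JobState::Queued;
    if (s == "PROCESSING")   return JobState::Processing;
    if (s == "RESULT_READY") return JobState::ResultReady;
    if (s == "DELIVERED")    return JobState::Delivered;
    return JobState::Charged;
}

// After a crash, the last transition recorded for each job tells the restart
// code what to do: re-queue anything not yet RESULT_READY, re-send anything not
// yet DELIVERED, and only bill jobs that actually reached DELIVERED.
std::map<uint64_t, JobState> Recover(std::istream& journal) {
    std::map<uint64_t, JobState> jobs;
    std::string line;
    while (std::getline(journal, line)) {
        std::istringstream rec(line);
        uint64_t id;
        std::string state;
        if (rec >> id >> state) jobs[id] = ParseState(state);
    }
    return jobs;
}

int main() {
    std::istringstream fake("1001 RECEIVED\n1001 QUEUED\n1002 RECEIVED\n1001 PROCESSING\n");
    for (const auto& [id, st] : Recover(fake))
        std::cout << "job " << id << " restarts from state " << static_cast<int>(st) << "\n";
    return 0;
}
```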
From: Joseph M. Newcomer on 10 Apr 2010 18:58

See below...
On Fri, 9 Apr 2010 21:28:05 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>message news:leour5hk0g6agvg2bq0ubgcve3riso0p9q(a)4ax.com...
>> See below...
>> On Thu, 8 Apr 2010 20:40:34 -0500, "Peter Olcott"
>> <NoSpam(a)OCR4Screen.com> wrote:
>>
>>>Solution is new vendor where disk caching can be turned off.
>>>Vendor says that disk caching can be turned off. Vendor rep
>>>may be guessing.
>> ****
>> Note also that the existence of an ATAPI command to invoke an action does not guarantee
>> the existence of an API that will send that ATAPI command. So you need a guarantee that
>> the OS and/or the DBMS can actually activate this feature!
>>
>> We discovered that even though our system wanted to take advantage of certain features
>> of a particular vendor's disk drive, we could not invoke them (for example, the SCSI
>> pass-through was broken in the SCSI device driver!). So EVERY component of the system,
>> from the application through the OS through the low-level disk drivers through the
>> hardware on the disk drive must support the ability to invoke some state in the
>> hardware.
>
>Yes. I think that the drives might be SCSI, I could not even
>verify this yet.
****
Note that this requires that you determine the vendor ID of the drive, the technology
type, and deduce from these the command sequence required to turn off caching. This means
you have a table of every vendor, every drive model, and every associated command string.
Of course, you could let the low-level drivers handle this, providing they have an API to
invoke this action.
*****
>
>>>But required OS reboots are, right? Still need all writes to
>>>go straight to the platters.
>>>
>> ****
>> Note that when the OS reboots, the reboot procedure has, as one of its effects, the
>> completion of pending writes to the disk. When you say "required reboot", of course, you
>> are referring to the kinds of reboots that happen after updates to software, or any other
>
>Nope. That part is easy to deal with. The one that I am
>talking about is like the MS Windows blue-screen-of-death
>system hang.
****
That is not a "required reboot", that is a "system crash", and a transacted database
system is impervious to these if it has been implemented properly. There is no way in the
world you can refer sensibly to a BSOD as being a "system hang", it is a "fatal system
termination". And it is not a "required reboot", it is a "crash". And all my systems are
configured to auto-restart, so there is no human intervention; if the system crashes when
I am sound asleep, and all the services restart, they are responsible for determining
their workflow state for every in-flight task and properly restarting it. And this is
what you have failed to demonstrate that you have taken into consideration!
joe
****
>
>
>>>Good reason for hot swappable RAID, then.
>> ****
>> I presume you mean RAID-5. And that you will maintain a set of spare hard drives in
>> their carriers for this contingency (I do)
>
>Nope. RAID-5 costs $550, RAID-1 costs $75.
****
Hmm...I presume you are pricing this by cost of media. So if I need three drives to
support RAID-5, that is $150, not $550. And if you think a RAID-5 controller card is
required, note that "software RAID" is a VERY old technology!
*****
>
>>>Make sure the flush to disk then.
>> ****
>> The OS just crashed. Or your unlikely power-failure scenario just happened. So the
>> files are flushed to disk exactly HOW?
>
>Simple: every single write is always completely flushed to
>disk immediately, and the SQLite design pattern (journaling)
>involves keeping an explicit audit trail of every single disk
>write. Now a crash only loses the current pending
>transaction, and the tiny bit of garbage data can be
>deleted.
*****
Yes, but while you keep accusing me of not paying attention to what you are saying, you
seemed to imply that the disk flush was sufficient protection. Note that when you are
appending to a file, you have to make sure the file length in the directory is also
updated. I presume you have covered that contingency in your design? Note that it is an
implied part of the "pattern" you think is complete.
*****
>
>>>I already solved this issue with my very early design. That
>>>is the purpose of my persistent disk file based FIFO queue.
>>>As soon as it gets to this file, then even a server crash
>>>will not prevent the job from getting completed correctly.
>>>We may lose the way to send it back to the user's screen (it
>>>will still be in their account when they log in) but we did
>>>not lose the actual transaction even if the server crashes.
>> ****
>> As I recall, because SQLite could not have record numbers and do a seek, you abandoned
>> the persistent disk-based FIFO queue in favor of named pipes (which have ZERO robustness
>> under most failure scenarios, but can add infinite amounts of kernel memory so they can
>> keep growing, so they have some advantages; run out of memory, just create a pipe and
>> write to it until you get more memory added by magic)
>
>I have stated my design so many times and you still don't
>even know what I said?
****
Because every time I see it, there is some new requirement added that is inconsistent with
previously-stated "non-negotiable" requirements; e.g., there must be no page faults and we
must use a transacted database.
*****
>A single transaction log file is the FIFO queue, and the
>named pipes merely notify the processes of the offset within
>this file of the relevant change.
*****
Huh? I don't even follow this logic. And you are back to talking about an implementation
when we don't even know what the requirements are!
*****
>
>> In fact, the last I saw, you had FOUR of these queues, all badly designed, all working
>> with processes that were using the worst possible approach to handling prioritization,
>> minimizing concurrency, maximizing response time. Not clear that this is forward
>> progress.
>
>Yet you have yet to explain this critique in terms of
>reasoning. Much of the reasoning that you do provide is a
>critique of your misconception of my design rather than the
>design itself. I explained using reasoning why your idea of
>a single priority queue would not work. You don't explain
>yourself nearly enough.
*****
Since there is no one place I have seen the design, and have to assemble it from the
bits-and-pieces we see, which morph every few messages, it is really hard to say what the
design is.

But anyone who has studied queueing models knows that there are many complex tradeoffs,
and multiqueue multiserver models are subject to some serious problems. A simple
simulation would demonstrate this. Why do you think banks moved to the single-queue
model? It maximizes concurrency, minimizes average response time to any individual
request, and is easier to manage. This is reality. Try studying it, instead of using the
"think" system to come up with ideas that everyone else abandoned in the 1970s.

Do you know why multiprocessor systems use a single priority-ordered queue model? Because
having a queue-per-CPU doesn't work! I am not going to offer a course in basic queueing
theory or elementary realtime scheduling here. I suggest doing some reading.
joe
*****
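[Editor's note] For contrast with the four-queue plan, here is a hedged sketch of the single priority-ordered queue being advocated, with the simple anti-priority-inversion rule mentioned further down (cap the number of long jobs running at once so a worker is always free for short jobs). The worker count of 4 and cap of 3 echo the numbers used below; the types and names are illustrative assumptions.

```cpp
// Sketch of a single priority-ordered queue serviced by N worker threads,
// with at most K long jobs running concurrently. Not anyone's actual design.
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Job {
    int priority;                 // lower value = served first
    bool is_long;                 // a multi-minute job (large OCR, recognizer build)
    std::function<void()> work;
};

struct ByPriority {               // makes priority_queue put the lowest value on top
    bool operator()(const Job& a, const Job& b) const { return a.priority > b.priority; }
};

class SingleQueueScheduler {
public:
    SingleQueueScheduler(int workers, int max_long) : max_long_(max_long) {
        for (int i = 0; i < workers; ++i)
            std::thread([this] { Run(); }).detach();
    }
    void Submit(Job j) {
        { std::lock_guard<std::mutex> lk(m_); queue_.push(std::move(j)); }
        cv_.notify_all();
    }
private:
    void Run() {
        for (;;) {
            std::unique_lock<std::mutex> lk(m_);
            // Short jobs sort ahead of long ones, so if the top job is long there
            // are no short jobs waiting; the cap just keeps workers free for them.
            cv_.wait(lk, [this] {
                return !queue_.empty() &&
                       (!queue_.top().is_long || running_long_ < max_long_);
            });
            Job j = queue_.top();
            queue_.pop();
            if (j.is_long) ++running_long_;
            lk.unlock();
            j.work();
            if (j.is_long) {
                std::lock_guard<std::mutex> g(m_);
                --running_long_;
                cv_.notify_all();
            }
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::priority_queue<Job, std::vector<Job>, ByPriority> queue_;
    int running_long_ = 0;
    const int max_long_;
};

int main() {
    SingleQueueScheduler sched(4, 3);
    for (int i = 0; i < 8; ++i)
        sched.Submit({i % 2, i % 4 == 0, [i] { std::printf("job %d done\n", i); }});
    std::this_thread::sleep_for(std::chrono::seconds(1));  // let detached workers drain
    return 0;
}
```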
>
>If you explained yourself better I would be able to more
>easily correct your misconceptions and you would be better
>able to correct my misconceptions. I might not have the
>right conception of a priority queue. The intuitive
>conception of a priority queue will not work.
*****
You are not even close. I don't even know what an "intuitive conception of a priority
queue" means, let alone how to explain it. I just pointed out that we have learned that a
single priority-ordered queue with anti-priority-inversion protection will give better
service than four queues each handled by a separate thread (whether that thread is in a
separate process or not doesn't really matter!)

Is it not obvious why the multiqueue model maximizes latency and minimizes concurrency?
Here's the exercise: you get 50 paying customers with short (10ms-class) transactions,
eight paying customers with long (3 minute) transactions, and 30 free trial customers (you
have rejected any long jobs because it is the trial deal, so they can only submit short
requests). Compute the time required to complete these requests and the expected
completion time under a single-queue and your multiqueue model.

So, the long requests will take 24 minutes for the last one to be completed (under your
scenario). The 50 paying customers, if each takes 10ms, will mean the maximum result time
is 500ms, and for your free trial customers, 300ms for the longest result.

Now, under a multiserver priority model, your paying short customers go to the front of
the queue, where there are 4 threads to process the requests. The expected maximum
response time will be about 125ms (12.5 x 10ms). This delays the 3-minute jobs, but if
you schedule no more than 3 concurrently (to avoid priority inversion) then the maximum
time is no more than 9 minutes. Your free customers, because there is only one thread
left to process short requests, will be delayed by the paying customers (125ms) and by
having only one thread to process them, so you still get 300ms max for them, plus the
latency for one thread to process paying customers, so 425ms. But if there are no long
jobs tying up the threads, then they will see 125ms + 75ms = 200ms, faster than the
multiqueue model!

If you cannot do simple arithmetic like this, you are in deeper trouble than I thought! I
didn't realize I had to explain reasoning that involved third-grade concepts, figuring
most readers could work this out for themselves.
joe
*****
>
>>>Alternatively if we lose any part of the process before we
>>>get an HTTP acknowledgement that they received their
>>>results, we roll the whole transaction back.
>> ****
>> Actually, you either lose all of a process, or none of it. What you are trying to say, I
>> think, is that if you suffer a failure at any point in the workflow state machine, there
>> is some totally magical means that gets you back to the magically constructed recovery
>> software that restarts the workflow at some point.
>
>Not magical at all, but I am getting very tired of
>constantly repeating myself and you continuing to read what
>you thought that I said instead of what I actually said. I
>explained exactly how this would work a dozen times, and you
>never once explained why it would not work with complete,
>consistent and sound reasoning.
****
You clearly said "any part of the process". A process does not lose a "part" of itself.
Either the process is running or it isn't. Perhaps you meant to say "If, during the
workflow state machine, there is a failure, then the entire in-flight workflow item is
deemed lost, and the computation is restarted from the beginning, even if the computation
had finished before the crash that lost the state", which is probably a more accurate
characterization of a recovery strategy. Now you have to demonstrate that your state
machine and your recovery code capture this specification. And how they know what to
restart, and where the information required to do the restart is stored, and how it is
known that this information is still valid.
****
>
>I know that you are a very bright and knowledgeable man. I
>know that you know much more about these things than I do.
>Please let's make this more of a dialogue and much less of a
>debate.
*****
I'm not debating. I'm telling you flat out that your design is not very good, your design
methodology needs some work, and you are driving your requirements document from
implementation ideas which are not well-thought-out or in some cases completely
misunderstood. I've told you several times you need to work out the workflow DFA and know
what happens if there is a failure at any state in that graph.
joe
****
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
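[Editor's note] The arithmetic in the post above can be reproduced mechanically. The sketch below recomputes exactly the estimates given there, with the same deliberately rough assumptions (fixed service times, no arrival randomness, fractional "rounds" left unrounded); it is arithmetic, not a real queueing simulation.

```cpp
// Reproduces the back-of-the-envelope numbers from the post above: 50 short
// paying jobs of 10 ms, 8 long jobs of 3 minutes, 30 free short jobs,
// 4 workers, at most 3 long jobs running at once.
#include <cstdio>

int main() {
    const double short_ms = 10.0;
    const double long_min = 3.0;
    const int paying_short = 50, paying_long = 8, free_short = 30;
    const int workers = 4, max_long = 3;

    // One queue per class, one single-threaded process per class (the 4-queue plan).
    std::printf("four queues : paying short %.0f ms, long %.0f min, free %.0f ms\n",
                paying_short * short_ms,                  // 500 ms
                paying_long * long_min,                   // 24 min
                free_short * short_ms);                   // 300 ms

    // One priority-ordered queue, 4 workers, no more than 3 long jobs at once.
    double pay_short_max = paying_short / double(workers) * short_ms;        // 12.5 x 10 = 125 ms
    int long_batches = (paying_long + max_long - 1) / max_long;              // 3 batches
    double long_max = long_batches * long_min;                               // 9 min
    double free_worst = pay_short_max + free_short * short_ms;               // 125 + 300 = 425 ms
    double free_best  = pay_short_max + free_short / double(workers) * short_ms;  // 125 + 75 = 200 ms
    std::printf("single queue: paying short %.0f ms, long %.0f min, free %.0f-%.0f ms\n",
                pay_short_max, long_max, free_best, free_worst);
    return 0;
}
```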
From: Peter Olcott on 10 Apr 2010 22:01

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:eio1s51kp7j8n5p9icbr1onv1nughqgm98(a)4ax.com...
> See below...
> On Fri, 9 Apr 2010 20:36:34 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>The other issue is reentrancy. I remember from my MS-DOS
>>(ISR) interrupt service routine development that some
>>processes are occasionally in states that cannot be
>>interrupted. One of these states is file I/O. Now the whole
>>issue of critical sections and other locking issues has to
>>be dealt with. A simple FIFO made using a named pipe
>>bypasses these issues.
> ***
> By definition, NO process is EVER in a non-interruptible state, in either linux or

I do explicitly remember that there was and probably still
is a mask-interrupts flag that is directly on the processor
chip itself. I do also remember that there still are
critical section mutexes. Sure you can interrupt a
transaction with another transaction; the problem is that
you get the wrong answer when you do this.

> Windows. File I/O is ABSOLUTELY interruptible in Windows, NO EXCEPTIONS. If you believe
> otherwise, you are fooling yourself. But it is probably the fact that I have written a

My experience with this is with MS-DOS.

> book on developing Windows device drivers, and teach courses in it, and have talked to
> linux device driver writers (who are often my students) that gives me the knowledge to
> know you are spouting nonsense. And I should point out that a simple FIFO using a named
> pipe that has the magical property of growing indefinitely, or the failure mode of
> discarding data, is NOT the solution to this non-problem.

Someone finally answered my question about FIFO caveats and
said that it can only grow a tiny little bit, and then it
refuses input.

>>That's right, I discarded threads in favor of processes a
>>long time ago.
> ****
> Note that a process IS a thread. It happens to have a private address space, but the

Right, thus preventing most every or every mutual dependency,
and thus the possibility of priority inversion, because
priority inversion (according to several sources) can only
occur with some sort of mutual dependency.

> threads in those processes behave just like threads in a single process, a point you
> have obviously missed. You have to manage 4 named pipes with 4 threads; the fact that
> those

Oh, so then you just lied about the address space being
separate? The best that I can tell is that the separate
address space by itself almost completely eliminates the
possibility of any mutual dependency, and a mutual dependency
is absolutely required for priority inversion.

If you don't start explaining yourself more completely I
will reasonably assume that your whole purpose here is to be
a naysayer whose sole purpose is to provide discouragement
and you have no intention at all of being helpful.

> threads are in separate processes does not change the nature of the problem, or change
> the fundamental weaknesses of the design.

Don't call it a weakness in the design unless you also point
out detail by detail exactly why you think this is a
weakness, because more often than not your whole conception
of my design weakness is entirely based on your own false
assumptions about my design.

>>One web server (by its own design) has multiple threads that
>>communicate with four OCR processes (not threads) using
>>some form of IPC, currently Unix/Linux named pipes.
> ****
> You have missed so many critical issues here that I would be surprised if you could ever
> get this to work in a satisfactory manner.
>
> You are assuming the named pipes have infinite capacity, which they will not. You do not
> seem to have a plan in place to deal with the inevitable failure that will occur because
> the pipes do not have infinite capacity.

Someone on the Linux/Unix side finally got around to
mentioning that their capacity is 4K.

>
> You think a handwave that uses threads in separate processes somehow magically makes the
> multithreading concurrency issues disappear. They will not. And your design for

See, there is yet another false assumption. My four OCR
processes will only have a single thread each. I NEVER SAID
OR IMPLIED OTHERWISE. I have a bad design because you
measure my design against your misconception of it rather
than the design itself. Unless you provide much more
detailed reasoning I have no way to point out your false
assumptions. I make a lot of false assumptions too, but at
least I state what my assumptions are so that it is possible
to correct them.

> scheduling OCR processing sucks. It allows almost no concurrency, thus maximizing
> latency time and severely underutilizing the available computing resources.

I don't want to maximize the overall throughput of all jobs,
I only want to maximize the throughput of the high priority
jobs. I don't care if this makes all of the rest of the jobs
twice as inefficient.

>>And yet you continue to fail to point out the nature of
>>these holes using sound reasoning. I correctly address the
>>possible reasoning, and you simply ignore what I say.
> ****
> I largely ignore what you say because largely what you say is either bad design or
> completely nonsensical.

That way it makes communication far too difficult because
this masks your own false assumptions. This makes perfect
sense if the purpose is to provide discouragement rather
than to be helpful. If you continue to mask your false
assumptions in this way I will give up on you.

> I have tried to point out the reasoning, such as a multiqueue model failing to maximize
> concurrency, failing to minimize latency, and failing to utilize all of the computing
> resources. I have pointed out that a single-queue design has none of these failures, and
> you can prevent priority inversion by any number of well-known schemes, which I leave it
> up to you to do a little basic research on. That you are relying on a queueing mechanism
> which you assume has properties no queueing system can ever possess (e.g., infinite
> queue size), or for which you fail to have a plan in place to address what happens on
> queue overflow, and by hypothesizing the infinite-queue model is going to be sufficient,
> you choose ONE implementation which is probably less than ideal in a number of ways,
> because of something you saw in some NG, or found in a poorly-written 'man' page, or
> heard about being in WikiPedia.
>
> You have confused a requirements document with an implementation specification; you
> apparently let your bad ideas about implementation techniques drive your requirements
> document, which is a poor way to arrive at a design.
>
> You have failed to address the problem of failure points, something I've been telling
> you about for probably a week; instead you either select inherently unreliable
> mechanisms (linux named pipes, which are reliable ONLY IF no process ever fails and the
> system can never crash, and that's assuming they work correctly), reject known reliable
> mechanisms (transacted database systems), think that certain APIs will magically possess
> properties they have never had (e.g., pwrite), and I'm supposed to PAY ATTENTION to this
> blather?
>
> When I start seeing a coherent approach to design, I'll start paying attention.
>>
>>> resources and maximum unused resources, but what does maximum resource utilization and
>>> minimum response time have to do with the design? It guarantees priority inversion,
>>
>>Yeah it sure does when you critique your misconception of my
>>design instead of the design itself. I use the term
>>PROCESSES and you read the term THREADS. Please read what I
>>actually say, not what you expect that I will say or have
>>said.
> ****
> When you say something like "I am using a named pipe" and give a set of silly reasons,
> when you start telling me about MS-DOS File I/O, when you spout nonsense about
> interruptibility, when you systematically ignore the complex issues of workflow state
> management and disaster recovery, I think I have seen what you are saying. Of course,
> the fact that it is all nonsense anyway doesn't hurt my ability to see that it is not
> forming a good design. Good designs have certain easily recognized properties. Since
> I've had to do realtime programming, I know that the multiqueue approach is actually a
> very bad one. A single queue with anti-priority-inversion (even a simple one such as not
> scheduling more than K < N long computations, for N the number of service threads, is an
> improvement over the kludge you are proposing) would be a tremendous improvement,
> allowing more concurrency and reducing latency, which I would think would be desirable
> goals.
>
> The interesting thing is you can write SIMULATIONS of these models and show beyond any
> shadow of a doubt that your proposed scheme will give some of the worst possible
> behavior, but that would actually require doing actual measurements, and not immediately
> deducing it from the Tarot cards or I Ching. Yet a simple closed-form queueing model
> will also show this is a bad idea!
>
> You want solid reasoning? Take a course in realtime scheduling issues! I do not plan to
> teach one in this forum.
> joe
> ****
>>
>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm
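[Editor's note] The pipe-capacity point is concrete: a Linux FIFO does not grow without bound, so a writer either blocks or, when opened with O_NONBLOCK, gets EAGAIN once the pipe is full (the 4K figure quoted above is the PIPE_BUF atomic-write limit; the total capacity is a separate, also finite, limit). Below is a hedged sketch of the "named pipe carries only a journal offset" notification scheme described earlier, with the overflow case handled explicitly; the FIFO path and the fallback choice are illustrative assumptions.

```cpp
// Sketch: notify a worker of a new journal record by writing its file offset
// into a named pipe, without pretending the pipe has infinite capacity.
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// A single offset is 8 bytes, far below PIPE_BUF (4096 on Linux), so each write
// is atomic: readers never see a torn offset even with several writers.
bool NotifyOffset(int fifo_fd, uint64_t journal_offset) {
    ssize_t n = write(fifo_fd, &journal_offset, sizeof journal_offset);
    if (n == static_cast<ssize_t>(sizeof journal_offset)) return true;
    if (n < 0 && errno == EAGAIN) {
        // Pipe is full: the reader has fallen behind. The design has to choose
        // here -- block, drop the notification (the reader can catch up later by
        // scanning the journal), or treat it as overload and shed the request.
        return false;
    }
    return false;  // no reader left (EPIPE, assuming SIGPIPE is ignored) or other error
}

int main() {
    const char* path = "/tmp/ocr_high_priority.fifo";   // hypothetical FIFO name
    mkfifo(path, 0600);                                 // EEXIST is harmless if it already exists
    int fd = open(path, O_WRONLY | O_NONBLOCK);         // fails with ENXIO if no reader yet
    if (fd < 0) { perror("open fifo"); return 1; }
    if (!NotifyOffset(fd, 123456))                      // offset of the new journal record
        std::fprintf(stderr, "notification dropped; reader must re-scan the journal\n");
    close(fd);
    return 0;
}
```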
From: Peter Olcott on 10 Apr 2010 22:04
"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message news:peq1s51e061rsv2h2jc8fj9samuv2r1i3g(a)4ax.com... > See below... > On Fri, 9 Apr 2010 20:19:46 -0500, "Peter Olcott" > <NoSpam(a)OCR4Screen.com> wrote: > >>Hector thinks its a good idea and you don't think that it >>is >>a good idea so I can't go by credibility because although >>you have lots of experience and a PhD, Hector has more >>experience and more recent experience in this area. >>Because >>of this I weight your credibility equal with his, thus I >>need a tie breaker. > **** > I fail to see why either one matters if you have a > fully-transacted file system. > **** >> >>The obvious tie breaker (the one that I always count on) >>is >>complete and sound reasoning. From the sound reasoning >>that >>you have provided, this would go against your point of >>view. >>I know from the SQLite design pattern how to make disk >>write >>100% reliable. I also know all about transactions. By >>knowing these two things a simple binary file can be made >>to >>protect against power failures. >> >>I have brought up another issue several times that you >>have >>not yet addressed. Nothing that you have said would >>protect >>against a OS crash that overwrites the wonderfully >>transactional fully flushed buffers transaction database >>with garbage data. > ***** > I have no idea how this could happen. What could write > bad data in a file, other than a > complete and total failure in the OS? If you have this > level of failure, there can be no > recovery because the OS is itself erroneous. Oh, not even on the fly transaction-by-transaction offsite backup? Even if the OS itself becomes erroneous I can not allow this to put me out pf business. > But there are very few "crash" scenarios > that could ever make this happen that they are not worth > worrying about. > > Note that if you try to implement a transacted file system > yourself, the more likely cause > is an error in how you coded it. > ***** >> >>> >>> A "simple shared file" raises so many red flags that I >>> cannot begin to say that "this is >>> going to be a complete disaster if there is any failure >>> of >>> the app or the operating >>> system" >> >>With fully flushed buffers and a journal file this can not >>be a problem because that is all that can be done. > **** > Sorry, that is not a "simple shared file". That is a file > with journal logging and > transaction support, which is really far from anything I > could characterize with the > adjective "simple". > ***** >> >>> >>> But hey, if it gives the illlusion of working under >>> ideal >>> conditions, it MUST be robust >>> and reliable under all known failure modes, right? >> >>Overwriting the file or database or whatever with garbage >>data because of an OS crash continues to not be addressed. > ***** > That is because this never happens in practice. I have > not seen any situation in which a > file contained garbage in any modern operating system. > Only in FAT file systems and under > MS-DOS could this have been an issue. > > OTOH, I have, in some sytems that failed to understand how > to do transacted files, had > files completely disappear on me. This does not happen in > NTFS (you will have some > intact, albeit not most recent, version of the file). I > do not know how linux handles > this. > ***** >> >>> Well, if you are comparing apples and chocolate >>> cupcakes, >>> they are pretty much the same. 