From: Bob Bell on 31 Aug 2005 17:09

Nicola Musatti wrote:
> Bob Bell wrote:
> > Gerhard Menzl wrote:
> > > Bob Bell wrote:
> > >
> > > > I should be more specific. I interpret "the function cannot continue"
> > > > to mean that the function shouldn't be allowed to execute a single
> > > > instruction more, not even to throw an exception.
> > >
> > > Not even log and assert?
> >
> > If I said "it's OK to log and assert", would that invalidate my point
> > or support yours? The point is to make as few assumptions about the
> > state of the system as possible, which leads to executing as little
> > code as possible. The problem with throwing is that it assumes that the
> > entire state of the system is still good, and that any and all code can
> > still run.
>
> Excuse me, but don't you risk assuming too much in the other direction?

What's the risk? Does it outweigh the risk of shipping a buggy program?

> Consider for example a function as the following:
>
> double safeSqrt(double arg) {
>     if ( arg < 0 )
>         // what goes here?
>     return std::sqrt(arg);
> }
>
> Wouldn't it be a bit extreme to assume the world has ended just because
> this function was passed a negative number?

It may seem extreme, but making that assumption gives you an opportunity
to detect and fix a bug. Again, the alternative is to assume that the
world is OK when a negative number is passed, and that is clearly wrong.
When balancing "seems a bit extreme" against "clearly wrong", I'll go
with the "seems a bit extreme" option.

> On the other hand I agree that if the world has actually ended, we
> wouldn't want to add damage to it. So what can we do about it? You are
> probably right that exception handling is not to be trusted and it
> seems to me that the least action you can take is to return a
> conventional value.

It comes back to:

-- you have a function with a precondition that says its argument must be
   non-negative or the function can't perform its job sensibly; therefore,
   if a negative number is passed, there must be a bug

-- writing the function so that it returns a value when a negative number
   is passed just lets a bug go unnoticed

> Should we reach the conclusion that returning error codes is better
> than exceptions for writing really robust code? ;-)

I know you're joking, but returning any value (error code or otherwise)
has the same problem that throwing an exception does; it implicitly
assumes that the world is OK and any code can still run.

Bob

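[For illustration only, not part of the original post: a minimal sketch of
the alternative Bob argues for, with a negative argument treated as a
precondition violation and trapped by a standard assert instead of being
converted into a return value. The message string is an addition for the
example.]

    #include <cassert>
    #include <cmath>

    // Precondition: arg >= 0. A negative argument is a bug in the caller,
    // so trap it at the point of detection rather than masking it.
    double safeSqrt(double arg)
    {
        assert(arg >= 0 && "safeSqrt: arg must be non-negative");
        return std::sqrt(arg);
    }

[In a debug build this stops the program where the bug is detected; with
NDEBUG defined the check disappears and a violation is simply undefined
behaviour, which matches the "precondition" reading discussed below.]
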
From: David Abrahams on 31 Aug 2005 17:05

"Nicola Musatti" <nicola.musatti(a)gmail.com> writes:

> Bob Bell wrote:
>> Gerhard Menzl wrote:
>> > Bob Bell wrote:
>> >
>> > > I should be more specific. I interpret "the function cannot continue"
>> > > to mean that the function shouldn't be allowed to execute a single
>> > > instruction more, not even to throw an exception.
>> >
>> > Not even log and assert?
>>
>> If I said "it's OK to log and assert", would that invalidate my point
>> or support yours? The point is to make as few assumptions about the
>> state of the system as possible, which leads to executing as little
>> code as possible. The problem with throwing is that it assumes that the
>> entire state of the system is still good, and that any and all code can
>> still run.
>
> Excuse me, but don't you risk assuming too much in the other direction?
> Consider for example a function as the following:
>
> double safeSqrt(double arg) {
>     if ( arg < 0 )
>         // what goes here?
>     return std::sqrt(arg);
> }
>
> Wouldn't it be a bit extreme to assume the world has ended just because
> this function was passed a negative number?

Sure. If you want to throw an exception there, just document it and
don't call arg >= 0 a precondition. If you call it a precondition,
invoking safeSqrt with a negative number becomes a bug. Then what's
"safe" about that function? All the caller knows when an exception
emanates from it is that he's got a bug somewhere in his code.

If it's a precondition, and you want to write code that tries to take
the safeSqrt of a million numbers, you have to check each one first to
make sure it's non-negative, or you have a bug in your code. If it's not
a precondition, wrap the whole thing in a try/catch block. "It's easier
to ask forgiveness than permission" ;-)

> On the other hand I agree that if the world has actually ended, we
> wouldn't want to add damage to it. So what can we do about it? You are
> probably right that exception handling is not to be trusted

That's not Bob's point at all.

> and it seems to me that the least action you can take is to return a
> conventional value.
>
> Should we reach the conclusion that returning error codes is better
> than exceptions for writing really robust code? ;-)

No. If you want your code to be robust, be clear about the difference
between preconditions and the conditions that generate error codes and
exceptions. Some people like to avoid the word "error" in connection
with the latter category, so:

_Programmer errors_ lead to precondition failures which invoke undefined
behavior.

_Exceptional conditions_, such as resource allocation failures and
negative arguments to a safeSqrt that throws, are expected, and generate
exceptions or abnormal return values or whatever other mechanism you
choose to report the condition.

Is that clearer?

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

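[An illustrative sketch of the distinction drawn here; the function names
are hypothetical, not from the thread. In the first variant a negative
argument is a programmer error; in the second it is an expected,
documented condition.]

    #include <cassert>
    #include <cmath>
    #include <stdexcept>

    // Variant 1: arg >= 0 is a documented precondition. A negative
    // argument is a bug in the caller; the assert is a debugging aid,
    // not part of the interface.
    double sqrtWithPrecondition(double arg)
    {
        assert(arg >= 0);
        return std::sqrt(arg);
    }

    // Variant 2: no precondition. A negative argument is expected and is
    // reported with a documented exception.
    double sqrtThatThrows(double arg)
    {
        if (arg < 0)
            throw std::domain_error("sqrtThatThrows: negative argument");
        return std::sqrt(arg);
    }

[With the first variant, code that takes the square root of a million
numbers must ensure each one is non-negative before calling; with the
second, wrapping the loop in a try/catch block is a legitimate way to
handle bad inputs.]
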
From: Gerhard Menzl on 1 Sep 2005 07:17

Bob Bell wrote:

> If I said "it's OK to log and assert", would that invalidate my point
> or support yours? The point is to make as few assumptions about the
> state of the system as possible, which leads to executing as little
> code as possible. The problem with throwing is that it assumes that
> the entire state of the system is still good, and that any and all
> code can still run.

No, I am not trying to invalidate your point in any way. When I point
out what I perceive as inconsistencies I do so in order to increase my
understanding and, hopefully, achieve mutual agreement on a more refined
level.

When you say that "the function shouldn't be allowed to execute a single
instruction more", logging and asserting would be impossible. "Executing
as little code as possible", on the other hand, sounds reasonable to me
and eliminates the contradiction. I still cannot reconcile this
guideline with Dave's point that unwinding the stack is (almost always)
wrong, but starting a separate undo/recovery mechanism isn't. This may
be due to misunderstanding.

> If you mean that you want to avoid crashes/ungraceful shutdowns when
> end-users use the system, I agree.

That is what I am concerned about, and it's not just because of trying
to be nice to users. The software I am currently working on is part of a
larger system that has an abysmal record. The team was recently
confronted with an ultimatum set by the customer along the lines of:
you've got six more weeks until the final test; if there's one single
crash (and from a customer's view, this includes assertions and the
like), you're out - the end of a contract of several million dollars as
well as dozens of jobs. From this perspective, one cannot help eyeing
statements like "terminating the program is good because it helps
debugging" with a certain reserve.

You don't have to tell me that there has to be something seriously wrong
with the development process to get into a situation like this in the
first place, but unfortunately, development in the real world does not
always take the perfect (or even reasonably sound) process path.

> In practice this doesn't happen (at least, in my practice; can't speak
> for anyone else). Instead, liberal usage of assertions to trap
> precondition violations as bugs leads to finding and fixing a lot of
> bugs. Perhaps you should try it before deciding that it doesn't work.

I am sorry if I should have created the impression that I have decided
the approach doesn't work. I *do* make liberal use of assertions. But
you have to take into account that the more liberally you use
assertions, the more likely it is that you err on the other side, i.e.
that you designate a condition as a result of a bug when in reality it
is a possible, if exotic program state.

> I don't know why it should. I'm not programming with Eiffel, and as
> far as I know, neither are you, so why should it matter what
> "precondition" means in Eiffel? Lots of terms are used differently by
> the two camps. You don't seem to have trouble discussing exceptions,
> despite the fact that the term means different things in the two
> languages.

The answer to this is easy: because, to the best of my knowledge, the
concept of Design by Contract originated in and is most fervently
advocated by the Eiffel camp. Although it is being supported directly in
Eiffel, it is abstract enough not to be tied to that language. There may
be differences in implementation, but the fundamentals should be the
same.
If, however, there is a sound argument why preconditions should be
defined and/or handled differently in C++, I would like to see it. I am
well aware that certain technical terms mean different things in
different parts of the software engineering community, but I also think
that redefining terms gratuitously should be avoided.

> What's the alternative? Saying that the program's state is partially
> undefined? Or that some subset of the state is undefined, while the
> remainder of the state is well-defined? That kind of fuzzy thinking is
> something I don't understand. It often turns out to be wrong, and
> leads to missed opportunities to fix bugs.

How about an example? Suppose you have a telephony application with a
phone book. The phone book module uses std::binary_search on a
std::vector. A precondition for this algorithm is that the range be
sorted. A bug causes an unsorted range to be passed. Leaving aside the
fact that detecting the violation of this precondition may be a bit
costly, how would you expect the application to react? Abort and thus
disconnect the call in progress although it's only the phone book that
is broken? Exhibit undefined behaviour, such as (hopefully not more
than) displaying garbage? Notify the user of the problem but let him
finish the call? Would the latter cause the precondition to cease to be
a precondition? This is not meant to be polemic; I am genuinely
interested.

As for fuzzy thinking, "something's amiss somewhere" sounds more fuzzy
to me than "something's amiss in this module/function". Sure, a bug can
surface at a point far from its source: writing to arbitrary locations
in memory is an example. But is it feasible always to react as if this
were the case, although in the majority of cases the cause is probably
to be found locally?

> One other pragmatic reason to stop the program and fix the bug the
> moment the bug is detected is that you never know when the bug is
> going to recur and you'll get another opportunity.

What kind of scenario do you have in mind? If your program aborts at a
remote user site, no immediate fixing is going to take place. I fully
agree that masking bugs and just plodding on is bad practice, I just
doubt that aborting is the only means of preventing it.

> Precondition failures indicate bugs, and the right thing to do is fix
> the bug; just about the worst thing you could do is throw an
> exception, since throwing an exception is tantamount to ignoring the
> bug.

Why do you equate throwing exceptions with ignoring bugs? In my
application, the top level exception handler tries to write as much
information as possible to a log file. It then informs the user that a
serious error has happened, that the application may be in a shaky state
and had better be terminated, where the log file is, and that an
administrator should be called to collect the information and forward it
to the vendor. Admittedly, the user could choose to ignore the notice
and carry on, but then he could also restart an aborted application and
carry on. Or are you concerned about sloppy exception handling practices
in larger teams?

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".

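[For illustration, not from the post: one way a debug-build check for the
binary_search precondition in the phone book example might look. The
function name is hypothetical, and std::is_sorted is used for brevity
even though it postdates this discussion (it was standardised in C++11);
as noted above, the check is linear and may be too costly outside of
debug builds.]

    #include <algorithm>
    #include <cassert>
    #include <string>
    #include <vector>

    bool phoneBookContains(const std::vector<std::string>& entries,
                           const std::string& name)
    {
        // Re-state the precondition of std::binary_search as an assertion
        // so that a violation is caught where it happens.
        assert(std::is_sorted(entries.begin(), entries.end()) &&
               "phoneBookContains: entries must be sorted");
        return std::binary_search(entries.begin(), entries.end(), name);
    }
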
From: Nicola Musatti on 2 Sep 2005 07:03

David Abrahams wrote:
[...]
> _Programmer errors_ lead to precondition failures which invoke
> undefined behavior

Ok. What you are saying is you don't know where you are, so your best
option is to give up immediately, lest you cause additional damage,
correct?

I see two issues that arise from this point of view: how to implement
"giving up" and what kind of recovery can be performed.

Ideally one would want to collect as much information as possible on
what went wrong for diagnostic purposes. On Unix systems generating a
core dump is a convenient option, on other systems it might not be so
easy. On the system I work on I don't have core dumps, but my debugger
breaks on throws, so at least in development builds I implement
assertions by throwing exceptions.

As far as recovery is concerned I agree that the current module/process
cannot be trusted anymore and recovery should take place at a different
level: by some monitoring process, by automatic reboot, by requiring
human intervention or whatever.

Cheers,
Nicola Musatti

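[A sketch of the development-build assertion described here; the macro
name and the details are assumptions for illustration, not from the post.
Throwing lets a debugger that breaks on throw stop right at the
violation. What a release build should do is not specified above; in
this sketch the check simply compiles away, which is only one possible
choice.]

    #include <sstream>
    #include <stdexcept>

    #ifndef NDEBUG
        #define DEV_ASSERT(cond)                                          \
            do {                                                          \
                if (!(cond)) {                                            \
                    std::ostringstream msg;                               \
                    msg << "Assertion failed: " #cond                     \
                        << " (" << __FILE__ << ":" << __LINE__ << ")";    \
                    throw std::logic_error(msg.str());                    \
                }                                                         \
            } while (false)
    #else
        #define DEV_ASSERT(cond) ((void)0)
    #endif
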
From: David Abrahams on 2 Sep 2005 07:10

Gerhard Menzl <gerhard.menzl(a)hotmail.com> writes:

> Bob Bell wrote:
>
> > ...something...
>
> When you say that "the function shouldn't be allowed to execute a single
> instruction more", logging and asserting would be impossible. "Executing
> as little code as possible", on the other hand, sounds reasonable to me
> and eliminates the contradiction.

I've only ever said the latter, FWIW.

> I still cannot reconcile this guideline with Dave's point that
> unwinding the stack is (almost always) wrong, but starting a
> separate undo/recovery mechanism isn't. This may be due to
> misunderstanding.

If you find my point to be in contradiction with the goal of "executing
as little code as possible," then there probably has been a
misunderstanding. At the point you detect a violated precondition, there
can usually only be partial recovery. You should do just enough work to
avoid total catastrophe, if you can.

At the point you detect a violated precondition, you don't have any way
to ensure that the actions taken by stack unwinding will be minimal. On
the other hand, if you have a separate mechanism for registering
critical recovery actions -- and an agreement among components to use it
-- you can invoke that, and avoid any noncritical actions.

The other reason that unwinding is almost always wrong is that it is
very prone to losing the information that a bug was detected, and
allowing execution to proceed as though full recovery has occurred. All
it takes is passing through a layer like this one:

    try
    {
        ...something that detects a precondition violation...
    }
    catch(e1& x) { translate_or_report(x); }
    catch(e2& x) { translate_or_report(x); }
    ...
    catch(...) { translate_or_report_unknown_error(); }

which often occurs at subsystem boundaries.

>> If you mean that you want to avoid crashes/ungraceful shutdowns when
>> end-users use the system, I agree.
>
> That is what I am concerned about, and it's not just because of
> trying to be nice to users. The software I am currently working on
> is part of a larger system that has an abysmal record. The team was
> recently confronted with an ultimatum set by the customer along the
> lines of: you've got six more weeks until the final test; if there's
> one single crash (and from a customer's view, this includes
> assertions and the like), you're out - the end of a contract of
> several million dollars as well as dozens of jobs. From this
> perspective, one cannot help eyeing statements like "terminating the
> program is good because it helps debugging" with a certain
> reserve.

Understandable. Your best option, if you have time for it -- and if a
clean emergency shutdown will not be interpreted as a crash -- is to
institute a recovery subsystem for critical things that must happen
during emergency shutdown. In non-shipping code, asserts should
immediately invoke the debugger, and then invoke emergency recovery and
shutdown. In shipping code, obviously, there's no debugger.

If you can't do that, then you may have to resort to using the exception
mechanism in order to protect your jobs. However, in non-shipping code,
asserts should *still* invoke the debugger immediately, and you should
take care that you don't confuse unwinding from a precondition violation
with "recovery." Your program is in a bad state, and if you continue
after unwinding, you're doing so "on a wing and a prayer." Cross your
fingers, get out your mojo hand, and light your voodoo candles. Good
Luck.

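[For illustration, not from the post: one shape the "recovery subsystem"
described above might take. All names here are hypothetical. Components
register the few actions that must run on emergency shutdown; a detected
precondition violation then invokes exactly those actions -- no stack
unwinding, no noncritical cleanup -- before terminating.]

    #include <cstdlib>
    #include <vector>

    typedef void (*CriticalAction)();

    std::vector<CriticalAction>& criticalActions()
    {
        static std::vector<CriticalAction> actions;
        return actions;
    }

    void registerCriticalAction(CriticalAction action)
    {
        criticalActions().push_back(action);
    }

    void emergencyShutdown()
    {
        // Run only the registered critical actions, most recent first,
        // then terminate without unwinding.
        for (std::vector<CriticalAction>::reverse_iterator
                 it = criticalActions().rbegin();
             it != criticalActions().rend(); ++it)
        {
            (*it)();
        }
        std::abort();
    }

[An assertion handler would call emergencyShutdown() -- after breaking
into the debugger in non-shipping builds -- instead of throwing, so that
only the agreed-upon critical actions run.]
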
>> In practice this doesn't happen (at least, in my practice; can't speak
>> for anyone else). Instead, liberal usage of assertions to trap
>> precondition violations as bugs leads to finding and fixing a lot of
>> bugs. Perhaps you should try it before deciding that it doesn't work.
>
> I am sorry if I should have created the impression that I have decided
> the approach doesn't work. I *do* make liberal use of assertions. But
> you have to take into account that the more liberally you use
> assertions, the more likely it is that you err on the other side, i.e.
> that you designate a condition as a result of a bug when in reality it
> is a possible, if exotic program state.

That can only happen if you assert some condition that isn't in the
called function's set of documented preconditions. If the assertion
matches the function's documentation, then it *is* catching a bug.

>> I don't know why it should. I'm not programming with Eiffel, and as
>> far as I know, neither are you, so why should it matter what
>> "precondition" means in Eiffel? Lots of terms are used differently by
>> the two camps. You don't seem to have trouble discussing exceptions,
>> despite the fact that the term means different things in the two
>> languages.
>
> The answer to this is easy: because, to the best of my knowledge,
> the concept of Design by Contract originated in and is most
> fervently advocated by the Eiffel camp. Although it is being
> supported directly in Eiffel, it is abstract enough not to be tied
> to that language. There may be differences in implementation, but
> the fundamentals should be the same. If, however, there is a sound
> argument why preconditions should be defined and/or handled
> differently in C++, I would like to see it.

As far as I can tell, the Eiffel camp has a similar understanding.
Because the throw-in-response-to-precondition-violation behavior can be
turned on and off globally, you basically can't count on it. Think of it
as one possible expression of undefined behavior.

In some languages, throwing an exception is basically the only way to
get a debuggable stack trace. If that's the case in Eiffel, it would
explain why they have the option to throw: it's as close as possible to
invoking the debugger (perhaps it even does so).

I should also point out that there's some variation among languages (and
even among C++ compilers) in _when_ stack unwinding actually occurs. For
example, in C++, if the exception is never caught, there may not ever be
any unwinding (it's up to the implementation). In Python, no unwinding
happens until the exception backtrace is _explicitly_ discarded or the
next exception is thrown. I don't know about the details of Eiffel's
exception mechanism, but all of these variations can have a major impact
on the danger of throwing in response to a precondition violation. In
other words, you may have to look a lot deeper to understand the proper
relationship of Eiffel to C++.

> I am well aware that certain technical terms mean different things
> in different parts of the software engineering community, but I also
> think that redefining terms gratuitously should be avoided.

Absolutely. But I don't think there are as many different definitions as
you seem to think there are. Have you found *any* definitions of
"precondition" other than the Wikipedia one? I'm not talking about
meanings of the word you infer from seeing it used in context. I'm
talking about _definitions_.

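[A small hypothetical illustration of the earlier point about assertions
matching the documented contract; the function and its documentation are
made up for the example.]

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Documented contract: "points must not be null."
    double averageOf(const std::vector<double>* points)
    {
        // Matches the documentation: if this fires, there is a bug.
        assert(points != 0);

        // Stricter than the documentation: an empty vector may be a
        // legal, if exotic, program state, so asserting this would
        // misfire.
        // assert(!points->empty());

        if (points->empty())
            return 0.0;  // documented behaviour for the empty case
        double sum = 0.0;
        for (std::size_t i = 0; i < points->size(); ++i)
            sum += (*points)[i];
        return sum / points->size();
    }
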
Also: read the section called "Run-time Assertion Monitoring" at
docs.eiffel.com/eiffelstudio/general/guided_tour/language/tutorial-09.html

AFAICT, that is in nearly perfect agreement with everything I've been
saying.

>> What's the alternative? Saying that the program's state is partially
>> undefined? Or that some subset of the state is undefined, while the
>> remainder of the state is well-defined? That kind of fuzzy thinking is
>> something I don't understand. It often turns out to be wrong, and
>> leads to missed opportunities to fix bugs.
>
> How about an example? Suppose you have a telephony application with
> a phone book. The phone book module uses std::binary_search on a
> std::vector. A precondition for this algorithm is that the range be
> sorted. A bug causes an unsorted range to be passed. Leaving aside
> the fact that detecting the violation of this precondition may be a
> bit costly, how would you expect the application to react?

------reactions-------

> Abort and thus disconnect the call in progress although it's only
> the phone book that is broken?

you don't know that ;-)

> Exhibit undefined behaviour, such as (hopefully not more than)
> displaying garbage? Notify the user of the problem but let him
> finish the call?

-------reactions--------

More on this in a moment.

> Would the latter cause the precondition to cease to be a precondition?

I would just say yes, but that would be slightly too simple an answer.

First of all, that the range is sorted is always a precondition of
std::binary_search. Nothing you can do can ever change that, since you
are not the author of std::binary_search or, more importantly, of its
specification.

If you write some function phone_book that accepts a range and calls
binary_search, you have two choices: either make sortedness a
precondition of phone_book, or do something in phone_book to detect
non-sortedness, and document the well-defined behavior phone_book will
give in response to that condition. If you make sortedness a
precondition, you are allowed to skip detecting non-sortedness, or you
can detect it and take whatever voodoo action you think gives you the
least chance of getting fired... just so long as you remember you're in
voodoo land now.

On to the reactions. Which reaction is most appropriate for that
particular application is outside my domain of expertise. I can tell you
what my personal expectations are, but I'm not sure if that is much
help. Of course I would not expect the condition to be violated in the
first place, because it's the sort of condition that can usually be
easily guaranteed with a little bit of care and logical deduction. I
therefore wouldn't expect the application to insert explicit checks for
it, so if the condition were somehow violated, I'd expect the
application to do something crazy like displaying garbage or looping
infinitely, etc.

> This is not meant to be polemic; I am genuinely interested.
>
> As for fuzzy thinking, "something's amiss somewhere" sounds more fuzzy
> to me than "something's amiss in this module/function".

Yes, but you don't know that the latter is true. Thinking the latter is
true when you only know the former is more fuzzy thinking.

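[For illustration, not from the post: what the second of the two choices
for phone_book might look like. The function name and the fallback to a
linear search as the documented behaviour are assumptions for the example
(the first choice looks like the assertion sketch earlier in the thread),
and std::is_sorted is C++11, used here for brevity.]

    #include <algorithm>
    #include <string>
    #include <vector>

    // Sortedness is *not* a precondition here: an unsorted range is
    // detected and given documented, well-defined behaviour (a linear
    // search), so passing one violates no contract.
    bool phone_book_lookup(const std::vector<std::string>& entries,
                           const std::string& name)
    {
        if (!std::is_sorted(entries.begin(), entries.end()))
            return std::find(entries.begin(), entries.end(), name)
                   != entries.end();
        return std::binary_search(entries.begin(), entries.end(), name);
    }
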
>> One other pragmatic reason to stop the program and fix the bug the
>> moment the bug is detected is that you never know when the bug is
>> going to recur and you'll get another opportunity.
>
> What kind of scenario do you have in mind? If your program aborts at a
> remote user site, no immediate fixing is going to take place. I fully
> agree that masking bugs and just plodding on is bad practice, I just
> doubt that aborting is the only means of preventing it.
>
>> Precondition failures indicate bugs, and the right thing to do is fix
>> the bug; just about the worst thing you could do is throw an
>> exception, since throwing an exception is tantamount to ignoring the
>> bug.
>
> Why do you equate throwing exceptions with ignoring bugs?

For what it's worth, _I_ don't equate those. However, throwing an
exception can easily lead to ignoring a bug as I demonstrated above.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com