From: Andrei Alexandrescu (See Website For Email) on 1 Dec 2006 02:29

David Abrahams wrote:
> "Andrei Alexandrescu (See Website For Email)"
> <SeeWebsiteForEmail(a)erdani.org> writes:
>
>>There might be a terminology confusion here, which I'd like to clear
>>from the beginning:
>>
>>1. A program "has undefined behavior" = effectively anything could
>>happen as the result of executing that program. The metaphor with the
>>demons flying out of one's nose comes to mind. Anything.
>>
>>2. A program "produces an undefined value" = the program could produce
>>an unexpected value, while all other values, and that program's
>>integrity, are not violated.
>>
>>The two are fundamentally different because in the second case you can
>>still count on objects being objects etc.; the memory safety of the
>>program has not been violated. Therefore the program is much easier to
>>debug.
>
> Seriously?
>
> IME you're at least likely to crash noisily close to the undefined
> behavior. If you make everything defined the program necessarily
> soldiers on until one of your own internal checks is able to notice
> that something went wrong. Or am I missing something?

I think it's one thing to have a wrong numeric value and one very different thing to have a program in which all hell breaks loose due to random overwriting of memory.

> I don't have any real experience with Java, but Python generally
> exhibits Java-like behavior, and I don't find it easier to debug than
> C++.

Well the only thing I can add is that in my limited experience, debugging Java programs is much easier because there's never the case that a dangling pointer mysteriously overwrites some object it wasn't supposed to. I remember __to this day__ a night in 1998 when a colleague and I spent the night figuring out a completely weird exception being thrown (in a C++ program) under very complex circumstances - just because of a misfit memcpy() in a completely different and unrelated part of the program.
Now that I think of it, I remember a few others. Probably I'd remember even more under hypnosis :o).

To tell the truth, I also remember a JVM bug causing me a few gray hairs... :o)

Andrei

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
From: Andrei Alexandrescu (See Website For Email) on 1 Dec 2006 02:27

Gabriel Dos Reis wrote:
> "Andrei Alexandrescu (See Website For Email)"
> <SeeWebsiteForEmail(a)erdani.org> writes:
>
> [...]
>
> | There might be a terminology confusion here, which I'd like to clear
> | from the beginning:
> |
> | 1. A program "has undefined behavior" = effectively anything could
> | happen as the result of executing that program. The metaphor with the
> | demons flying out of one's nose comes to mind. Anything.
>
> Why is not that the value of the computation?
>
> | 2. A program "produces an undefined value" = the program could produce
> | an unexpected value, while all other values, and that program's
> | integrity, are not violated.
> |
> | The two are fundamentally different because in the second case you can
> | still count on objects being objects etc.;
>
> I don't see anything fundamental in that difference.

It's very simple. In one case you have a program that preserves its own guarantees (e.g. there's no random overwriting of memory), but which has one numerical value that's invalid; that can't corrupt memory because there's no pointer forging. In the other case you can't count on pretty much anything.

Explaining this any better is beyond my abilities.

Andrei
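[Editorial sketch, not part of the original post.] The distinction drawn above can be illustrated in C++ terms. `std::vector::at` has defined behavior for a bad index (it throws `std::out_of_range`), whereas `operator[]` with a bad index is undefined behavior. The helper names `checked_read` and `unchecked_read` are hypothetical and chosen only for this illustration; the sketch deliberately never executes the undefined case.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: a read whose failure mode is defined.
// std::vector::at() throws std::out_of_range for a bad index, so the
// worst outcome is a reported failure; no other object is disturbed.
bool checked_read(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;  // `out` is left unchanged
    }
}

// Hypothetical helper: a read whose failure mode is NOT defined.
// v[i] with i >= v.size() is undefined behavior in C++, so the
// precondition is asserted and the sketch never violates it.
int unchecked_read(const std::vector<int>& v, std::size_t i) {
    assert(i < v.size());
    return v[i];
}
```

With the checked form, a wrong index yields an error the caller can observe (or, as noted elsewhere in the thread, catch and ignore); with the unchecked form, nothing can be said about the program once the precondition is broken.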
From: Jean-Marc Bourguet on 1 Dec 2006 10:32

"Andrei Alexandrescu (See Website For Email)" <SeeWebsiteForEmail(a)erdani.org> writes:

> Well the only thing I can add is that in my limited experience,
> debugging Java programs is much easier because there's never the case
> that a dangling pointer mysteriously overwrites some object it wasn't
> supposed to.

Instead you are writing to an object which was supposed to have been out of existence for a long time. In my experience, that gives you the same kind of elusive bugs. Except that Purify can't help you, and that random behaviour, including crashes, is replaced by deterministic, often plausible but wrong results.

Yours,

--
Jean-Marc
From: James Kanze on 1 Dec 2006 14:49

Andrei Alexandrescu (See Website For Email) wrote:
> James Kanze wrote:
> > I don't know quite what different definitions we could be using.
> > Undefined behavior occurs when the language specification places
> > no definition on the behavior. I don't know how you can easily
> > search for it, because it is the absence of a definition. Java
> > (and most other languages) don't use the term, or even specify
> > explicitly what they don't specify. So the response is rather
> > the opposite: unless you can find some statement in the language
> > specification which defines this behavior, it is undefined
> > behavior.

> I was hoping I'd be spared searching online docs, but now it looks
> like I have to, so be it.

> There might be a terminology confusion here, which I'd like to clear
> from the beginning:

> 1. A program "has undefined behavior" = effectively anything could
> happen as the result of executing that program. The metaphor with the
> demons flying out of one's nose comes to mind. Anything.

The example is meant to be taken humorously. Surely you don't think that the C++ standard would be improved, and that we would have eliminated all "undefined behavior", in any useful, realistic sense, if we added a clause to the standard saying that "in no case is a program allowed to cause demons to fly out of the programmer's nose."

In practice, "undefined behavior" is always somewhat restricted; in non-privileged code under Unix or Windows, for example, you may get a core dump, but you won't corrupt the system or even reformat the hard drive. The C++ standard prefers to not give even these guarantees, because C++ is conceived for use in areas where they don't apply---if you have undefined behavior in a device driver, you might end up reformatting the hard disk. Java can set concrete, specific limits, because it doesn't try to be usable in such contexts.
From the point of view of someone developing application (non-privileged) software, C++ has some limits as well. That doesn't mean that it doesn't have undefined behavior in such cases, at least not for any useful meaning of the expression.

> 2. A program "produces an undefined value" = the program could produce
> an unexpected value, while all other values, and that program's
> integrity, are not violated.

In practice, in real programs, it's much more complicated. "Values" interact, and the results of modifying values in the wrong order, and seeing those modifications, can result in behavior that the programmer cannot foresee. Not limited to unexpected values, but including unexpected exceptions, etc. If you violate the rules in Java, you cannot count on much in practice, any more than if you violate them in C++. (You can count on NOT getting a core dump, of course. Which I would consider a defect, more than an advantage.)

> The two are fundamentally different because in the second case you can
> still count on objects being objects etc.; the memory safety of the
> program has not been violated. Therefore the program is much easier to
> debug.

Memory safety is only one part of "undefined behavior". Not crashing when you have a serious error makes the program much harder to debug---if there's a weakness here in C++, it's that the crash is not guaranteed, not that it isn't forbidden. (But practically speaking, guaranteeing the crash in such cases is not implementable at reasonable cost.)

> C++ allows programs with (1). We might also consider that it allows
> programs with (2) under the name of "unspecified behavior" or
> "implementation-dependent behavior". (There would be a subtle difference
> there, but passons.)

There's a radical difference. For a practical programmer, there's really not any significant difference between "unspecified behavior" and "undefined behavior", unless there are serious restrictions on "unspecified".
Whereas I use implementation-defined behavior in just about every program I write.

> My current understanding is that Java programs never exhibit (1),

If you mean that Java guarantees that a program will never make demons fly out of your nose, you're probably right. If you mean that the program will behave in a reliable and predictable manner regardless of what I've coded, you're definitely wrong. The question is, I think, just how unreliable and unpredictable it has to be before we speak of "undefined behavior". I would say that there are certain cases involving threading where the behavior is so unreliable and unpredictable that I would consider it "undefined". Whether you agree with the actual word is really not the issue---the point is that for a practical programmer, you're faced with the same issues.

(Don't get me wrong---I think there is far too much of this problem in C++, and Java does handle it significantly better. The only cases I can think of in Java where it is a problem do involve threading, which is an extremely complex issue; in C++, you can get similar problems with even the simplest, single-threaded code, e.g. by returning a pointer or a reference to a local variable. Just because I refuse to accord Java the absolute doesn't mean that I don't recognize that it represents an orders-of-magnitude improvement in most cases.)

> and
> might exhibit (2) only on values that can't be read atomically (which
> remarkably are never pointers).

> To find out whether my understanding is
> correct, I looked up the language spec, which says after a discussion of
> the memory model (see
> http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.3):

> "Therefore, a data race cannot cause incorrect behavior such as
> returning the wrong length for an array."

Which is true, but it is a useless guarantee. I can get the wrong length from a java.util.Vector. The possibly useful guarantee is that if I use the wrong length, I still have defined behavior.
It would be even more useful if the guarantee was sensible; if the code was guaranteed to crash, instead of just throwing an exception which can be caught and ignored. (At least in my field of endeavour. I can quite understand that there are cases where the exception, if it is caught at a high enough level, might be appropriate. The trick would be to define a type of exception which can only be caught at a high enough level, so that lower level code can't mask its errors and return wrong results.)

> Later on that page, there is a section "17.7 Non-atomic Treatment of
> double and long" that discusses the exact issue we are talking about here.

> "Some implementations may find it convenient to divide a single write
> action on a 64-bit long or double value into two write actions on
> adjacent 32 bit values. For efficiency's sake, this behavior is
> implementation specific; Java virtual machines are free to perform
> writes to long and double values atomically or in two parts.

> For the purposes of the Java programming language memory model, a single
> write to a non-volatile long or double value is treated as two separate
> writes: one to each 32-bit half. This can result in a situation where a
> thread sees the first 32 bits of a 64 bit value from one write, and the
> second 32 bits from another write. Writes and reads of volatile long and
> double values are always atomic. Writes to and reads of references are
> always atomic, regardless of whether they are implemented as 32 or 64
> bit values.

> VM implementors are encouraged to avoid splitting their 64-bit values
> where possible. Programmers are encouraged to declare shared 64-bit
> values as volatile or synchronize their programs correctly to avoid
> possible complications."

> This section can be understood only if we know what a Java program does
> once it's read an invalid (say, NaN) value.

Will it crash? Can the VM avoid crashing, if the OS decides that that is what it wants to do?
More to the point, does the fact that a Java program cannot crash (IF that is the case) mean that Java has no undefined behavior, or is it more or less a specious guarantee, with about as much meaning as if C++ added a guarantee that no C++ program could make demons fly out of your nose? Do my programs suddenly lose all undefined behavior if I set SIGILL, SIGBUS, SIGSEGV and SIGFPE to ignore at the start?

--
James Kanze (GABI Software) email:james.kanze(a)gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
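[Editorial sketch, not part of the original post.] The split 64-bit write described in the quoted section 17.7 can be worked through with plain arithmetic. The C++ fragment below is single-threaded on purpose: it does not reproduce a race, it only computes which value a reader could end up with if it saw the high 32 bits from one write and the low 32 bits from another. The names `torn_value` and `self_check` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the value observed when the high 32 bits come
// from one write (`high_from`) and the low 32 bits from another
// (`low_from`).
std::uint64_t torn_value(std::uint64_t high_from, std::uint64_t low_from) {
    const std::uint64_t high_mask = 0xFFFFFFFF00000000ULL;
    const std::uint64_t low_mask = 0x00000000FFFFFFFFULL;
    return (high_from & high_mask) | (low_from & low_mask);
}

// Worked example with two consecutive counter values.
void self_check() {
    const std::uint64_t before = 0x00000000FFFFFFFFULL;  // 2^32 - 1
    const std::uint64_t after = 0x0000000100000000ULL;   // 2^32
    // Either mix yields a value that was never written by anyone.
    assert(torn_value(after, before) == 0x00000001FFFFFFFFULL);
    assert(torn_value(before, after) == 0);
}
```

Both mixed results are well-formed 64-bit integers, which is the "undefined value" case under discussion: the number is wrong, but it is still a number.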
From: James Kanze on 1 Dec 2006 14:50
Andrei Alexandrescu (See Website For Email) wrote:
> Gabriel Dos Reis wrote:
> > "Andrei Alexandrescu (See Website For Email)"
> > <SeeWebsiteForEmail(a)erdani.org> writes:

> > [...]

> > | There might be a terminology confusion here, which I'd like to clear
> > | from the beginning:

> > | 1. A program "has undefined behavior" = effectively anything could
> > | happen as the result of executing that program. The metaphor with the
> > | demons flying out of one's nose comes to mind. Anything.

> > Why is not that the value of the computation?

> > | 2. A program "produces an undefined value" = the program could produce
> > | an unexpected value, while all other values, and that program's
> > | integrity, are not violated.

> > | The two are fundamentally different because in the second case you can
> > | still count on objects being objects etc.;

> > I don't see anything fundamental in that difference.

> It's very simple. In one case you have a program that preserves its own
> guarantees (e.g. there's no random overwriting of memory), but which has
> one numerical value that's invalid; that can't corrupt memory because
> there's no pointer forging. In the other case you can't count on pretty
> much anything.

I think we understand this difference. I, at least, also recognize it as a positive point. The problem is understanding just to what extent it's relevant. To come back to a point you made: Java guarantees that you cannot get the wrong length for an array. Fine, but unless it can guarantee the same thing for Vector, or other similar types, has it really bought me anything? Individual values don't exist in a vacuum; they exist in relationships to other values. The effect is just as undefined as in C++, in practice.

Java certainly makes more guarantees than C++, and it also provides defined means of detecting a number of errors, but it isn't 100%. As far as I know, 100% isn't possible.

--
James Kanze (GABI Software) email:james.kanze(a)gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
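[Editorial sketch, not part of the original post.] The remark that individual values exist in relationships to other values can be shown with a small C++ model. It assumes a hypothetical two-field container, `Bag`, whose `count` is meant to mirror `items.size()`, and it records what an observer would see between the two steps of an update. The sketch is sequential; it stands in for an unsynchronized reader rather than reproducing one, and all names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: `count` is meant to always equal items.size().
struct Bag {
    std::vector<int> items;
    std::size_t count = 0;
};

// What an observer sees when it looks at a Bag at some instant.
struct Snapshot {
    std::size_t count;
    std::size_t actual;
    bool consistent() const { return count == actual; }
};

Snapshot observe(const Bag& b) {
    return Snapshot{b.count, b.items.size()};
}

// Appends in two separate steps and returns what an observer would
// have seen in between them.
Snapshot append_in_two_steps(Bag& b, int value) {
    assert(observe(b).consistent());  // precondition: invariant holds on entry
    b.items.push_back(value);         // step 1: contents updated
    Snapshot mid = observe(b);        // a reader arriving here sees a stale count
    b.count += 1;                     // step 2: count catches up
    return mid;
}
```

Each field in the intermediate snapshot is a perfectly valid value on its own; only the relationship between them is broken, and that is what code relying on the reported length would trip over.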