The D Programming Language [C++]

From: Gabriel Dos Reis on 30 Nov 2006 10:27

"Peter Dimov" <pdimov(a)gmail.com> writes:

| Walter Bright wrote:
|
| > Here's what Digital Mars C++ does, which implements C99 complex numbers:
| >
| > ------------------ program ------------------
| > #include <complex.h>
| >
| > complex long double f( complex long double c )
| > {
| > return c;
| > }
| > ------------------- asm ---------------------------
| > ?f@@YA_W_W@Z:
| > fld tbyte ptr 4[ESP]
| > fld tbyte ptr 0Eh[ESP]
| > ret
| > --------------------------------------------------
|
| So your point is that having complex as a built-in allows you to define
| an ABI that returns it in ST0:ST1 (but you still pass it on the stack.)

I believe I must again inject facts into this surreal discussion.
Just because a type is blessed as built-in by the language definition
does not mean that compilers will actually use registers when passing
arguments or returning values. The 32-bit PowerPC processor-specific ABI

http://www.opensolaris.org/os/community/power_pc/powerpc_doc_library/elfspec_ppc.pdf

specifies (page 3-19) that when values of the built-in type "long
double" are passed "by value" in a function call, the compiler must
pass that value "by reference" and introduce a copy (on the stack) where
necessary to enforce pass-by-value semantics.

Similarly, when a value of the built-in type "long double" is
returned, the ABI specifies that the compiler should use the stack and
instead return the address of the value on the stack (page 3-22):

Values of type long double and structures or unions that do not meet
the requirements for being returned in registers are returned in a
storage buffer allocated by the caller. The address of
this buffer is passed as a hidden argument in r3 as if it were the
first argument, causing gr in the argument passing algorithm above
to be initialized to 4 instead of 3.

That also suggests that the 32-bit PowerPC ABI specifies that values
of UDT can be returned in registers when they meet specific
conditions. Indeed, on page 3-22:

A structure or union whose size is less than or equal to 8 bytes
shall be returned in r3 and r4, as if it were first stored in an
8-byte aligned memory area and then the low-addressed word were
loaded into r3 and the high-addressed word into r4. Bits beyond the
last member of the structure or union are not defined.

This psABI is a living example of an ABI which does not guarantee
register use for passing and returning values of built-in types, but
which uses registers to return values of structure types if they
are small enough.
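
As a rough illustration of the size rule quoted above (this is not part
of the psABI text, and the names Pair32, fits_r3_r4 and make_pair32 are
made up for the example), a C++ sketch might look like the following. It
only mirrors the "less than or equal to 8 bytes" test; what a compiler
actually emits depends on the target and its ABI.

```cpp
#include <cstdint>

// Hypothetical user-defined type: two 32-bit words, 8 bytes in total.
struct Pair32 {
    std::int32_t lo;
    std::int32_t hi;
};

// Hypothetical helper that mirrors only the size condition quoted from
// page 3-22. It does not model any other requirement of the real ABI.
template <typename T>
constexpr bool fits_r3_r4() {
    return sizeof(T) <= 8;
}

// Under the quoted rule, a conforming 32-bit PowerPC compiler would return
// this structure in r3 and r4 rather than through a hidden buffer.
Pair32 make_pair32(std::int32_t a, std::int32_t b) {
    return Pair32{a, b};
}
```

On other targets the same source may well be returned differently; the
point is only that the decision is made by the ABI from the size of the
type, not from whether the type is built-in.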

| I admit that it's unlikely for any ABI to return UDTs in registers.

Fortunately, not all compilers out there have decided, as the Digital
Mars compiler has, to actively unsupport abstractions.

The Itanium Software Conventions and Runtime Architecture Guide
specifies (page 8-13) that certain aggregates are returned in
registers. That is lifted to C++ UDTs if they have no non-trivial
*copy-constructor* or *destructor*. See the "common C++ ABI"
specification

http://www.codesourcery.com/cxx-abi/abi.html#calls

In general, C++ return values are handled just like C return
values. This includes class type results returned in
registers. However, if the return value type has a non-trivial copy
constructor or destructor, the caller allocates space for a
temporary, and passes a pointer to the temporary as an implicit first
parameter preceding both the this parameter and user parameters. The
callee constructs the return value into this temporary.

A result of an empty class type will be returned as though it were a
struct containing a single char, i.e. struct S { char c; };. The
actual content of the return register is unspecified. On Itanium, the
associated NaT bit must not be set.
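
The condition in the passage quoted above can be approximated with
standard type traits. The sketch below is an illustration only: the
helper returned_like_c_value and the three structs are invented for the
example, and the check covers just the two properties the quoted text
names (copy constructor and destructor). Whether a qualifying type really
ends up in registers still depends on the underlying C calling
convention, for instance on its size.

```cpp
#include <type_traits>

// Hypothetical helper: true when T has neither a non-trivial copy
// constructor nor a non-trivial destructor, i.e. when the quoted rule
// says the return value is handled just like a C return value.
template <typename T>
constexpr bool returned_like_c_value() {
    return std::is_trivially_copy_constructible<T>::value
        && std::is_trivially_destructible<T>::value;
}

// Plain aggregate: trivial copy constructor and destructor.
struct Point {
    double x;
    double y;
};

// User-provided copy constructor: the caller passes a pointer to a temporary.
struct Traced {
    int id;
    Traced(const Traced& other) : id(other.id) {}
};

// User-provided destructor: the caller passes a pointer to a temporary.
struct Owner {
    int id;
    ~Owner() {}
};

static_assert(returned_like_c_value<Point>(), "Point is handled like a C struct");
static_assert(!returned_like_c_value<Traced>(), "Traced needs the hidden pointer");
static_assert(!returned_like_c_value<Owner>(), "Owner needs the hidden pointer");
```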

--
Gabriel Dos Reis
gdr(a)integrable-solutions.net

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Andrei Alexandrescu (See Website For Email) on 30 Nov 2006 15:35

James Kanze wrote:
> I don't know quite what different definitions we could be using.
> Undefined behavior occurs when the language specification places
> no definition on the behavior. I don't know how you can easily
> search for it, because it is the absence of a definition. Java
> (and most other languages) don't use the term, or even specify
> explicitly what they don't specify. So the response is rather
> the opposite: unless you can find some statement in the language
> specification which defines this behavior, it is undefined
> behavior.

I was hoping I'd be spared searching the online docs, but now it looks
like I have to, so be it.

There might be a terminology confusion here, which I'd like to clear
from the beginning:

1. A program "has undefined behavior" = effectively anything could
happen as the result of executing that program. The metaphor with the
demons flying out of one's nose comes to mind. Anything.

2. A program "produces an undefined value" = the program could produce
an unexpected value, while all other values, and that program's
integrity, are not violated.

The two are fundamentally different because in the second case you can
still count on objects being objects etc.; the memory safety of the
program has not been violated. Therefore the program is much easier to
debug.

C++ allows programs with (1). We might also consider that it allows
programs with (2) under the name of "unspecified behavior" or
"implementation-dependent behavior". (There would be a subtle difference
there, but let's move on.)

My current understanding is that Java programs never exhibit (1), and
might exhibit (2) only on values that can't be read atomically (which
remarkably are never pointers). To find out whether my understanding is
correct, I looked up the language spec, which says after a discussion of
the memory model (see
http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.3):

"Therefore, a data race cannot cause incorrect behavior such as
returning the wrong length for an array."

Later on that page, there is a section "17.7 Non-atomic Treatment of
double and long" that discusses the exact issue we are talking about here.

"Some implementations may find it convenient to divide a single write
action on a 64-bit long or double value into two write actions on
adjacent 32 bit values. For efficiency's sake, this behavior is
implementation specific; Java virtual machines are free to perform
writes to long and double values atomically or in two parts.

For the purposes of the Java programming language memory model, a single
write to a non-volatile long or double value is treated as two separate
writes: one to each 32-bit half. This can result in a situation where a
thread sees the first 32 bits of a 64 bit value from one write, and the
second 32 bits from another write. Writes and reads of volatile long and
double values are always atomic. Writes to and reads of references are
always atomic, regardless of whether they are implemented as 32 or 64
bit values.

VM implementors are encouraged to avoid splitting their 64-bit values
where possible. Programmers are encouraged to declare shared 64-bit
values as volatile or synchronize their programs correctly to avoid
possible complications."

This section can be understood only if we know what a Java program does
once it's read an invalid (say, NaN) value. Will it crash?
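
To make the split write described in 17.7 concrete, here is a small
single-threaded C++ sketch that replays one unlucky interleaving by hand.
It is a model, not Java and not a real data race: SplitLong, write_low,
write_high, read and torn_read_demo are all hypothetical names, and the
order in which the halves are written is an assumption made for the
example.

```cpp
#include <cstdint>

// Hypothetical model of a 64-bit variable stored as two 32-bit halves.
struct SplitLong {
    std::uint32_t low;
    std::uint32_t high;
};

// Each function writes one half, as an implementation that splits the
// 64-bit write into two 32-bit writes might do.
void write_low(SplitLong& v, std::uint64_t value) {
    v.low = static_cast<std::uint32_t>(value);
}

void write_high(SplitLong& v, std::uint64_t value) {
    v.high = static_cast<std::uint32_t>(value >> 32);
}

// Reassemble the 64-bit value from whatever the two halves hold right now.
std::uint64_t read(const SplitLong& v) {
    return (static_cast<std::uint64_t>(v.high) << 32) | v.low;
}

// Replays one interleaving: the reader runs between the writer's two halves.
std::uint64_t torn_read_demo() {
    SplitLong v{0, 0};                                // old value: 0
    const std::uint64_t new_value = 0xFFFFFFFFFFFFFFFFULL;
    write_high(v, new_value);                         // writer, first half
    const std::uint64_t seen = read(v);               // reader sneaks in here
    write_low(v, new_value);                          // writer, second half
    return seen;                                      // 0xFFFFFFFF00000000
}
```

The reader ends up with a value that no thread ever wrote, yet in this
model it is still an ordinary 64-bit bit pattern; nothing else in the
program's state has been disturbed.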

Andrei


From: Fred Long on 30 Nov 2006 15:40

James Kanze wrote:
> ...
>>>> Writing to a double while another thread is reading it is
>>>> undefined behavior in Java.
>
>>> I've run a number of searches for that ("java undefined behavior
>>> double", "java undefined behavior threads double" etc.), to no avail. I'd
>>> be glad if you provided a reference. Thanks!
>
>>> Maybe (also) we're using slightly different definitions for undefined
>>> behavior?
>
>> I am still waiting for a response on this issue, or a retraction of the
>> initial statement.
>
> Sorry, I missed your previous posting. (I've not been following
> this thread too closely, since D is not a topic which interests
> me much.)
>
> I don't know quite what different definitions we could be using.
> Undefined behavior occurs when the language specification places
> no definition on the behavior. I don't know how you can easily
> search for it, because it is the absence of a definition. Java
> (and most other languages) don't use the term, or even specify
> explicitly what they don't specify. So the response is rather
> the opposite: unless you can find some statement in the language
> specification which defines this behavior, it is undefined
> behavior.
>
> Actually, I think there are even more cases, involving multiple
> writes to different variables. But the case of double or long
> is flagrant, since the language specification does not require
> the writes to be atomic.
> ...

See: http://java.sun.com/docs/books/jls/third_edition/html/memory.html

Section 17.7 Non-atomic Treatment of double and long

Fred Long.


From: Gabriel Dos Reis on 30 Nov 2006 22:22

"Andrei Alexandrescu (See Website For Email)"
<SeeWebsiteForEmail(a)erdani.org> writes:

[...]

| There might be a terminology confusion here, which I'd like to clear
| from the beginning:
|
| 1. A program "has undefined behavior" = effectively anything could
| happen as the result of executing that program. The metaphor with the
| demons flying out of one's nose comes to mind. Anything.

Why is that not the value of the computation?

| 2. A program "produces an undefined value" = the program could produce
| an unexpected value, while all other values, and that program's
| integrity, are not violated.
|
| The two are fundamentally different because in the second case you can
| still count on objects being objects etc.;

I don't see anything fundamental in that difference.

--
Gabriel Dos Reis
gdr(a)integrable-solutions.net

From: David Abrahams on 30 Nov 2006 22:25

"Andrei Alexandrescu (See Website For Email)"
<SeeWebsiteForEmail(a)erdani.org> writes:

> There might be a terminology confusion here, which I'd like to clear
> from the beginning:
>
> 1. A program "has undefined behavior" = effectively anything could
> happen as the result of executing that program. The metaphor with the
> demons flying out of one's nose comes to mind. Anything.
>
> 2. A program "produces an undefined value" = the program could produce
> an unexpected value, while all other values, and that program's
> integrity, are not violated.
>
> The two are fundamentally different because in the second case you can
> still count on objects being objects etc.; the memory safety of the
> program has not been violated. Therefore the program is much easier to
> debug.

Seriously?

IME you're at least likely to crash noisily close to the undefined
behavior. If you make everything defined the program necessarily
soldiers on until one of your own internal checks is able to notice
that something went wrong. Or am I missing something?

I don't have any real experience with Java, but Python generally
exhibits Java-like behavior, and I don't find it easier to debug than
C++.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com
