From: Andrew Reilly on 28 Jun 2010 08:11

Hi Jean-Marc,

On Mon, 28 Jun 2010 11:10:16 +0200, Jean-Marc Bourguet wrote:

> I think this is a case of different people wanting different things for
> C. The end result in gcc (options -fwrapv/-ftrapv allowing to ask for
> wrap around/trapping instead of letting it use all the latitude open
> by the standard) would be good if only -ftrapv worked reliably... It is
> more difficult to test if -fwrapv works -- on the one hand it is probably
> more tested than -ftrapv (it is implicit for Java according to the
> documentation), on the other hand the fact that -ftrapv doesn't isn't a
> confidence builder.

Thank-you, thank-you! I can't imagine what the optimiser is doing to
code without these flags, by default, but as long as you can make it
behave "sanely" (by my standards!) then I'm happy indeed. I apologise
for not having found this switch in the manuals myself, before now.

I guess "fast and mostly working" is good enough for most people/
applications...

Cheers,

--
Andrew
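For context on what the flags change: without -fwrapv, signed overflow is undefined in C, so a test written as `x + c < x` on signed operands may be optimised away. Below is a minimal sketch, assuming plain `int` operands, of a check that decides before adding and so does not depend on either flag. The function names are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Returns true if x + c would not fit in an int.
   The comparison is done against the limits first, so no signed
   overflow is ever evaluated, with or without -fwrapv/-ftrapv. */
bool add_would_overflow(int x, int c)
{
    if (c > 0)
        return x > INT_MAX - c;   /* INT_MAX - c cannot overflow when c > 0 */
    if (c < 0)
        return x < INT_MIN - c;   /* INT_MIN - c cannot overflow when c < 0 */
    return false;                 /* adding zero never overflows */
}

/* A few spot checks at the edges of the int range. */
void add_would_overflow_selftest(void)
{
    assert(!add_would_overflow(1, 2));
    assert(add_would_overflow(INT_MAX, 1));
    assert(add_would_overflow(INT_MIN, -1));
    assert(!add_would_overflow(INT_MIN, INT_MAX));
}
```

The pre-check costs an extra subtraction and branch compared with the wraparound idiom, which is the trade-off against relying on a compiler switch.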
From: EricP on 29 Jun 2010 16:24

Andy 'Krazy' Glew wrote:
> On 6/26/2010 10:40 AM, EricP wrote:
>> Andy 'Krazy' Glew wrote:
>>>
>>> Some examples:
>>>
>>
>> You are really doing signed 9 bit arithmetic there,
>> then casting the s9 result back to either a u8 or s8 type.
>> Whether there is an overflow or not depends on the result
>> type and the value.
>
> Exactly.
>
> And if I was doing the same in 16, 32, 64 bit arithmetic, I would be
> essentially doing the intermediate calculations in 17, 33, 65 bits, and
> casting back.
>
> And, since support for 9, 17, 33, 65 bits is not ubiquitous, and since
> doubling the width to 16, 32, 64 bits, which is ubiquitous, tends to cost
> a lot in performance (let alone the fact that extending precision from
> 64 bits, whether to 65 bits or 128 bits, is not ubiquitous), I am
> looking for expressions that allow detection of overflow based on modulo
> arithmetic.
>
> Although I tend to use a C-like notation to express this, I am NOT
> thinking in terms of C.
>
> Or, if you will - imagine that everything has been cast to the
> appropriate unsigned width. Since C defines unsigned as modulo
> arithmetic, that should not be subject to compiler transformations that
> make signed overflows undefined.

So we only have to calculate a single ninth sign bit value manually.

If you want the result type
  unsigned u_sum = unsigned u_a + signed s_b
then the result overflows if the sign is set at the end of the calculation.

u_a always zero extends so the initial value of the sign bit
is the sign of s_b, so
  sign = (s_b < 0);

The sign will toggle if there is a carry out of the sum, so
  carry = (u_a + s_b) < u_a;

If s_b < 0 then it will still be set if there is no carry,
or if s_b >= 0 then it will be set if there is a carry.
Putting it together gives
  overflow = (s_b < 0) != ((u_a + s_b) < u_a);
or alternatively
  overflow = ((s_b < 0) ^ ((u_a + s_b) < u_a)) != 0;

But ((u_a + s_b) < u_a) is the same as (((u_a + s_b) - u_a) < 0)
so just using the sign of the result we get
  u_sum = u_a + s_b;
  overflow = (s_b ^ (u_sum - u_a)) < 0;

(unless I've made a boo boo someplace)

Eric
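The sign/carry rule above can be sketched in C for the 8-bit case, assuming `uint8_t` and `int8_t` operands and explicit casts so that C's integer promotions do not hide the modulo-256 wrap. The function names are hypothetical. The sketch uses the comparison form, overflow = sign != carry. The final XOR simplification is not used because, as far as I can tell, in modulo arithmetic (u_sum - u_a) always gives back the bit pattern of s_b, so that expression would never report an overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* u8 + s8 -> u8 using only 8-bit modulo arithmetic.
   Stores the wrapped sum in *u_sum and returns true when the exact
   result does not fit in 0..255. */
bool add_u8_s8_overflows(uint8_t u_a, int8_t s_b, uint8_t *u_sum)
{
    uint8_t sum = (uint8_t)(u_a + (uint8_t)s_b); /* modulo-256 add */
    bool sign  = s_b < 0;    /* initial value of the ninth (sign) bit */
    bool carry = sum < u_a;  /* carry out of bit 7 toggles that bit */
    *u_sum = sum;
    return sign != carry;    /* ninth bit still set => out of range */
}

/* Exhaustive check of all 256 * 256 operand pairs against wide arithmetic. */
void check_all_u8_s8(void)
{
    for (int a = 0; a <= 255; a++) {
        for (int b = -128; b <= 127; b++) {
            uint8_t sum;
            bool ovf = add_u8_s8_overflows((uint8_t)a, (int8_t)b, &sum);
            int wide = a + b;                       /* exact result */
            assert(ovf == (wide < 0 || wide > 255));
            assert(sum == (uint8_t)wide);
            /* the wrapped difference recovers s_b's bit pattern */
            assert((uint8_t)(sum - a) == (uint8_t)b);
        }
    }
}
```

The same shape should carry over to 16, 32 and 64 bits by swapping the fixed-width types, since nothing in it needs a wider intermediate.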
From: EricP on 30 Jun 2010 12:03

Tim McCaffrey wrote:
>
> It is the same as the 32 bit version.
>
> What I found annoying when I was doing code generation for x64, is that 99% of
> the documentation is wrong about where the REX byte goes. I had to
> disassemble the code via Eclipse to figure out what was correct. (Then I had
> to argue with a co-worker about it...)

Both Intel and AMD documentation says REX comes after
any optional legacy prefixes and before the opcode.
Is that not correct?

Eric
From: Tim McCaffrey on 30 Jun 2010 15:21

In article <8WJWn.3131$cO.321(a)newsfe09.iad>, ThatWouldBeTelling(a)thevillage.com says...
>
>Tim McCaffrey wrote:
>>
>> It is the same as the 32 bit version.
>>
>> What I found annoying when I was doing code generation
>> for x64, is that 99% of
>> the documentation is wrong about where the REX byte
>> goes. I had to
>> disassemble the code via Eclipse to figure out what was
>> correct. (Then I had
>> to argue with a co-worker about it...)
>
>Both Intel and AMD documentation says REX comes after
>any optional legacy prefixes and before the opcode.
>Is that not correct?
>
>Eric
>
>

My problem was that it says (somewhere, can't find it right now) that 0x66,
0xF2, & 0xF3 should be considered part of the opcode for those instructions
that use it, not a prefix.

IOW, for MOV AX,BX 0x66 is a prefix, but for MOVQ mem64,xmm0 0x66 is part of
the opcode.

Well, that isn't the way it works. If you wrote your code generator to just
emit 0x66 0x0F 0xD6 <mod r/m bytes>, and you just want to prefix it with the
REX byte when you use xmm8..15, too bad. You have to stick the REX byte
between 0x66 and 0x0F.

Again, I can't find it right now, but I remember examples in both the Intel &
AMD documentation that showed (mostly) the wrong way to add the REX prefix,
and once where they had it correct.

Anyway, GCC & Linux figured it out before I had to, so the disassembly showed
me very quickly when I had it wrong.

- Tim
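The byte order described above can be sketched as a small emitter for MOVQ [base], xmmN (0x66 0x0F 0xD6 /r). This is a simplified illustration, not a full encoder: the function name is hypothetical, and it only handles a plain register-indirect destination with no SIB byte or displacement, so base registers whose low three bits are 4 or 5 are rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the encoding of MOVQ [base], xmm into out (needs up to 5 bytes).
   xmm and base are register numbers 0..15. Returns the number of bytes
   written, or 0 for operands this sketch does not handle. */
size_t emit_movq_store(uint8_t *out, unsigned xmm, unsigned base)
{
    assert(out != NULL);
    if (xmm > 15 || base > 15 || (base & 7) == 4 || (base & 7) == 5)
        return 0; /* would need a SIB byte or a displacement */

    /* REX = 0100WRXB: R extends ModRM.reg (xmm), B extends ModRM.rm (base) */
    uint8_t rex = (uint8_t)(0x40 | ((xmm >> 3) << 2) | (base >> 3));

    size_t n = 0;
    out[n++] = 0x66;           /* 0x66 is emitted first ... */
    if (rex != 0x40)
        out[n++] = rex;        /* ... REX goes between 0x66 and 0x0F */
    out[n++] = 0x0F;
    out[n++] = 0xD6;
    out[n++] = (uint8_t)(((xmm & 7) << 3) | (base & 7)); /* ModRM, mod=00 */
    return n;
}
```

With xmm = 8 and base = 0 (rax) this produces 66 44 0F D6 00. Putting the REX byte in front of 0x66 instead would leave it separated from the opcode bytes, which is the mistake the post describes.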
From: Tim McCaffrey on 1 Jul 2010 15:57
In article <30028ecd-f025-4c05-bd8a-93c99e00a8a8(a)a30g2000yqn.googlegroups.com>, MitchAlsup(a)aol.com says...
>No assumption is needed on 1s-complement or 2s-complement machines.
>{Does anyone know of a machine using integer signed-magnitude that is
>still existing?}
>

Why, yes, the Unisys Clearpath Libra systems (aka MCP systems).

And they do all this stuff. Bounds checking. Integer overflow detection.

And (x+c) < x will never work on an MCP system. Except in C, because the
compiler emulates 2s-complement.

- Tim