Prev: Looking for good source of information on Intel internal floating-point representations.
From: Noob on 10 Feb 2010 07:22

Todd Bezenek wrote:

> I am looking for a good reference which will allow me to understand
> the issues involved with the precision of Intel floating-point
> calculations. The main focus is the fact that Intel uses an internal
> representation with more bits than the value will have when stored.
> The questions I want to be able to answer via the reference include:
>
> 1. How/when does the representation change when I load a value from
> memory into an x87 register, do a calculation on it, move it to
> another register, store it back to memory.
>
> 2. The same as (1), but using the xmm (SSE) registers to do
> computations with scalar floating-point values.
>
> 3. The same as (1, 2), but using the xmm (SSE) registers to do
> computations with packed floating-point values.
>
> Thank you for any pointers to a good reference. Something I can find
> on-line is preferred as always.

comp.lang.asm.x86 is definitely worth your while.

The Intel documents you're looking for are
Intel Architectures Software Developer's Manuals
http://www.intel.com/products/processor/manuals/index.htm

You should probably focus on volume 1.
The Optimization Reference Manual might also provide some insight.

Regards.
From: Terje Mathisen on 10 Feb 2010 10:21

Todd Bezenek wrote:

> The questions I want to be able to answer via the reference include:
>
> 1. How/when does the representation change when I load a value from
> memory into an x87 register, do a calculation on it, move it to
> another register, store it back to memory.

This depends on the setting of the fp control word:

If set to double precision (instead of the default extended), each and
every operation will perform rounding to the specified precision, but
this only affects the mantissa, not the exponent which is still kept in
the 15-bit extended format.

The difference becomes visible if you do a set of calculations which
will temporarily over- or under-flow a double precision variable, but
subsequent operations bring it back into range.

It is only when you store variables to memory that they get rounded to
the actual double format, which also means that a value which will
underflow in memory can keep full precision as long as it is stored in a
register, but save/restore to memory will destroy it.

I.e. Intel x87 is better, but definitely different.

>
> 2. The same as (1), but using the xmm (SSE) registers to do
> computations with scalar floating-point values.

Pretty close to IEEE double/single semantics. (There might be some
details related to handling of special values which I've forgotten.)

>
> 3. The same as (1, 2), but using the xmm (SSE) registers to do
> computations with packed floating-point values.

Afaik SSE packed and scalar are identical.

>
> Thank you for any pointers to a good reference. Something I can find
> on-line is preferred as always.

As noob said, the Intel Arch manuals are available online.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
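
A rough way to see the point about the wider exponent is to model it in
software. The sketch below is a toy, not an x87 emulator: it keeps a
double-precision mantissa next to a separate integer exponent (a stand-in
for a register whose exponent field is wider than a double's) and only
converts back to a real double on "store". The names load, mul, div and
store are made up for this illustration, the exponent here is an unbounded
Python int rather than a 15-bit field, and zero, infinities and NaNs are
not handled.

```python
import math
import sys

def load(x):
    """Split a double into (mantissa, exponent) with x == m * 2**e."""
    return math.frexp(x)  # 0.5 <= |m| < 1 for nonzero finite x

def mul(a, b):
    """Multiply two register values; the mantissa is rounded to double."""
    m, e = math.frexp(a[0] * b[0])
    return (m, e + a[1] + b[1])

def div(a, b):
    """Divide two register values; the mantissa is rounded to double."""
    m, e = math.frexp(a[0] / b[0])
    return (m, e + a[1] - b[1])

def store(a):
    """Convert back to a real double, as a store to memory would."""
    try:
        return math.ldexp(a[0], a[1])  # may underflow to a subnormal or 0.0
    except OverflowError:
        return math.copysign(math.inf, a[0])

big = sys.float_info.max       # largest finite double
tiny = math.ldexp(1.0, -1074)  # smallest positive subnormal double

# Pure double arithmetic: the intermediate result leaves the double range
# and the information is lost for good.
print(big * 2.0 / 2.0)         # overflows to inf and stays there
print(tiny / 4.0 * 4.0)        # underflows to 0.0 and stays there

# Wide-exponent arithmetic: the intermediate is out of range for a double,
# but the final value is back in range by the time it is stored.
print(store(div(mul(load(big), load(2.0)), load(2.0))) == big)
print(store(mul(div(load(tiny), load(4.0)), load(4.0))) == tiny)
```

Storing the intermediate instead of the final value, e.g.
store(mul(load(big), load(2.0))), gives inf, which corresponds to the
"save/restore to memory will destroy it" case described above.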
From: Terje Mathisen on 10 Feb 2010 14:57

Gavin Scott wrote:

> Terje Mathisen<"terje.mathisen at tmsw.no"> wrote:
>> The difference becomes visible if you do a set of calculations which
>> will temporarily over- or under-flow a double precision variable, but
>> subsequent operations bring it back into range.
>
>> It is only when you store variables to memory that they get rounded to
>> the actual double format, which also means that a value which will
>> underflow in memory can keep full precision as long as it is stored in a
>> register, but save/restore to memory will destroy it.
>
> What happens on a context switch?

Nothing: The task save/restore will always handle the full 80-bit
format, independent of the control word setting.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: Noob on 11 Feb 2010 04:54

Todd Bezenek wrote:

> I have looked at the arch. manuals, and there is not a clear
> explanation about the 80-bit internal representation and when the
> conversion to a shorter representation happens, and how this changes
> when using SSE registers. The information is probably there, but not
> in a section dedicated to explaining this detail (that I could find).

CHAPTER 8
PROGRAMMING WITH THE X87 FPU

8.1.2 x87 FPU Data Registers

The x87 FPU data registers (shown in Figure 8-1) consist of eight 80-bit
registers. Values are stored in these registers in the double
extended-precision floating-point format shown in Figure 4-3.

8.1.5 x87 FPU Control Word

The 16-bit x87 FPU control word (see Figure 8-6) controls the precision
of the x87 FPU and rounding method used. It also contains the x87 FPU
floating-point exception mask bits. The control word is cached in the
x87 FPU control register. The contents of this register can be loaded
with the FLDCW instruction and stored in memory with the FSTCW/FNSTCW
instructions.

8.1.5.2 Precision Control Field

The precision-control (PC) field (bits 8 and 9 of the x87 FPU control
word) determines the precision (64, 53, or 24 bits) of floating-point
calculations made by the x87 FPU (see Table 8-2). The default precision
is double extended precision, which uses the full 64-bit significand
available with the double extended-precision floating-point format of
the x87 FPU data registers. This setting is best suited for most
applications, because it allows applications to take full advantage of
the maximum precision available with the x87 FPU data registers.

The double precision and single precision settings reduce the size of
the significand to 53 bits and 24 bits, respectively. These settings are
provided to support IEEE Standard 754 and to provide compatibility with
the specifications of certain existing programming languages. Using
these settings nullifies the advantages of the double extended-precision
floating-point format's 64-bit significand length. When reduced
precision is specified, the rounding of the significand value clears the
unused bits on the right to zeros.

The precision-control bits only affect the results of the following
floating-point instructions: FADD, FADDP, FIADD, FSUB, FSUBP, FISUB,
FSUBR, FSUBRP, FISUBR, FMUL, FMULP, FIMUL, FDIV, FDIVP, FIDIV, FDIVR,
FDIVRP, FIDIVR, and FSQRT.

8.1.5.3 Rounding Control Field

The rounding-control (RC) field of the x87 FPU control register (bits 10
and 11) controls how the results of x87 FPU floating-point instructions
are rounded. See Section 4.8.4, "Rounding," for a discussion of rounding
of floating-point values; See Section 4.8.4.1, "Rounding Control (RC)
Fields", for the encodings of the RC field.

8.2 X87 FPU DATA TYPES

The x87 FPU recognizes and operates on the following seven data types
(see Figures 8-13): single-precision floating point, double-precision
floating point, double extended-precision floating point, signed word
integer, signed doubleword integer, signed quadword integer, and packed
BCD decimal integers. With the exception of the 80-bit double
extended-precision floating-point format, all of these data types exist
in memory only. When they are loaded into x87 FPU data registers, they
are converted into double extended-precision floating-point format and
operated on in that format.

As a general rule, values should be stored in memory in double-precision
format. This format provides sufficient range and precision to return
correct results with a minimum of programmer attention. The
single-precision format is useful for debugging algorithms, because
rounding problems will manifest themselves more quickly in this format.
The double extended-precision format is normally reserved for holding
intermediate results in the x87 FPU registers and constants. Its extra
length is designed to shield final results from the effects of rounding
and overflow/underflow in intermediate calculations. However, when an
application requires the maximum range and precision of the x87 FPU (for
data storage, computations, and results), values can be stored in memory
in double extended-precision format.
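
To make the bit positions quoted above concrete, here is a small sketch
that pulls the PC and RC fields out of a 16-bit control word value such as
one written to memory by FSTCW. The bit positions (8-9 and 10-11) are from
the quoted text; the field encodings and the 0x037F power-on default are
as I recall them from the manual's Table 8-2 and Section 4.8.4.1, so treat
them as assumptions and check them against the manual. The function name
decode_control_word is made up for this example.

```python
# PC field encodings (assumed from Table 8-2); 0b01 is reserved.
PRECISION_BITS = {0b00: 24, 0b10: 53, 0b11: 64}

# RC field encodings (assumed from Section 4.8.4.1).
ROUNDING = {
    0b00: "nearest",
    0b01: "down",
    0b10: "up",
    0b11: "toward zero",
}

def decode_control_word(cw):
    """Decode the precision and rounding fields of an x87 control word."""
    pc = (cw >> 8) & 0b11    # bits 8 and 9
    rc = (cw >> 10) & 0b11   # bits 10 and 11
    return {
        "precision_bits": PRECISION_BITS.get(pc),  # None if reserved
        "rounding": ROUNDING[rc],
        "exception_masks": cw & 0x3F,              # bits 0 to 5
    }

# 0x037F: the value I believe FINIT leaves in the control word.
print(decode_control_word(0x037F))
# 0x027F: the same word with PC switched to double precision.
print(decode_control_word(0x027F))
```

With PC set to 53 bits the significand is rounded as for a double, but as
noted earlier in the thread the exponent range of the register is not
reduced, so results can still differ from true double arithmetic until
the value is stored.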