From: pnachtwey on 7 Jul 2010 01:25

On Jul 6, 3:33 pm, HardySpicer <gyansor...(a)gmail.com> wrote:
> For floating point arithmetic how much faster is an add/subtract than
> a multiply/accumulate? (percentage wise).
>
> Hardy

I have experience with the C32 and C33. Multiplies and adds or multiplies and subtracts can happen at a rate of 1 per clock cycle, but that doesn't mean they complete in that time, as others have mentioned. The C32 and C33 can sometimes do two floating point operations per cycle, but usually one of them is a fetch from memory. The big enemy is not multiplies, adds or subtracts but divides, along with pipeline stalls from getting data and storing it back to memory. To estimate time I usually count memory cycles.

Peter Nachtwey
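A minimal sketch, not part of the thread: a FIR-style multiply-accumulate inner loop in C, annotated with the memory accesses per iteration that the post above suggests counting. The function name, array names, and tap count are illustrative assumptions, not anything from the original posts.

/* Estimate cost by counting memory cycles, not arithmetic. */
float fir_mac(const float *x, const float *h, int ntaps)
{
    float acc = 0.0f;
    for (int i = 0; i < ntaps; i++) {
        /* Per iteration: 2 operand reads (x[i], h[i]), 1 multiply, 1 add.
           On a C32/C33-class DSP the multiply-add can issue every cycle,
           so the two memory fetches, not the arithmetic, set the pace. */
        acc += x[i] * h[i];
    }
    return acc;
}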
From: Vladimir Vassilevsky on 7 Jul 2010 11:31

pnachtwey wrote:
> On Jul 6, 3:33 pm, HardySpicer <gyansor...(a)gmail.com> wrote:
>
>> For floating point arithmetic how much faster is an add/subtract than
>> a multiply/accumulate? (percentage wise).
>>
>> Hardy
>
> I have experience with the C32 and C33. Multiplies and adds or
> multiplies and subtracts can happen at a rate of 1 per clock cycle, but
> that doesn't mean they complete in that time, as others have mentioned.
> The C32 and C33 can sometimes do two floating point operations, but
> usually one is fetching from memory. The big enemy is not multiplies,
> adds or subtracts but divides.

IIRC there is no penalty for floating point division in Intel P5+ CPUs; with their huge pipelines, all arithmetic operations have the same cost.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
From: Manny on 7 Jul 2010 18:45

On Jul 6, 11:33 pm, HardySpicer <gyansor...(a)gmail.com> wrote:
> For floating point arithmetic how much faster is an add/subtract than
> a multiply/accumulate? (percentage wise).
>
> Hardy

It depends on your code, compiler, and architecture. Best practice is to measure this statistically.

I tend to think architecture all the time. Sometimes a memory-memory instruction can make all the difference in the world, and even with increased compiler smartness there is no substitute for human prudence, because nobody but you can understand exactly what you're after and what you can do with or without.

-Momo
From: Manny on 7 Jul 2010 19:24

On Jul 7, 11:45 pm, Manny <mlou...(a)hotmail.com> wrote:
> On Jul 6, 11:33 pm, HardySpicer <gyansor...(a)gmail.com> wrote:
>
> > For floating point arithmetic how much faster is an add/subtract than
> > a multiply/accumulate? (percentage wise).
> >
> > Hardy
>
> It depends on your code, compiler, and architecture. Best practice is
> to measure this statistically.
>
> I tend to think architecture all the time. Sometimes a memory-memory
> instruction can make all the difference in the world, and even with
> increased compiler smartness there is no substitute for human prudence,
> because nobody but you can understand exactly what you're after and
> what you can do with or without.
>
> -Momo

Ah, well, that was in reply to the other posts rather than to your original question. If you're building a case against something, power might be of relevance here.

-Momo
From: bryant_s on 8 Jul 2010 14:10

> For floating point arithmetic how much faster is an add/subtract than
> a multiply/accumulate? (percentage wise).
>
> Hardy

The previous replies are correct if your metric is programmable processor clock cycles.

In the hardware, a floating point number consists of a mantissa (normalized fractional portion) and an exponent (the power of 2 of the number). When multiplying, the two mantissas are multiplied in a fashion similar to fixed-point multiplies, and the exponents are added. The result's exponent is then adjusted to re-normalize the mantissa. The mantissa is generally normalized to lie in [0.5, 1.0), so the product of two mantissas ranges over [0.25, 1.0); re-normalizing that result back to the [0.5, 1.0) range can require an extra shift of 1 bit (i.e., 1 added to the resultant exponent). However, standard floating point multipliers also check for floating-point overflow (exponent too large) and zero, which adds another level of logic at the output.

So the mantissa multiply is of (relative) complexity M^2, where M = number of mantissa bits. The exponent add is of (relative) complexity N, where N = number of exponent bits. The single-bit shift, depending on how it's done, can be extremely simple, but let's call it complexity N because of the exponent decrement.

The addition requires that the two numbers be adjusted so they have the same exponent. This takes a compare of the two exponents (complexity N), a shift of the smaller number to match the larger (complexity 2M), an add of the mantissas (complexity M), a small-shift adjustment of the result (complexity N), plus the miscellaneous logic to check overflow and zero.

So, ignoring the output checking logic, a VERY rough estimate is that a floating point multiply is of complexity (M^2 + N + N), while a floating point add is of complexity (N + 2M + M + N). Based on your specific floating point format, you can then calculate your percentage comparison.

That being said, there are tricks to simplify this. For instance, the final single-bit adjust of the multiply output can be incorporated into the exponent add with some look-ahead logic. I also add the caveat that I am making the gross assumption that size / # of gates <--> delay.

Bryant Sorensen
DSP Platforms
Starkey Laboratories
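A minimal sketch, not part of Bryant's post: plugging IEEE-754 single precision (M = 24 mantissa bits including the hidden bit, N = 8 exponent bits) into the rough complexity estimates above. The formulas are from the post; the choice of format and the C program itself are illustrative assumptions.

#include <stdio.h>

int main(void)
{
    int M = 24;  /* mantissa bits (23 stored + 1 hidden) */
    int N = 8;   /* exponent bits */

    /* multiply: mantissa multiply + exponent add + single-bit renormalize */
    int mul = M * M + N + N;
    /* add: exponent compare + align shift + mantissa add + result adjust */
    int add = N + 2 * M + M + N;

    printf("multiply ~ %d, add ~ %d, ratio ~ %.1fx\n",
           mul, add, (double)mul / add);
    /* prints: multiply ~ 592, add ~ 88, ratio ~ 6.7x */
    return 0;
}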