From: onkars on 6 May 2010 01:27

Hi, I am a student working with the Xilinx C model of the FFT. These are my settings:

Pipelined architecture (this restricts the scaling to being applied only
after every pair of radix-2 butterflies).

Data precision: 8 bits.

Twiddle precision: 8 bits.

FFT size: 1024.

If I use a conservative scaling schedule of [2, 2, 2, 2, 2] (i.e. divide
by 4 after every pair of radix-2 butterflies), most of the outputs are 0.
Is this possible -- i.e. are the outputs being scaled down too far?

If I use block floating point, I get much better results (close to the
floating-point golden outputs).

A loose scaling schedule of [1, 1, 1, 1, 1] (i.e. divide by 2 after every
pair of radix-2 butterflies) causes overflow.

Other than using block floating point OR increasing my precision, is
there any other way of achieving better results (not so many 0s)? I also
want my design to be protected from overflow under any input conditions
(assume this is general purpose).

Thank you.
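(An illustrative numpy sketch of the scaling arithmetic, assuming uniformly
distributed 8-bit input as described above. It only models the unconditional
divide-by-2^10 that a [2, 2, 2, 2, 2] schedule implies, with a single
rounding step at the 8-bit output; the actual pipeline also rounds or
truncates after every stage pair, so the real core loses even more of the
small values.)

import numpy as np

# Illustrative sketch only -- not the Xilinx core.  It models just the final
# divide-by-2^10 implied by a [2, 2, 2, 2, 2] schedule plus one rounding step
# at the 8-bit output; the real pipeline also truncates after every stage
# pair, which erodes the result further.
rng = np.random.default_rng(0)
N = 1024
x = rng.integers(-128, 128, N).astype(float)   # uniformly distributed 8-bit input

X = np.fft.fft(x)                 # full-precision reference
scaled = X / 2 ** 10              # schedule [2,2,2,2,2] = divide by 2 ten times

q_re = np.clip(np.round(scaled.real), -128, 127)   # quantize to an 8-bit output word
q_im = np.clip(np.round(scaled.imag), -128, 127)
mag = np.abs(q_re + 1j * q_im)

# For noise-like input the typical bin magnitude is only about sqrt(N)*sigma
# (~2400) before scaling, i.e. about 2 LSBs after dividing by 1024 -- nearly
# the whole 8-bit word is spent guarding against a full-scale worst case that
# random input never reaches.
print("bins quantized to exactly 0:", int(np.sum(mag == 0)))
print("bins at 2 LSBs or less:    ", int(np.sum(mag <= 2)))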
From: Tim Wescott on 6 May 2010 12:41

onkars wrote:
> Hi, I am a student working with the Xilinx C model of the FFT. These are
> my settings:
>
> Pipelined architecture (this restricts the scaling to being applied only
> after every pair of radix-2 butterflies).
>
> Data precision: 8 bits.
>
> Twiddle precision: 8 bits.
>
> FFT size: 1024.
>
> If I use a conservative scaling schedule of [2, 2, 2, 2, 2] (i.e. divide
> by 4 after every pair of radix-2 butterflies), most of the outputs are 0.
> Is this possible -- i.e. are the outputs being scaled down too far?
>
> If I use block floating point, I get much better results (close to the
> floating-point golden outputs).
>
> A loose scaling schedule of [1, 1, 1, 1, 1] (i.e. divide by 2 after every
> pair of radix-2 butterflies) causes overflow.
>
> Other than using block floating point OR increasing my precision, is
> there any other way of achieving better results (not so many 0s)? I also
> want my design to be protected from overflow under any input conditions
> (assume this is general purpose).
>
> Thank you.

I simply don't see, for anything but some predefined signal with known
characteristics, how a 1024-bin FFT is going to benefit you in any way at
such low precision. As a rule of thumb, to catch everything that's going
on you should have an output precision that's 10 bits deeper than your
input -- so 8 bits out is just insufficient, any way you slice it.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
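(Where the 10-bit figure comes from, as a hedged aside: the worst-case
growth of an un-scaled N-point FFT is log2(N) bits, since
|X[k]| <= sum |x[n]| <= N * max|x[n]|, and log2(1024) = 10. A quick numpy
check:)

import numpy as np

N = 1024
# Worst-case growth bound for an un-scaled N-point DFT:
#   |X[k]| <= sum |x[n]| <= N * max|x[n]|
# so the output can need log2(N) extra integer bits.
print("extra output bits (worst case):", int(np.log2(N)))               # -> 10

# The bound is actually reached by a full-scale constant (DC) input.
x = np.full(N, 127.0)
print("growth reached:", np.abs(np.fft.fft(x)).max() / 127.0)           # -> 1024.0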
From: onkars on 6 May 2010 13:26

>onkars wrote:
>> Hi, I am a student working with the Xilinx C model of the FFT.

[snip original settings]

>I simply don't see, for anything but some predefined signal with known
>characteristics, how a 1024-bin FFT is going to benefit you in any way at
>such low precision. As a rule of thumb, to catch everything that's going
>on you should have an output precision that's 10 bits deeper than your
>input -- so 8 bits out is just insufficient, any way you slice it.

Actually, I use a randomly generated (uniformly distributed) input that
gives me pretty good outputs (an SNR of 39 dB) when I use block floating
point with a precision of 8 bits.
From: Steve Pope on 6 May 2010 13:39

onkars <onkar.sarode(a)n_o_s_p_a_m.gmail.com> wrote:

> (Tim wrote)

>>onkars wrote:

>>> Hi, I am a student working with the Xilinx C model of the FFT.

[snip precision discussion]

>>I simply don't see, for anything but some predefined signal with known
>>characteristics, how a 1024-bin FFT is going to benefit you in any way at
>>such low precision. As a rule of thumb, to catch everything that's going
>>on you should have an output precision that's 10 bits deeper than your
>>input -- so 8 bits out is just insufficient, any way you slice it.

>Actually, I use a randomly generated (uniformly distributed) input that
>gives me pretty good outputs (an SNR of 39 dB) when I use block floating
>point with a precision of 8 bits.

That's a good way to do it.

In more detail, I would follow a procedure along these lines:

(1) Implement the FFT in high precision, such as double-precision
floating point.

(2) Run a series of randomly generated test cases through it, at
different RMS levels over the dynamic range of interest. Save the
resulting inputs and outputs as test vectors.

(3) Implement the FFT at the designed target precisions -- input,
internal, and output.

(4) Run the same vectors through this fixed-point version and compare
its output to that of the full-precision version. From this, generate a
plot of RMS error vs. input level, and determine whether it meets your
requirements.

If it does not, revise the precision, go back to step (3), and try again.

(What you will probably find is that you need to keep around 4 to 6
underflow bits at each internal stage, and that your output needs at
least two bits more precision than your input. But it depends on the
details of both your requirements and your implementation.)

Steve
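(A rough harness for the four steps above, as a Python/numpy sketch.
fixed_point_fft here is only a stand-in that quantizes the input and output
to assumed 8-bit widths around an ideal FFT with an assumed divide-by-2^10
schedule; in practice you would call the bit-accurate Xilinx C model in its
place.)

import numpy as np

def quantize(v, bits):
    # Round to a signed fixed-point grid of the given width, saturating at full scale.
    top = 2 ** (bits - 1) - 1
    return np.clip(np.round(v), -top - 1, top)

def fixed_point_fft(x, in_bits=8, out_bits=8, shift=10):
    # Stand-in for the device under test: quantized input, ideal FFT,
    # unconditional divide-by-2^shift scaling, quantized output.
    xq = quantize(x, in_bits)
    X = np.fft.fft(xq) / 2 ** shift
    return quantize(X.real, out_bits) + 1j * quantize(X.imag, out_bits)

rng = np.random.default_rng(1)
N = 1024
print(" input RMS | output SNR (dB)")
for rms in (4.0, 16.0, 64.0, 127.0 / np.sqrt(3)):        # step (2): sweep input levels
    x = rng.uniform(-1.0, 1.0, N) * rms * np.sqrt(3)     # uniform noise with that RMS
    ref = np.fft.fft(x) / 2 ** 10                        # step (1): full-precision reference
    dut = fixed_point_fft(x)                             # step (3): target-precision version
    err = dut - ref                                      # step (4): compare the two
    snr = 10 * np.log10(np.mean(np.abs(ref) ** 2) / np.mean(np.abs(err) ** 2))
    print(f"{rms:10.1f} | {snr:8.1f}")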
From: onkars on 6 May 2010 14:08

>onkars <onkar.sarode(a)n_o_s_p_a_m.gmail.com> wrote:

[snip test-procedure discussion]

>(What you will probably find is that you need to keep around 4 to 6
>underflow bits at each internal stage, and that your output needs at
>least two bits more precision than your input. But it depends on the
>details of both your requirements and your implementation.)
>
>Steve

@Steve -- thank you for the response. I am sorry, but I don't understand
what you mean by "keep around 4 to 6 underflow bits at each internal
stage".