From: rickman on 2 Jun 2010 11:41
Sym,

You write without knowing anything about me. I have a very high rate of success on the board because of the extensive simulations I run. But you can never eliminate the need to probe real hardware.

As to the SI issues, sure, you can get very high speed signals out of an FPGA. You can also drive static signals and everything in between. Many of my designs work very well in QFP packages and have no need for special SI approaches. When I am running a 32 MHz clock, 1 ns edge rates are not needed, so I slow them down to help prevent SI problems.

Oh, yeah, I *do* have a logic analyzer and it comes in very useful. I was able to use it recently to find a configuration problem where my customer had missed an aspect of how to properly initialize the digital PLL used in the FPGA. No amount of simulation would have caught that!

Rick
From: rickman on 2 Jun 2010 11:59
On Jun 1, 9:11 pm, -jg <jim.granvi...(a)gmail.com> wrote:
> On Jun 2, 7:25 am, rickman <gnu...(a)gmail.com> wrote:
> > The software would have about 100 ns from the last address "chunk"
> > being clocked, 60 ns from the command flag going high and much less
> > than 30 ns from the command being clocked to driving the first output
> > bit. I doubt it can be done at 62.5 MIPs.
>
> What's the data rate ?
>
> We have done a number of systems, where a smallish CPLD takes the
> ns-level stuff, and dual-edge etc and converts it into a more
> micro-compatible form.
>
> Sometimes that has been parallel, and sometimes SPI/SSC

The strobe that clocks the data is 33 ns high and 67 ns low. So the clock rate is 10 MHz, with two data/address bits input or one data bit output on each strobe.

The problem with using a small CPLD is that the register set is up to 32 8-bit registers. With a 100 ns address-to-output time, there is little chance of the read being done unless a copy of all registers exists in the CPLD. Also, some of the bits to be read are real-time status bits. If the processor can get an interrupt, read the address and write the readback data to the CPLD, then it could work, but it has to happen in 100 ns. If they had just used a standard SPI interface it would have been a lot easier...

Rick
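As a rough check on the figures quoted in this post, the strobe rate and the number of instructions that fit in each response window can be worked out directly. This is only a back-of-envelope sketch: the function names are made up for illustration, and it assumes one instruction per cycle, which is what the "62.5 MIPS" figure implies.

```python
# Back-of-envelope timing budget using the numbers quoted in the thread.
# Assumption: the micro retires one instruction per cycle at 62.5 MIPS.

def strobe_rate_mhz(high_ns, low_ns):
    """Strobe frequency in MHz from its high and low times in ns."""
    return 1000.0 / (high_ns + low_ns)

def instruction_ns(mips):
    """Duration of one instruction in ns at the given MIPS rating."""
    return 1000.0 / mips

def instruction_budget(window_ns, mips):
    """Whole instructions that fit inside a response window."""
    return int(window_ns // instruction_ns(mips))

if __name__ == "__main__":
    print("strobe rate (MHz):", strobe_rate_mhz(33, 67))
    for window in (100, 60, 30):
        print(window, "ns window:", instruction_budget(window, 62.5), "instructions")
```

With 16 ns per instruction, the 100 ns window leaves room for only six instructions and the 30 ns window for one, which is consistent with the doubt expressed about doing this in software.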
From: Marc Jet on 2 Jun 2010 12:01
The description is somewhat vague about the timing. From what I understand, the main problem is the short response latency. In fact the problem sounds very much like a job for a small CPLD.

Your micro runs at 62.5 MIPS (16 ns instruction cycle)? If it's a fast small micro with no pipeline, then that's a good start.

Given that the protocol samples on the falling edge, does it offer you valid inputs already at the falling edge too? If so, then you seem to have <130 ns from when the LSB of address is known to data out. And you seem to have <190 ns from when ADR<MSB to 2> is known. 190 ns would be 11 instructions, which looks okay.

In a 74xx type solution, I'd dedicate 4 micro outputs to offer all 4 possible data bits to a 74 MUX which uses the 2 address LSBs to select the correct data bit. This relaxes the software latency constraint considerably. I'd store the "memory content" in an array indexed by ADR<MSB to 2> and store the 4 possible data LSBs in that byte (correctly ordered for the output port). Then the software loop has 11 instructions to assemble ADR<MSB to 2>, do the 1 byte lookup, and write it to the output port.

This solves only one detail of the problem. Maybe it inspires you to find a solution for it all.

Best regards
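The lookup-plus-MUX idea in this post can be modelled in a few lines. This is a sketch of one possible reading, not the poster's code: the names `build_table`, `software_step` and `mux_74` are hypothetical, the example bit pattern is invented, and it assumes the four candidate bits are packed into the low four bits of each table entry, with bit n wired to MUX input n.

```python
# Model of the scheme: the micro looks up one entry using the upper
# address bits, drives 4 port pins with it, and an external 4:1 mux
# picks the right pin from the 2 address LSBs (which arrive last).
# All names and the example data are illustrative assumptions.

def build_table(first_bits):
    """Pack the output bit for each address into 4-bit entries.

    first_bits[a] is the bit (0 or 1) to drive for address a; the list
    length must be a multiple of 4. Entry i covers addresses 4*i..4*i+3.
    """
    table = []
    for base in range(0, len(first_bits), 4):
        entry = 0
        for lsb in range(4):
            entry |= (first_bits[base + lsb] & 1) << lsb
        table.append(entry)
    return table

def software_step(table, adr_high):
    """The micro's job: a single lookup whose result goes to the port."""
    return table[adr_high]

def mux_74(port_value, adr_lsbs):
    """The external mux: select one of the 4 port pins."""
    return (port_value >> adr_lsbs) & 1

# Example with 8 addresses and an arbitrary bit pattern.
example_bits = [1, 0, 0, 1, 0, 1, 1, 0]
example_table = build_table(example_bits)
```

The point of the split is that the software only needs the upper address bits, so the lookup can finish before the final two address bits are clocked in; the MUX resolves those in hardware.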
From: rickman on 2 Jun 2010 14:50
On Jun 2, 12:01 pm, Marc Jet <jetm...(a)hotmail.com> wrote:
> The description is somewhat vague about the timing. From what I
> understand, the main problem is the short response latency. In fact
> the problem sounds very much like a job for a small CPLD.
>
> Your micro runs at 62.5 MIPS (16ns instruction cycle)? If it's a fast
> small micro with no pipeline, then that's a good start.
>
> Given that the protocol samples on the falling edge, does it offer you
> valid inputs already at the falling edge too? If so, then you seem to
> have <130ns from when the LSB of address is known to data out. And
> you seem to have <190ns from when ADR<MSB to 2> is known. 190ns would
> be 11 instructions, which looks okay.
>
> In a 74xx type solution, I'd dedicate 4 micro outputs to offer all 4
> possible data bits to a 74 MUX which uses the 2 address LSBs to select
> the correct data bit. This relaxes the software latency constraint
> considerable. I'd store the "memory content" in an array indexed by
> ADR<MSB to 2> and store the 4 possible data LSBs in that byte
> (correctly ordered for the output port). Then the software loop has
> 11 instructions to assemble ADR<MSB to 2>, do the 1 byte lookup, and
> write it to the output port.
>
> This solves only one detail of the problem. Maybe it inspires you to
> find a solution for it all.
>
> Best regards

Thanks for the advice. That's an interesting approach. There is a bit more to it than that, but it sounds potentially doable depending on the instructions required. The protocol was never intended for software, so no provision was made for the response time of software. In fact, register 0 only uses a 4 bit address while the others can be longer depending on the target. All this is pretty easy in hardware, but not so much in software.

So this is not the only issue. The other issue is that the XMOS device is only an improvement over an FPGA in that it can include a lot more logic in a small package. The power consumption is higher if all eight threads are running. I don't know what happens to power if only some are idling. Since you can't slow the clock while any one thread has to run at high speed, I should think that would limit the power reduction you can achieve.

Rick
From: -jg on 2 Jun 2010 17:29
On Jun 3, 3:59 am, rickman <gnu...(a)gmail.com> wrote:
> The problem with using a small CPLD is that the register set is up to
> 32, 8-bit registers. With a 100 ns address to output time, there is
> little chance of the read being done unless a copy of all registers
> exists in the CPLD. Also, some of the bits to be read are real time
> status bits. If the processor can get an interrupt, read the address
> and write the readback data to the CPLD, then it could work, but it
> has to happen in 100 ns. If they had just used a standard SPI
> interface it would have been a lot easier...

If this interface is so incompatible with SPI that you need 32 bytes of local memory, then you are bumped into the 'smallest CPLD with RAM' territory, - and the choice there is not great. Maybe Actel or SiliconBlue ?

I've hit this wall myself, and it raises a point: Rather than the uC+CPLD the marketing types are chasing, I would find a CPLD+RAM more useful, as there are LOTS of uC out there already, and if they can make 32KB SRAM for sub $1, they should be able to include it almost for free, in a medium CPLD.

-jg