From: Martin Schoeberl on 12 Aug 2006 16:30

That's almost like chatting - high speed newsgroup discussion ;-) To
keep up with your speed I have to split the answers according to the
sub-topics. Here about the Avalon SRAM interface.

>> Yes, but e.g. for an SRAM interface there are some timings in ns. And
>> it's not that clear how this translates to wait states.
>
> Since Avalon is not directly compatible with typical SRAMs, this
> implies that

Again disagree ;-) The Avalon specification also covers asynchronous
peripherals. That adds a little bit to the complexity of the
specification.

> Assuming for the moment that you wanted to write the code for such a
> component, one would likely define the component to have the following:
> - A set of Avalon bus signals
> - SRAM signals that are defined as Avalon 'external' (i.e. they will
>   get exported to the top level) so that they can be brought out of
>   the FPGA.
> - Generic parameters so that the actual design code does not need to
>   hard code any of the specific SRAM timing requirements.

Yes, that's the way it is described in the Quartus manual. I did my
SRAM interface in this way. Here is a part of the .ptf that describes
the timing of the external SRAM:

   SLAVE sram_tristate_slave
   {
      SYSTEM_BUILDER_INFO
      {
         ....
         Setup_Time = "0ns";
         Hold_Time = "2ns";
         Read_Wait_States = "18ns";
         Write_Wait_States = "10ns";
         Read_Latency = "0";
         ....

> Given that, the VHDL code inside the SRAM controller would set its
> Avalon side wait request high as appropriate while it physically
> performs the

There is no VHDL code associated with this SRAM. All is done by the
SOPC builder.

> read/write to the external SRAM. The number of wait states would be
> roughly equal to the SRAM cycle time divided by the Avalon clock cycle
> time.

The SOPC builder will translate the timing from ns to clock cycles for
me. However, this is a kind of iterative process, as the timing of the
component depends on the tco and tsu of the FPGA pins of the compiled
design. The input pin th can usually be ignored, as it is covered by
the minimum tco of the output pins. The same is true for the SRAM
write th.

> Although maybe it sounds like a lot of work and you may think it
> results in some sort of 'inefficient bloat' it really isn't. Any
> synthesizer will quickly reduce the logic to what is needed based on
> the usage of the design. What you get in exchange is very portable and
> reusable components.

No, it's really not much work. Just a few mouse clicks (no VHDL), and
the synthesized result is not big. The SRAM tristate bridge contains
just the address and control output registers. I assume the input
registers are somewhere buried in the arbitrator.

Martin
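For illustration, the ns figures above translate into wait states
roughly like this: with a 100 MHz Avalon clock (10 ns period, an
assumed value), Read_Wait_States = "18ns" corresponds to two wait
cycles and Write_Wait_States = "10ns" to one. Below is a minimal
hand-written sketch of that kind of waitrequest generation. It is not
the code the SOPC builder generates; the entity name, generics and
default values are made up, only the port names follow the usual Avalon
slave naming.

   -- Sketch only: stall the Avalon master for a generic number of
   -- wait states, as the generated arbitration logic effectively does.
   library ieee;
   use ieee.std_logic_1164.all;

   entity sram_waitstate is
      generic (
         rd_ws : integer := 2;  -- e.g. ceil(18 ns / 10 ns), >= 1 assumed
         wr_ws : integer := 1   -- e.g. ceil(10 ns / 10 ns)
      );
      port (
         clk, reset  : in  std_logic;
         chipselect  : in  std_logic;
         write       : in  std_logic;
         waitrequest : out std_logic
      );
   end sram_waitstate;

   architecture rtl of sram_waitstate is
      signal cnt  : integer range 0 to 31;
      signal busy : std_logic;
   begin
      process(clk, reset)
      begin
         if reset = '1' then
            cnt  <= 0;
            busy <= '0';
         elsif rising_edge(clk) then
            if busy = '0' then
               if chipselect = '1' then
                  -- first cycle of an access: remember how many more
                  -- cycles to stall
                  if write = '1' then
                     cnt <= wr_ws - 1;
                  else
                     cnt <= rd_ws - 1;
                  end if;
                  busy <= '1';
               end if;
            elsif cnt /= 0 then
               cnt <= cnt - 1;   -- still waiting for the external SRAM
            else
               busy <= '0';      -- the transfer completes in this cycle
            end if;
         end if;
      end process;

      -- hold the master off while the SRAM cycle is in progress
      waitrequest <= '1' when (chipselect = '1' and busy = '0')
                          or (busy = '1' and cnt /= 0)
                     else '0';
   end rtl;

In the generated system this counting presumably lives in the switch
fabric/arbitrator rather than in user VHDL, which is why the tristate
bridge itself stays so small.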
From: Martin Schoeberl on 12 Aug 2006 17:03

> Not really, it is just simpler to say that I'm not going to go anywhere
> near code that can potentially change any of the outputs if wait
> request is active. As an example, take a look at your code below where
> you've had to sprinkle the 'if av_waitrequest = '0'' throughout the
> code to make sure you don't change states at the 'wrong' time (i.e.
> when av_waitrequest is active). Where problems can come up is when you
> miss one of those 'if av_waitrequest = '0'' statements. Depending on
> just where exactly you missed putting it in, it can be a rather subtle
> problem to debug.

Agree on the safe side, but...

> Now consider if you had simply put the 'if av_waitrequest = '0''
> statement around your entire case statement (with it understood that
> outside that

I cannot do this. This case statement is combinational. It would
introduce a latch for next_state. The reason to split the state machine
into the combinational next-state logic and the clocked part is to
react 'one cycle earlier' with state machine output registers that
depend on next_state. You can code this also with a single case in a
clocked process. However, then you have to code your output registers
on the transitions (in the if part), which gets a little bit more
confusing.

>> What about this version (sc_* signals are my internal master signals)

That case is the next-state logic and combinational; the process
containing this case statement is:

   process(state, sc_rd, sc_wr, av_waitrequest)
   begin
      next_state <= state;

>>    case state is
>>
>>       when idl =>
>>          if sc_rd='1' then
>>             if av_waitrequest='0' then
>>                next_state <= rd;
>>             else
>>                next_state <= rdw;
>>             end if;
>>          elsif sc_wr='1' then
>>             if av_waitrequest='0' then
>>                next_state <= wr;
>>             else
>>                next_state <= wrw;
>>             end if;
>>          end if;
>>
>>       when rdw =>
>>          if av_waitrequest='0' then
>>             next_state <= rd;
>>          end if;
>>
>>       when rd =>
>>          next_state <= idl;
>
> --- Are you sure you always want to go to idl? This would probably
> cause an error if the avalon outputs were active in this state.

No problem, as next_state goes to rd only when av_waitrequest is '0'.
Perhaps 'rd' is a misleading state name. The input data is registered
when next_state is 'rd'. So state is 'rd' when the input data is
registered.

> Whether it works or not for you would take more analysis, I'll just
> say that

For a complete picture you can look at the whole thing at:
http://www.opencores.org/cvsweb.cgi/~checkout~/jop/vhdl/scio/sc2avalon.vhd

>>> You might try looking at incorporating the above mentioned template
>>> and avoid the Avalon violation. What I've also found in debugging
>>> other's code
>>
>> Then I get an additional cycle latency. That's what I want to avoid.
>
> Not on the Avalon bus, maybe for getting stuff into the template but
> even that is a handshake. I've even used Avalon within components to
> transfer

OK, then not at the Avalon bus directly, but, as you said, 'getting
stuff into the template'. That's the same for me (in my case). If my
master has an (internal) read request and I have to forward it to
Avalon in a clocked process (as you do with your template), I will lose
one cycle. OK, in the interface and not in the bus. Still a lost
cycle ;-)

> data between rather complicated processes just because it is a clean
> data transfer interface and still have no problem transferring data on
> every clock cycle when it is available. I'm not familiar enough with
> your code, but I suspect that it can be done in your case as well.
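Coming back to the two-process state machine above: below is a minimal
sketch of the clocked half, showing how registering on next_state gains
the cycle. This is not the actual sc2avalon.vhd code; the declarations
are omitted, and rd_data_reg (a std_logic_vector) and av_readdata are
assumed names.

   -- Sketch only: the companion clocked process of the next-state
   -- logic quoted above.
   process(clk, reset)
   begin
      if reset = '1' then
         state       <= idl;
         rd_data_reg <= (others => '0');
      elsif rising_edge(clk) then
         state <= next_state;
         -- Output register driven from next_state: the read data is
         -- captured in the cycle in which av_waitrequest is released,
         -- one cycle before 'state' itself shows rd.
         if next_state = rd then
            rd_data_reg <= av_readdata;
         end if;
      end if;
   end process;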
You can do it when your template 'controls' the master logic, but not
the other way round.

>> And then it goes on with slaves with fixed wait states. Why?
>> If you do not provide a waitrequest in a slave that needs wait
>> states you can get into trouble when you specify it wrong
>> at component generation.
>
> No, PTF files let you state that there are a fixed number of wait
> states and not have an explicit waitrequest on the slave.

I meant when you assume n wait states in your VHDL code, but made a
mistake in the PTF file and specified fewer wait states. This error
cannot happen when you generate the waitrequest within your VHDL code.

>> Or does the Avalon switch fabric, when registered, take this
>> information into account for the waitrequest of the master?
>
> It does.

That's a reason to go with fixed wait states! Or a bus specification
that counts down the number of wait states ;-)

BTW: Did you take a look at the SimpCon idea? Dreaming a little bit:
it would be cool to write an open-source system generator (like the
SOPC builder) for it, including your suggestion of an open and
documented specification file format.

>> Could be for the SRAM component. Should look into the
>> generated VHDL code (or in a simulation)...
>
> I'd suggest looking at the system.ptf file for your design.

It's still in ns, which makes sense.

Martin

>> Again, one more cycle latency ;-)
>
> Again, nope not if done correctly.

I think we finally agreed, did we?

Cheers,
Martin
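To illustrate the 'count down the number of wait states' idea: instead
of a single waitrequest or ack, the slave announces the remaining
latency, so a pipelined master can restart, for example, when the count
reaches 1 rather than waiting for 0. The fragment below is a sketch of
the concept only, not the SimpCon specification; rdy_cnt (an unsigned,
ieee.numeric_std assumed), rd and the 3-cycle latency are illustrative.

   -- Concept sketch only: 0 means the read data is available.
   process(clk, reset)
   begin
      if reset = '1' then
         rdy_cnt <= (others => '0');
      elsif rising_edge(clk) then
         if rd = '1' then
            -- a read is accepted in a single cycle; announce its latency
            rdy_cnt <= to_unsigned(3, rdy_cnt'length);
         elsif rdy_cnt /= 0 then
            -- count down; the master may restart when this reaches 1
            rdy_cnt <= rdy_cnt - 1;
         end if;
      end if;
   end process;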
From: Tommy Thorn on 13 Aug 2006 01:15

Wow, this spawned a long thread.

Martin Schoeberl wrote:
> What helps is to know in advance (one or two cycles) when the result
> will be available. That's the trick with the SimpCon interface.

That approach is common internally in real cores, but it adds a lot of
complication, while it's an open question how many Avalon applications
could benefit from it.

> There is not a single ack or waitrequest signal, but a counter that
> will say how many cycles it will take to provide the result. In this
> case I can restart the pipeline earlier.

AFAIR, Avalon _does_ support slaves with a fixed number of latency
cycles, but an SDRAM controller by nature won't be fixed cycles.

> Another point is, in my opinion, the wrong role of who has to hold
> data for more than one cycle. This is true for several busses (e.g.
> also Wishbone). For these busses the master has to hold address and
> write data till the slave is ready. This is a result of the backplane
> bus thinking. In an SoC the slave can easily register those signals
> when needed longer and the master can continue.

What happens then when you issue another request to a slave which
hasn't finished processing the first? Any queue will be finite and
eventually you'd have to deal with stalling anyway. An issue is that
there are generally many more slaves than masters, so it makes sense to
move the complication to the master.

....

> Wishbone and Avalon specify just a single cycle data valid.

Again, simplify the slave (and the interconnect) and burden the master.

Avalon is IMO the best balance between complexity, performance and
features of all the (few) interconnects I've seen yet (I haven't seen
SimpCon yet). In particular I found Wishbone severely lacking for my
needs. Avalon is proprietary though, so I rolled my own portable
implementation inspired by Avalon with just the features I needed:

- all reads are pipelined with variable latency (accept of a request is
  distinct from delivery of the data, thus inherently supporting
  multiple outstanding requests)
- multi-master support
- burst support (actually not implemented yet, but not that hard)

It's nearly as trivial as Wishbone, though it offers much higher
performance. Latency is entirely up to the slave, which can deliver
data as soon as the cycle after the request was posted. (Though
arriving at this simplicity took a few false starts.)

> Are there any other data available on that. I did not find many
> comments in this group on experiences with Cyclone I and II. Looks
> like the CII was more optimized for cost than speed. Yes, waiting
> for III ;-)

The only mention of Cyclone III I've seen outside this newsgroup was a
mention in passing on EETimes that suggested SIII and CIII were
expected this year. I just used Cyclone III as a generic term for
whatever the next Altera low-cost part is.

Regards,
Tommy
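Below is a minimal sketch of the kind of split read interface described
above, where accepting a request is separate from delivering the data.
The entity, signal names and widths are assumptions for illustration,
not Tommy's actual implementation.

   -- Sketch only: a request is accepted when req and accept are both
   -- high in the same cycle; the corresponding data arrives any number
   -- of cycles later together with rd_valid, so several requests can
   -- be outstanding at once (responses assumed in request order here).
   library ieee;
   use ieee.std_logic_1164.all;

   entity pipelined_read_slave is
      port (
         clk, reset : in  std_logic;
         -- request channel (master -> slave)
         req        : in  std_logic;
         addr       : in  std_logic_vector(23 downto 0);
         accept     : out std_logic;   -- slave may hold low to stall
         -- response channel (slave -> master)
         rd_data    : out std_logic_vector(31 downto 0);
         rd_valid   : out std_logic    -- one pulse per accepted request
      );
   end pipelined_read_slave;

Because the response channel is independent of the request channel, a
new request can be accepted on every cycle while earlier ones are still
in flight, and the slave is free to answer as early as the cycle after
the request was posted.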
From: Tommy Thorn on 13 Aug 2006 01:28

Antti Lukats wrote:
>>> as a very simple example of avalon master-slave type of peripherals
>>> there is one free avalon IP core for SD-card support. the core can
>>> be found at some russian forum and later it was also added to the
>>> user ip section of the microtronix forums.
>> Any link handy for this example?
>>
> http://forum.niosforum.com/forum/index.php?showtopic=4430

"Sorry, the link that brought you to this page seems to be out of date
or broken."

I can see other postings just fine, though. Another reference?

Tommy
From: Antti Lukats on 13 Aug 2006 02:11
"Tommy Thorn" <foobar(a)nowhere.void> schrieb im Newsbeitrag news:44DEB88F.50805(a)nowhere.void... > Antti Lukats wrote: >>>> as very simple example for avalon master-slave type of peripherals >>>> there >>>> is on free avalon IP core for SD-card support the core can be found >>>> at some russian forum and later it was also added to the user ip >>>> section of the microtronix forums. >>> Any link handy for this example? >>> >> http://forum.niosforum.com/forum/index.php?showtopic=4430 > > "Sorry, the link that brought you to this page seems to be out of date or > broken." > > I can see other postings just fine, though. Another reference? > > Tommy Tommy the link works, but you may have to register at the niosforum in any case the sd card ip is one of the lasting postings at "post your ip" section at niosforum i dont have an link ready where the download would be accessible without registration sure I can re-upload it somewhere:) antti |