From: Antti on 21 Dec 2009 14:58

On Dec 21, 9:50 pm, Peter Alfke <al...(a)sbcglobal.net> wrote:
> On Dec 21, 9:30 am, Antti <antti.luk...(a)googlemail.com> wrote:
> > On Dec 21, 7:20 pm, Ed McGettigan <ed.mcgetti...(a)xilinx.com> wrote:
> > > On Dec 21, 3:01 am, Antti <antti.luk...(a)googlemail.com> wrote:
> > > > On Dec 21, 12:56 pm, Symon <symon_bre...(a)hotmail.com> wrote:
> > > > > Antti wrote:
> > > > > > Xilinx Coregen FIFO, dual clock, most options disabled, only FULL EMPTY
> > > > > > flags present.
> > > > > >
> > > > > > signals at input correct, as expected (checked with ChipScope)
> > > > > > signals at output:
> > > > > > - double value
> > > > > > - missing 1, 2 or 3 values
> > > > > > - FIFO will read out random number of OLD entries, this could be 4
> > > > > > values, or 50% of the FIFO old values
> > > > >
> > > > > I know you will have read this.
> > > > >
> > > > > Can you think of any reason why the Xilinx work-around wouldn't work
> > > > > because of your specific implementation? It seems to have different
> > > > > work-arounds depending on whether the read clock is faster or slower
> > > > > than the write clock. Do your clocks change frequency?
> > > > >
> > > > > Are you sure your clocks don't have any glitches? The reset also?
> > > > > Power's OK? Is your office made of Cobalt 60?
> > > > >
> > > > > HTH., Syms.
> > > >
> > > > 1) I entered the clock figures in FIFO16 implementation, but the
> > > > error also happens with BRAM based FIFO that do not need workarounds
> > > > 2) Clocks DO NOT CHANGE ever, one is MGT recovered clock 125MHz write,
> > > > one is PLB clock 62.5MHz read
> > > > 3) Power OK? Well the problem happens at 2 different sites, hm yes it
> > > > could still be a power problem
> > > > 4) My office is not of Cobalt 60, ... and its cold here too
> > > >
> > > > Antti
> > >
> > > Are you sure that this is a FIFO issue and not something else? Some
> > > things to think about.
> > >
> > > 1) The recovered clock from the MGT is a bit noisy as it moves as the
> > > CDR moves. Why are you using this instead of the REFCLK source?
> > >
> > > 2) It seems like you have a PLB core that is reading from the FIFO,
> > > could the problem be in this?
> > >
> > > Ed McGettigan
> > > --
> > > Xilinx Inc.
> >
> > Well the MGT datapath and clock system is not done by me, and the guy
> > says it is OK all the way it is connected.
> >
> > yes, It is very unlikely to believe that all THREE types of coregen
> > FIFO's fail with about same symptoms, but in all
> > 3 cases Chipscope sees correct data into fifo, and trash coming out
> >
> > the system can span up to 100 boards, all synced to master unit, the
> > local refclk is not fully sync to the clock of
> > the master unit, so I see no way to use this clock to synchronise the
> > fifo?
> >
> > Antti
> > PS I just received a attempt to collect the reward, by using non
> > xilinx FIFO implementation, i let you all know
> > the test results
> > Antti
> If I remember right (I am no longer at Xilinx) the FIFO is NOT
> designed for unequal data width of write and read. (Reason: possible
> ambiguity of Full and EMPTY)
> Since you use two clocks that are roughly 2:1 in frequency, I hope
> that you do not try to have double width on one of the ports.
> The FIFO must have the same width on both ports. You must design the
> width conversion outside the FIFO. That little circuit will be
> synchronous and thus quite simple.
> Peter Alfke

Well, the FIFO is 9 bits in, 9 bits out, so it should work? At least that is what I hoped...

We did not suspect the FIFO at first, so we spent a LOT of time looking for the problem AROUND the FIFOs. But based on what I can see from the ChipScope snapshots of the FIFO inputs and outputs, the only explanation I have is that the FIFOs are just going mad.

Of course one option is that it is my doing, but I have someone who is in better shape looking over the code as well, and he sees no issues there either. I know the FIFOs should work, so there must be some explanation, but so far I am failing to see it.

Antti
PS thank you Peter for the response

From: Peter Alfke on 21 Dec 2009 15:21

On Dec 21, 11:58 am, Antti <antti.luk...(a)googlemail.com> wrote:
> well the FIFO is 9b in 9b out so it should work?
> at least this is what i hoped...
>
> we did not suspect the FIFO as problem at first
> so spent LOT of time looking for the problem AROUND the FIFOS
> but.. at least based on what i can see from CS snapshots on fifo
> inputs and outputs, the only explanation i have is that the FIFO
> are just going mad,
>
> of course one option is that its me doing, but i have someone
> who is in better shape looking over the code as well, and he
> sees no issues there either. I know the FIFOs should work
> so there must be some explanation, but so far failing to see it.
>
> Antti
> PS thank you Peter for the response

OK, Antti,
so you have the same port width, but one clock is about twice as fast as the other.
How do you stop the 125 MHz write clock from filling up the FIFO, since you read at only 62 MHz?
I hope you are not gating the clock, but rather run it continuously and use WE to stop the writing.
Yes, many of these suggestions are well below your level, but stupid problems need stupid investigations.

Cheers
Peter

From: Antti on 21 Dec 2009 16:12

On Dec 21, 10:21 pm, Peter Alfke <al...(a)sbcglobal.net> wrote:
> OK, Antti,
> so you have the same port width, but one clock is about twice as fast
> as the other.
> How do you stop the 125 MHz write clock from filling up the FIFO,
> since you read at only 62 MHz ?
> I hope you are not gating the clock, but rather run it continuously
> and use WE to stop the writing.
> Yes, many of these suggestions are well below your level, but stupid
> problems need stupid investigations.
> Cheers
> Peter

I am a level below ground right now; the project is just driving me nuts. Slowly. To work for months, and end up with Xilinx saying: "The man who could have helped you left Xilinx last Friday. Your situation is unsupportable." Well, we got out of that situation, only to end up in new ones.

The FIFO is never overfilled, by design. The fiber link is 99% IDLE, usually sending only short 10-byte packets. For testing I generate the 10-byte packets with the MOUSE, so about 1 per second, so there is no doubt that the FIFO is never anywhere near full.

Last results:
- ALL 3 types of Xilinx FIFOs show the same style of errors, at about the same error rate
- VHDL FIFO sent by a CAF reader, uses Gray counters: about TEN TIMES FEWER errors than the Xilinx implementation, but all the different types of error still occurred: missing values, and the FIFO outputting a large chunk of OLD values, that is, the read pointer changing by some random value

Again, I did not design the MGT clocking and the overall MGT subsystem; the people who did are either unreachable or unable to provide any help beyond saying that the implementation (the connection of the FIFO) is done properly. That is also what I have figured out so far, but... well, somewhere there must be a problem.

Antti
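
The post above mentions a FIFO built with Gray counters. As a reminder of why that matters when a pointer is sampled in the other clock domain, here is a small Python sketch. It is not the code of the FIFO that was sent in; the function names and the 4-bit pointer width are assumptions made for the example. It only shows that consecutive Gray codes differ in a single bit, while a binary counter can toggle every bit at once.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Convert a reflected Gray code back to a binary count."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


WIDTH = 4            # pointer width in bits (assumed for the example)
DEPTH = 1 << WIDTH   # number of pointer values

# Worst-case number of bits that toggle on one pointer increment,
# including the wrap from DEPTH-1 back to 0.
worst_binary = max(bits_changed(i, (i + 1) % DEPTH) for i in range(DEPTH))
worst_gray = max(
    bits_changed(bin_to_gray(i), bin_to_gray((i + 1) % DEPTH))
    for i in range(DEPTH)
)

print("binary pointer: up to", worst_binary, "bits change per increment")
print("gray pointer:   up to", worst_gray, "bit changes per increment")
```

With a single changing bit, a sample taken during the transition can only yield the old or the new pointer value. Whether that has anything to do with the failures reported in this thread is not established here.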

From: adamk on 21 Dec 2009 16:25

On Dec 21, 7:17 pm, Antti <antti.luk...(a)googlemail.com> wrote:
> I hope my plea will not be seen as usual "please help me" request. I
> do my (home)work, I try hard but sometimes there come up problems that
> seem very hard to solve, with the current problem, well if there is no
> solution to that, then I wonder how come it has ever been
> possible to use Xilinx FIFO's with problem at all? So the problem:
>
> Xilinx Coregen FIFO, dual clock, most options disabled, only FULL EMPTY
> flags present.
>
> signals at input correct, as expected (checked with ChipScope)
> signals at output:
> - double value
> - missing 1, 2 or 3 values
> - FIFO will read out random number of OLD entries, this could be 4
> values, or 50% of the FIFO old values
>
> I can select BRAM or FIFO16 implementation in Coregen, it doesnt
> change the problem
>
> Virtex-4, ISE 10.1SP3
>
> Please help me, if anyone has some good suggestion (except use Altera
> advice), I am getting really desperate. To the extent that when a
> friend called me yesterday, then after my "hello", his first response
> was: "Are you dead?". I had to explain that i am not.
>
> Antti

I'm not sure if this is related, but I had a similar problem when using the FSL v2.11.a core in EDK 10.1 SP3. In my case it was set up in synchronous mode. The problem was lost data whenever a read and a write happened simultaneously while there was exactly 1 word of data in the FIFO. It was fixed in v2.11.b of the core in EDK 11. Perhaps something similar is happening in the Coregen FIFO.

Cheers
Adam
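
The symptoms discussed above (doubled values, missing values, old entries read again) can be told apart mechanically if the write side is made to send an incrementing counter. That test pattern is an assumption of this sketch and is not something the thread says was used; `classify_readback` and the sample capture are likewise hypothetical. The values would be whatever was captured on the FIFO output, for example exported from ChipScope.

```python
def classify_readback(values, modulus=256):
    """Classify each step of a read-back stream that should count up by 1.

    Returns a list of (index, kind, delta) tuples for every step that is
    not a clean +1 increment (modulo `modulus`).
    """
    problems = []
    for i in range(1, len(values)):
        delta = (values[i] - values[i - 1]) % modulus
        if delta == 1:
            continue  # expected: next value in the sequence
        if delta == 0:
            kind = "duplicate"   # same word delivered twice
        elif delta <= modulus // 2:
            kind = "missing"     # jumped forward: delta - 1 words lost
        else:
            kind = "stale"       # jumped backward: old entries re-read
        problems.append((i, kind, delta))
    return problems


# Hypothetical capture: 5 is repeated, 8 and 9 are lost,
# then the stream jumps back to old data.
capture = [3, 4, 5, 5, 6, 7, 10, 11, 2, 3]
for index, kind, delta in classify_readback(capture):
    print(index, kind, delta)
```

A forward jump of more than half the counter range is indistinguishable from a backward jump, so the counter modulus should be clearly larger than the FIFO depth.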
From: Nico Coesel on 21 Dec 2009 16:29

Antti <antti.lukats(a)googlemail.com> wrote:
>On Dec 21, 3:21 pm, John McCaskill <jhmccask...(a)gmail.com> wrote:
>> On Dec 21, 5:42 am, Antti <antti.luk...(a)googlemail.com> wrote:
>> > On Dec 21, 1:29 pm, "maxascent" <maxasc...(a)yahoo.co.uk> wrote:
>> > > >On Dec 21, 12:32 pm, "maxascent" <maxasc...(a)yahoo.co.uk> wrote:
>> > > >> Well once you have written and tested your own fifo then you would have it
>> > > >> for any other project. It seems like you have wasted a lot of time already
>> > > >> trying to fix the Xilinx version so I dont see that you have anything to
>> > > >> loose by creating your own.
>> > > >>
>> > > >> Jon
>> > > >
>> > > >If you REALLY need todo something else, when your time is at absolute
>> > > >premium
>> > > >And if the system working (except occasional errors about 2 of fiber
>> > > >packets are corrupt)
>> > > >Then you do not go replacing Xilinx validated FIFO solutions with your
>> > > >own, if there are other options.
>> > > >
>> > > >If 2 completly different FIFO implementations both have same error?
>> > > >you think 3rd one would instantly work? Could be, yes.
>> > > >
>> > > >Antti
>> > >
>> > > In my opinion people tend to use coregen far too often. Looking through
>> > > some of Xilinx code it is awfull. I went down the route of writing my own
>> > > fifos not because I had a problem with Xilinx fifos but because I believe a
>> > > fifo written by myself is a lot more flexible and simulates faster than the
>> > > Xilinx version. I also know to as good a degree as I can test that it will
>> > > work 100%.
>> > > I dont really think you can say that their fifos have been validated 100%
>> > > if they have to release patches for them.
>> > >
>> > > Jon
>> >
>> > Dear Jon,
>> >
>> > I do not feel to be in health right now to write this fifo, so here is
>> > the deal:
>> >
>> >   component mgt_fifo
>> >     port (
>> >       din    : in  std_logic_vector(8 downto 0);
>> >       rd_clk : in  std_logic;
>> >       rd_en  : in  std_logic;
>> >       rst    : in  std_logic;
>> >       wr_clk : in  std_logic;
>> >       wr_en  : in  std_logic;
>> >       dout   : out std_logic_vector(8 downto 0);
>> >       empty  : out std_logic;
>> >       full   : out std_logic);
>> >   end component;
>> >
>> > if you can write fifo that i can "drop in" and the Xilinx FIFO error
>> > is gone,
>> > then i will stand up, go to postal office and send you 1000 EUR by
>> > western union.
>> > If 1000 EUR is not enough, name your price, i will consider it.
>> > there is no price on the health of our family
>> >
>> > condition is: DROP IN, WORKS, if i need to troubleshoot, then no pay.
>> >
>> > Antti
>>
>> Hello Antti,
>>
>> If you want to try a different implementation of a FIFO, you can get
>> the one that the FSL bus uses out of the EDK pcores directory at
>> C:\Xilinx\11.1\EDK\hw\XilinxProcessorIPLib\pcores\fsl_v20_v2_11_a\hdl\vhdl.
>>
>> There are multiple implementations, including an async BRAM based one
>> that has the same ports as you list above, except that it uses exist
>> instead of empty on the read port.
>>
>> That said, I don't expect a third implementation to work instantly
>> when the previous two implementations had the same error. This FIFO
>> has the full source to it, so it is straight forward to see how it
>> works, and add ChipScope to observe what is happening around the time
>> of the error.
>>
>> If you have not used it before, FPGA editor has the ability to find a
>> ChipScope ILA core, and change what is connected to it. That can make
>> it much quicker to follow the trail of clues since you avoid having to
>> go through a full place and route every time you want to look at
>> something different.
>>
>> Is your 62.5 MHz clock a divided version of the 125 MHz clock? You
>> mention that the 125 MHz is the recovered clock from the MGT, but
>> there are other options. When we did our GigE interface, we used a
>> 125 MHz clock from the MGT, but it was not the recovered clock, but
>> the local MGT PLL. This let us use the same 125 MHz clock for all
>> four GigE interfaces and a PMCD to generate a 62.5 MHz clock that is
>> phase aligned with the 125 MHz clock.
>>
>> Regards,
>>
>> John McCaskill
>> www.FasterTechnology.com
>
>Hi
>
>I have tried all 3 variants possible with coregen,
>all 3 have similar errors
>
>and no, the clocks are not divided version, the 125MHz comes from
>master over fiber
>the master could be 100 hops away, the 62.5mhz is derived from local
>oscillator
>
>so the frequencies are very close but not synchronous
>
>Antti
>who has to give up, at least for a while :(
>good advice still welcome, if there is any hope or idea how to fix the
>issue
>and yes it could be power supply issue at the end of the day also

I always write my own FIFOs to keep things simple. I keep a write pointer, a read pointer and a number-of-elements counter in the domain with the highest clock frequency. I don't cross the clock domain inside the FIFO; instead I create an interface which does the clock domain crossing. I also use an early full signal (say, max. elements - X, depending on the expected latency). This allows for fast FIFOs (no Gray code counters) with very little logic.

The control logic looks like this:

    if read then read_ptr++;
    if write then write_ptr++;
    if (read=true and write=false) num_elements--;
    if (write=true and read=false) num_elements++;
    if (num_elements>=(MAX_ELEMENTS-X)) full=true; else full=false;
    if (num_elements==0) empty=true; else empty=false;

The external logic should prohibit itself from reading from or writing to the FIFO when it is empty or full.

Besides: could your problem be a timing constraint problem? Did you specify the amount of time signals may take to travel from one clock domain to the other? The Xilinx tools do not do this automatically!

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
"If it doesn't fit, use a bigger hammer!"
--------------------------------------------------------------
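
The control logic in the post above can be tried out as a behavioural model. The following Python sketch is only an illustration of that scheme under stated assumptions: the class name `CountedFifo`, the `clock` method and the default sizes are made up for the example, the defensive guards against reading when empty or writing when truly full are an addition (the post leaves that to the external logic), and the clock-domain-crossing interface that the post places outside the FIFO is not modelled at all.

```python
class CountedFifo:
    """Behavioural model of a single-clock FIFO with an element counter."""

    def __init__(self, max_elements=16, early_margin=2):
        self.max_elements = max_elements
        self.early_margin = early_margin  # the "X" in the post
        self.mem = [None] * max_elements
        self.read_ptr = 0
        self.write_ptr = 0
        self.num_elements = 0

    @property
    def full(self):
        # Early full: asserted X entries before the memory is really full.
        return self.num_elements >= self.max_elements - self.early_margin

    @property
    def empty(self):
        return self.num_elements == 0

    def clock(self, write=False, din=None, read=False):
        """Advance one clock cycle; returns the word read, or None."""
        # Defensive guards (an assumption of this sketch, see lead-in).
        do_read = read and self.num_elements > 0
        do_write = write and self.num_elements < self.max_elements
        dout = None
        if do_read:
            dout = self.mem[self.read_ptr]
            self.read_ptr = (self.read_ptr + 1) % self.max_elements
        if do_write:
            self.mem[self.write_ptr] = din
            self.write_ptr = (self.write_ptr + 1) % self.max_elements
        # The counter only moves when exactly one of read/write happens.
        if do_read and not do_write:
            self.num_elements -= 1
        elif do_write and not do_read:
            self.num_elements += 1
        return dout


fifo = CountedFifo(max_elements=8, early_margin=2)
fifo.clock(write=True, din="a")
# Simultaneous read and write with exactly one word stored.
first = fifo.clock(write=True, din="b", read=True)
second = fifo.clock(read=True)
print(first, second, fifo.empty)
```

The last three calls exercise the corner case mentioned earlier in the thread (a read and a write in the same cycle with one word stored): the counter stays at 1 and no word is lost in this model.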