From: Mike Frysinger on 9 May 2010 00:50

On Sat, May 8, 2010 at 18:32, Johannes Weiner wrote:
> On Fri, May 07, 2010 at 02:28:16PM -0400, Mike Frysinger wrote:
>> On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
>> > On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
>> >> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
>> >> > struct ser_req {
>> >> > +       u16 sample;
>> >> > +       char __padalign[L1_CACHE_BYTES - sizeof(u16)];
>> >> > +
>> >> >         u16 reset;
>> >> >         u16 ref_on;
>> >> >         u16 command;
>> >> > -       u16 sample;
>> >> >         struct spi_message msg;
>> >> >         struct spi_transfer xfer[6];
>> >> > };
>> >>
>> >> are you sure this is necessary ? ser_req is only ever used with
>> >> spi_sync() and it's allocated/released on the fly, so how could
>> >> anything be reading that memory between the start of the transmission
>> >> and the return to adi7877 ?
>> >
>> > msg is handed over to spi_sync, it contains the addresses
>> > which will be used to programme the DMA: the spi master
>> > transfer function will read these fields to start DMA.
>>
>> so the issue is coming from the SPI master drivers and not the AD7877 driver
>
> No, the issue is coming from ad7877 placing a transmission buffer
> into the same cache line with memory locations that are accessed outside
> the driver's scope.

you missed the point of my comment. as i clearly explained in the
other structure, the AD7877 driver was causing the cache desync. here
it is the SPI master that is implicitly causing it. i'm not talking
about the AD7877 being correct wrt to the implicit SPI/DMA
requirements, just what code exactly is triggering the cache issues.

> /*
>  * DMA (thus cache coherency maintainance) requires the
>  * transfer buffers to live in their own cache lines.
>  */
> char __padalign[...];
>
> ? It might be obvious what the code does, but I agree with
> Mike that it might not be immediately apparent why it's needed.

comment looks fine once the spelling is fixed (maintenance). thanks.
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Oskar Schirmer on 9 May 2010 05:00

On Sun, May 09, 2010 at 00:45:41 -0400, Mike Frysinger wrote:
> On Sat, May 8, 2010 at 18:32, Johannes Weiner wrote:
> > On Fri, May 07, 2010 at 02:28:16PM -0400, Mike Frysinger wrote:
> >> On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
> >> > On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
> >> >> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> >> >> > struct ser_req {
> >> >> > +       u16 sample;
> >> >> > +       char __padalign[L1_CACHE_BYTES - sizeof(u16)];
> >> >> > +
> >> >> >         u16 reset;
> >> >> >         u16 ref_on;
> >> >> >         u16 command;
> >> >> > -       u16 sample;
> >> >> >         struct spi_message msg;
> >> >> >         struct spi_transfer xfer[6];
> >> >> > };
> >> >>
> >> >> are you sure this is necessary ? ser_req is only ever used with
> >> >> spi_sync() and it's allocated/released on the fly, so how could
> >> >> anything be reading that memory between the start of the transmission
> >> >> and the return to adi7877 ?
> >> >
> >> > msg is handed over to spi_sync, it contains the addresses
> >> > which will be used to programme the DMA: the spi master
> >> > transfer function will read these fields to start DMA.
> >>
> >> so the issue is coming from the SPI master drivers and not the AD7877 driver
> >
> > No, the issue is coming from ad7877 placing a transmission buffer
> > into the same cache line with memory locations that are accessed outside
> > the driver's scope.
>
> you missed the point of my comment. as i clearly explained in the
> other structure, the AD7877 driver was causing the cache desync. here
> it is the SPI master that is implicitly causing it. i'm not talking
> about the AD7877 being correct wrt to the implicit SPI/DMA
> requirements, just what code exactly is triggering the cache issues.

In both cases ad7877 placed DMA buffers in the same cache
line as reference data that the spi master needs to programme
the DMA engine. One case is started through spi_sync, the
other case uses spi_async. Both cases end up in
master->transfer via spi_async. In both cases, with
drivers/spi/atmel_spi.c, the cache lines are flushed and then
the reference data is fed into the DMA engine, which causes
the line in question to be cached again prematurely.

Note that atmel_spi (thus the master) is not wrong here: it
must assume that DMA buffers are correctly aligned into
separate cache lines, so accessing reference data after the
cache flush is not a fault.

So in both cases the problem is caused by ad7877 and thus
fixed analogously.

> > /*
> >  * DMA (thus cache coherency maintainance) requires the
> >  * transfer buffers to live in their own cache lines.
> >  */
> > char __padalign[...];
> >
> > ? It might be obvious what the code does, but I agree with
> > Mike that it might not be immediately apparent why it's needed.
>
> comment looks fine once the spelling is fixed (maintenance). thanks.

Ok, will prepare that soon.

  Oskar
--
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner

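[Editor's sketch] The sequence Oskar describes (flush, then a CPU access to a neighbouring field, then DMA) can be replayed in a toy userspace model. This is only an illustration under stated assumptions: a 32-byte cache line, a cache that is invalidated once before the transfer and not again afterwards, and hypothetical names (toy_cache, toy_transfer, ...) that do not exist in the kernel or in atmel_spi. The byte offsets used in the note below the code are likewise assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 32   /* assumed line size, stands in for L1_CACHE_BYTES */
#define MEM_BYTES  128  /* four lines of toy "RAM" */
#define NLINES     (MEM_BYTES / LINE_BYTES)

struct toy_cache {
	uint8_t ram[MEM_BYTES];            /* what the DMA engine sees */
	uint8_t data[NLINES][LINE_BYTES];  /* what the CPU sees on a hit */
	int valid[NLINES];
};

/* Drop a cached line so the next CPU read refetches it from RAM. */
static void toy_invalidate(struct toy_cache *c, size_t addr)
{
	c->valid[addr / LINE_BYTES] = 0;
}

/* CPU read of one byte: a miss pulls the whole line in from RAM. */
static uint8_t toy_cpu_read(struct toy_cache *c, size_t addr)
{
	size_t line = addr / LINE_BYTES;

	assert(addr < MEM_BYTES);
	if (!c->valid[line]) {
		memcpy(c->data[line], &c->ram[line * LINE_BYTES], LINE_BYTES);
		c->valid[line] = 1;
	}
	return c->data[line][addr % LINE_BYTES];
}

/* A DMA write goes straight to RAM and does not touch the cache. */
static void toy_dma_write(struct toy_cache *c, size_t addr, uint8_t val)
{
	assert(addr < MEM_BYTES);
	c->ram[addr] = val;
}

/*
 * Replay one transfer: the buffer's line is invalidated, the CPU then
 * reads a control field (as the master does while programming DMA),
 * DMA delivers 0xAB into the buffer, and finally the CPU reads the
 * buffer. Returns what the CPU sees.
 */
uint8_t toy_transfer(size_t buf_addr, size_t ctl_addr)
{
	struct toy_cache c;

	memset(&c, 0, sizeof(c));
	(void)toy_cpu_read(&c, buf_addr);   /* buffer was cached earlier */
	toy_invalidate(&c, buf_addr);       /* flush before starting DMA */
	(void)toy_cpu_read(&c, ctl_addr);   /* master reads msg/xfer */
	toy_dma_write(&c, buf_addr, 0xAB);  /* device data arrives */
	return toy_cpu_read(&c, buf_addr);
}
```

With the buffer at byte 6 and the control field at byte 8 (same line, roughly the old ser_req layout), toy_transfer(6, 8) hands back the stale zero. With the buffer at byte 0 and the control field at byte 40 (separate lines, roughly the padded layout), toy_transfer(0, 40) sees the fresh 0xAB.
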
From: Mike Frysinger on 10 May 2010 12:50

On Mon, May 10, 2010 at 06:42, Oskar Schirmer wrote:
> With dma based spi transmission, data corruption
> is observed occasionally. With dma buffers located
> right next to msg and xfer fields, cache lines
> correctly flushed in preparation for dma usage
> may be polluted again when writing to fields
> in the same cache line.
>
> Make sure cache fields used with dma do not
> share cache lines with fields changed during
> dma handling. As both fields are part of a
> struct that is allocated via kzalloc, thus
> cache aligned, moving the fields to the 1st
> position and insert padding for alignment
> does the job.

Acked-by: Mike Frysinger <vapier(a)gentoo.org>

i'm guessing Dmitry will pick it up now
-mike

From: Dmitry Torokhov on 10 May 2010 17:00

On Mon, May 10, 2010 at 12:39:49PM -0400, Mike Frysinger wrote:
> On Mon, May 10, 2010 at 06:42, Oskar Schirmer wrote:
> > With dma based spi transmission, data corruption
> > is observed occasionally. With dma buffers located
> > right next to msg and xfer fields, cache lines
> > correctly flushed in preparation for dma usage
> > may be polluted again when writing to fields
> > in the same cache line.
> >
> > Make sure cache fields used with dma do not
> > share cache lines with fields changed during
> > dma handling. As both fields are part of a
> > struct that is allocated via kzalloc, thus
> > cache aligned, moving the fields to the 1st
> > position and insert padding for alignment
> > does the job.
>
> Acked-by: Mike Frysinger <vapier(a)gentoo.org>
>
> i'm guessing Dmitry will pick it up now

Yep.

--
Dmitry

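[Editor's sketch] The layout change in the patch description can be checked in userspace with offsetof. This is a rough illustration, not the driver source: L1_CACHE_BYTES is assumed to be 32 here (the real value is per-architecture), struct spi_message and struct spi_transfer are placeholder stand-ins for the kernel types, and the names ser_req_old, ser_req_new, LINE_OF and the two helper functions are made up for this example. As in the patch description, the struct is assumed to start on a cache line boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L1_CACHE_BYTES 32  /* assumed value; the real one is per-architecture */

typedef uint16_t u16;

/* Placeholders for the kernel types; only their presence matters here. */
struct spi_message  { void *placeholder[4]; };
struct spi_transfer { void *placeholder[4]; };

/* Layout before the patch: sample sits right in front of msg. */
struct ser_req_old {
	u16 reset;
	u16 ref_on;
	u16 command;
	u16 sample;
	struct spi_message msg;
	struct spi_transfer xfer[6];
};

/* Layout after the patch: sample first, padded out to a full line. */
struct ser_req_new {
	u16 sample;
	/*
	 * DMA (thus cache coherency maintenance) requires the
	 * transfer buffers to live in their own cache lines.
	 */
	char __padalign[L1_CACHE_BYTES - sizeof(u16)];

	u16 reset;
	u16 ref_on;
	u16 command;
	struct spi_message msg;
	struct spi_transfer xfer[6];
};

/* Cache line index of a field, assuming the struct is line-aligned. */
#define LINE_OF(type, field) (offsetof(type, field) / L1_CACHE_BYTES)

/* The padding pushes the first non-DMA field onto the next line. */
static_assert(offsetof(struct ser_req_new, reset) == L1_CACHE_BYTES,
	      "reset should start the second cache line");

/* 1 if the DMA buffer shares a line with msg in the old layout. */
int old_layout_shares_line(void)
{
	return LINE_OF(struct ser_req_old, sample) ==
	       LINE_OF(struct ser_req_old, msg);
}

/* 1 if the DMA buffer shares a line with reset or msg in the new layout. */
int new_layout_shares_line(void)
{
	return LINE_OF(struct ser_req_new, sample) ==
	       LINE_OF(struct ser_req_new, reset) ||
	       LINE_OF(struct ser_req_new, sample) ==
	       LINE_OF(struct ser_req_new, msg);
}
```

Under these assumptions old_layout_shares_line() reports sharing and new_layout_shares_line() does not. The check says nothing about what precedes the struct in memory, which is why the patch description leans on the allocation being cache aligned.
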
From: Mike Frysinger on 10 May 2010 17:30

On Mon, May 10, 2010 at 17:22, Andrew Morton wrote:
> On Mon, 10 May 2010 12:42:34 +0200 "Oskar Schirmer" wrote:
>> With dma based spi transmission, data corruption
>> is observed occasionally. With dma buffers located
>> right next to msg and xfer fields, cache lines
>> correctly flushed in preparation for dma usage
>> may be polluted again when writing to fields
>> in the same cache line.
>>
>> Make sure cache fields used with dma do not
>> share cache lines with fields changed during
>> dma handling. As both fields are part of a
>> struct that is allocated via kzalloc, thus
>> cache aligned, moving the fields to the 1st
>> position and insert padding for alignment
>> does the job.
>
> This sounds odd. Doesn't it imply that some code somewhere is missing
> some DMA synchronisation actions?

i think it's kind of dumb and induces this sort of bug semi-frequently,
but it is what the current DMA API requires (see, for example,
Documentation/spi/spi-summary)
-mike