From: Alok Kataria on 22 Jul 2010 14:40

Hi,

On Wed, 2010-07-21 at 17:03 -0700, FUJITA Tomonori wrote:
> On Thu, 22 Jul 2010 08:44:42 +0900
> FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:
>
> > On Wed, 21 Jul 2010 10:13:34 -0700
> > Alok Kataria <akataria(a)vmware.com> wrote:
> >
> > > > Basically, you want to add hot-plug memory and enable swiotlb, right?
> > >
> > > Not really, I am planning to do something like this,
> > >
> > > @@ -52,7 +52,7 @@ int __init pci_swiotlb_detect(void)
> > >
> > >          /* don't initialize swiotlb if iommu=off (no_iommu=1) */
> > >  #ifdef CONFIG_X86_64
> > > -        if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> > > +        if (!no_iommu && (max_pfn > MAX_DMA32_PFN || hotplug_possible()))
> > >                  swiotlb = 1;
> >
> > Always enable swiotlb with memory hotplug enabled?

Yep, though only on systems which have hotpluggable memory support.

> > Wasting 64MB on a x86_64 system with 128MB doesn't look to be a good
> > idea. I don't think that there is an easy solution for this issue
> > though.

Good, now that you agree that that's the only feasible solution, do you
have any suggestions for interfaces available from SRAT for implementing
hotplug_possible()?

> btw, you need more work to enable switch on the fly.
> You need to change the dma_ops pointer (see get_dma_ops()). It means
> that you need to track outstanding dma operations per device, locking,
> etc.

Yeah, though if we are doing this during swiotlb_init time, i.e. at
bootup as suggested in the pseudo patch, we don't need to worry about
all this, right?

Thanks,
Alok
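For concreteness, a minimal sketch of what the hotplug_possible() helper
being discussed could look like, assuming the SRAT parser is extended to
record whether it saw any hot-pluggable memory range; both names below are
illustrative, not existing kernel interfaces:

    /*
     * Illustrative only: srat_mem_hotplug_seen would be set by the SRAT
     * parser when it encounters a memory affinity entry carrying the
     * hot-pluggable flag.
     */
    extern int srat_mem_hotplug_seen;

    int __init hotplug_possible(void)
    {
            /*
             * If firmware declared any hot-pluggable range, memory above
             * 4GB may appear after boot, so reserve the swiotlb bounce
             * buffers now, while memory below 4GB is still available.
             */
            return srat_mem_hotplug_seen;
    }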
From: Konrad Rzeszutek Wilk on 23 Jul 2010 10:30

On Thu, Jul 22, 2010 at 11:34:40AM -0700, Alok Kataria wrote:
> Hi,
>
> On Wed, 2010-07-21 at 17:03 -0700, FUJITA Tomonori wrote:
> > On Thu, 22 Jul 2010 08:44:42 +0900
> > FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:
> >
> > > On Wed, 21 Jul 2010 10:13:34 -0700
> > > Alok Kataria <akataria(a)vmware.com> wrote:
> > >
> > > > > Basically, you want to add hot-plug memory and enable swiotlb, right?
> > > >
> > > > Not really, I am planning to do something like this,
> > > >
> > > > @@ -52,7 +52,7 @@ int __init pci_swiotlb_detect(void)
> > > >
> > > >          /* don't initialize swiotlb if iommu=off (no_iommu=1) */
> > > >  #ifdef CONFIG_X86_64
> > > > -        if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> > > > +        if (!no_iommu && (max_pfn > MAX_DMA32_PFN || hotplug_possible()))
> > > >                  swiotlb = 1;
> > >
> > > Always enable swiotlb with memory hotplug enabled?
>
> Yep, though only on systems which have hotpluggable memory support.

What machines are there that have hotplug support and no hardware IOMMU?
I know of the IBM ones - but they use the Calgary IOMMU.

> > > Wasting 64MB on a x86_64 system with 128MB doesn't look to be a good
> > > idea. I don't think that there is an easy solution for this issue
> > > though.
>
> Good, now that you agree that that's the only feasible solution, do you
> have any suggestions for interfaces available from SRAT for implementing
> hotplug_possible()?

I thought SRAT has NUMA affinity information - so for example my AMD
desktop box has that, but it does not support hotplug capability.

I think first your 'hotplug_possible' code needs to be more specific -
not just check if SRAT exists, but also if there are swaths of memory
that are non-populated. It would also help if there was some indication
of whether the box truly does a hardware hotplug - is there a way to do
this?

> > btw, you need more work to enable switch on the fly.
> > You need to change the dma_ops pointer (see get_dma_ops()). It means
> > that you need to track outstanding dma operations per device, locking,
> > etc.
>
> Yeah, though if we are doing this during swiotlb_init time, i.e. at
> bootup as suggested in the pseudo patch, we don't need to worry about
> all this, right?

Right..

> Thanks,
> Alok
From: Andi Kleen on 23 Jul 2010 10:40

> I thought SRAT has NUMA affinity information - so for example my AMD
> desktop box has that, but it does not support hotplug capability.
>
> I think first your 'hotplug_possible' code needs to be more specific -
> not just check if SRAT exists, but also if there are swaths of memory
> that are non-populated. It would also help if there was some indication
> of whether the box truly does a hardware hotplug - is there a way to do
> this?

The SRAT declares hotplug memory ranges in advance. And Linux already
uses this information in the SRAT parser (just the code for doing this
is a bit dumb, I have a rewrite somewhere).

The only drawback is that some older systems claimed to have large
hotplug memory ranges when they didn't actually support it. So it's
better to not do anything with a lot of overhead.

So yes, it would be reasonable to let swiotlb (and possibly other code
sizing itself based on memory) call into the SRAT parser and check the
hotplug ranges too.

BTW, longer term swiotlb should really be more dynamic anyway and grow
and shrink on demand. I attempted this some time ago with my DMA
allocator patchkit, unfortunately that didn't go forward.

-Andi
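The hot-plug ranges Andi refers to are the SRAT memory affinity entries
with the hot-pluggable flag set. A sketch of how the parser could record
that for swiotlb detection follows - the recorded variable and function
name are hypothetical, while struct acpi_srat_mem_affinity and its flag
bits are the real ACPI table definitions:

    #include <acpi/actbl1.h>

    /* Hypothetical: consumed later by hotplug_possible(). */
    int srat_mem_hotplug_seen __initdata;

    /* Would be called for each SRAT memory affinity entry during parsing. */
    static void __init note_memory_affinity(struct acpi_srat_mem_affinity *ma)
    {
            if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
                    return;
            /* Firmware marks ranges that may only be populated later. */
            if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)
                    srat_mem_hotplug_seen = 1;
            /* ... existing node/range bookkeeping continues here ... */
    }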
From: Konrad Rzeszutek Wilk on 23 Jul 2010 11:10

On Fri, Jul 23, 2010 at 04:33:32PM +0200, Andi Kleen wrote:
> > I thought SRAT has NUMA affinity information - so for example my AMD
> > desktop box has that, but it does not support hotplug capability.
> >
> > I think first your 'hotplug_possible' code needs to be more specific -
> > not just check if SRAT exists, but also if there are swaths of memory
> > that are non-populated. It would also help if there was some indication
> > of whether the box truly does a hardware hotplug - is there a way to do
> > this?
>
> The SRAT declares hotplug memory ranges in advance. And Linux already
> uses this information in the SRAT parser (just the code for doing this
> is a bit dumb, I have a rewrite somewhere).
>
> The only drawback is that some older systems claimed to have large
> hotplug memory ranges when they didn't actually support it. So it's
> better to not do anything with a lot of overhead.
>
> So yes, it would be reasonable to let swiotlb (and possibly other code
> sizing itself based on memory) call into the SRAT parser and check the
> hotplug ranges too.
>
> BTW, longer term swiotlb should really be more dynamic anyway and grow
> and shrink on demand. I attempted this some time ago with my DMA

I was thinking about this at some point. I think the first step is to
make SWIOTLB use debugfs to actually print out how much of its buffers
are used - and see if the 64MB is a good fit.

The shrinking part scares me - I think it might be more prudent to first
explore how to grow it. The big problem looks to be allocating a
physically contiguous set of pages. And I guess SWIOTLB would need to
change from using one big region to something of a pool system?

> allocator patchkit,
> unfortunately that didn't go forward.

I wasn't present at that time so I don't know what the issues were - you
wouldn't have a link to LKML for this?
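A sketch of the debugfs instrumentation idea, assuming usage counters
maintained from the swiotlb bounce-buffer allocation and free paths; the
counter names and file layout here are made up for illustration:

    #include <linux/debugfs.h>
    #include <linux/init.h>

    /*
     * Hypothetical counters, to be updated wherever swiotlb claims and
     * releases bounce-buffer slots.
     */
    static u64 io_tlb_used;
    static u64 io_tlb_used_max;

    static int __init swiotlb_debugfs_init(void)
    {
            struct dentry *dir = debugfs_create_dir("swiotlb", NULL);

            /*
             * Expose current and high-water usage, so the 64MB default
             * can be compared against what the workload actually needs.
             */
            debugfs_create_u64("io_tlb_used", 0444, dir, &io_tlb_used);
            debugfs_create_u64("io_tlb_used_max", 0444, dir, &io_tlb_used_max);
            return 0;
    }
    late_initcall(swiotlb_debugfs_init);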
From: Andi Kleen on 23 Jul 2010 11:30
> I was thinking about this at some point. I think the first step is to
> make SWIOTLB use debugfs to actually print out how much of its buffers
> are used - and see if the 64MB is a good fit.

swiotlb is nearly always wrongly sized. For most systems it's far too
much, but for some not enough. I have some systemtap scripts around to
instrument it.

Also it depends on the IO load, so if you size it reasonably you risk
overflow on large IO (however, these days this very rarely happens
because all "serious" IO devices don't need swiotlb anymore).

The other problem is that using only two bits for the needed address
space is also extremely inefficient (4GB and 16MB on x86). We really
want masks everywhere, optimizing for the actual requirements.

> The shrinking part scares me - I think it might be more prudent to first
> explore how to grow it. The big problem looks to be allocating a
> physically contiguous set of pages. And I guess SWIOTLB would need to
> change from using one big region to something of a pool system?

Shrinking: you define a movable zone, so with some delay it can always
be freed. The problem with swiotlb is, however, that it still cannot
block, though it can adapt to load. The real fix would be a blockable
swiotlb, but the way drivers are set up this is difficult (at least in
kernels using spinlocks).

> > allocator patchkit,
> > unfortunately that didn't go forward.
>
> I wasn't present at that time so I don't know what the issues were - you
> wouldn't have a link to LKML for this?

There wasn't all that much opposition, but I ran out of time because
fixing the infrastructure (e.g. getting rid of all of GFP_DMA) is a lot
of work. In a sense it's a big tree-sweep project, like getting rid of
the BKL.

The old patchkit is at ftp://firstfloor.org/pub/ak/dma/
"intro" has the rationale. I have a slightly newer version of the SCSI
& misc drivers patchkit somewhere.

-Andi
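To illustrate the two-bit point: the page allocator only knows the 16MB
(GFP_DMA) and 4GB (GFP_DMA32) boundaries, so a device's real DMA mask gets
rounded to one of them. A conceptual sketch - dma_get_mask() and
DMA_BIT_MASK() are real kernel interfaces, while the mask-taking allocator
at the end is only the hypothetical direction of the patchkit:

    #include <linux/dma-mapping.h>
    #include <linux/gfp.h>

    /*
     * Today: only two hard boundaries exist, so an odd mask such as
     * 30 bits is handled pessimistically or by allocate-check-retry.
     */
    static gfp_t zone_flags_for(struct device *dev)
    {
            u64 mask = dma_get_mask(dev);

            if (mask <= DMA_BIT_MASK(24))
                    return GFP_DMA;     /* 16MB zone */
            if (mask <= DMA_BIT_MASK(32))
                    return GFP_DMA32;   /* 4GB zone; can still overshoot
                                           a 30-bit device's limit */
            return 0;                   /* no zone restriction needed */
    }

    /*
     * Rough direction of the DMA allocator patchkit: pass the actual
     * mask down so the allocator can optimize for the real requirement.
     * Hypothetical interface, not a kernel API:
     */
    struct page *alloc_pages_mask(gfp_t gfp, unsigned int order, u64 mask);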