From: Tom Lyon on 1 Apr 2010 11:50

On Thursday 01 April 2010 02:09:09 am Avi Kivity wrote:
> On 04/01/2010 03:08 AM, Tom Lyon wrote:
> > uio_pci_generic has previously been discussed on the KVM list, but this
> > patch has nothing to do with KVM, so it is also going to LKML.
>
> (needs to go to lkml even if it was for kvm)
>
> > The point of this patch is to beef up the uio_pci_generic driver so that
> > a non-privileged user process can run a user level driver for most PCIe
> > devices. This can only be safe if there is an IOMMU in the system with
> > per-device domains. Privileged users (CAP_SYS_RAWIO) are allowed if
> > there is no IOMMU.
> >
> > Specifically, I seek to allow low-latency user level network drivers (non
> > tcp/ip) which directly access SR-IOV style virtual network adapters, for
> > use with packages such as OpenMPI.
> >
> > Key areas of change:
> > - ioctl extensions to allow registration and dma mapping of memory
> >   regions, with lock accounting
> > - support for mmu notifier driven de-mapping
>
> Note that current iommus/devices don't support restart-on-fault dma, so
> userspace drivers will have to lock memory so that it is not swapped
> out. I don't think this prevents page migration, though.

The driver provides a way to lock memory for DMA; the mmu notifier support is
to catch things when the user accidentally frees locked pages.

> > - support for MSI and MSI-X interrupts (the intel 82599 VFs support only
> >   MSI-X)
>
> How does a userspace program receive those interrupts?

Same as other UIO drivers - by read()ing an event counter.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
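For readers unfamiliar with the UIO convention referred to above ("by read()ing an event counter"), here is a minimal userspace sketch. It relies on the standard UIO behaviour that a read() of exactly 4 bytes blocks until the next interrupt and returns the running 32-bit event count. The helper names uio_wait_event and uio_events_since are hypothetical and are not part of the patch under discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Block until the next interrupt event on a UIO file descriptor and
 * store the running event count in *count.
 * Returns 0 on success, -1 on error or short read.
 * (Hypothetical helper name, for illustration only.)
 */
int uio_wait_event(int fd, uint32_t *count)
{
    ssize_t n;

    assert(count != NULL);
    do {
        /* UIO expects the read size to be exactly 4 bytes. */
        n = read(fd, count, sizeof(*count));
    } while (n < 0 && errno == EINTR);

    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}

/*
 * Number of events between two successive counter values. Unsigned
 * subtraction stays correct when the 32-bit counter wraps around.
 */
uint32_t uio_events_since(uint32_t prev, uint32_t now)
{
    return now - prev;
}
```

A driver loop would typically open a node such as /dev/uio0 (the device number is an assumption here), call uio_wait_event() repeatedly, and use uio_events_since() to notice when more than one interrupt was coalesced into a single wakeup.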
From: Tom Lyon on 1 Apr 2010 11:50

On Thursday 01 April 2010 05:52:18 am Joerg Roedel wrote:
> On Wed, Mar 31, 2010 at 05:08:38PM -0700, Tom Lyon wrote:
> > uio_pci_generic has previously been discussed on the KVM list, but this
> > patch has nothing to do with KVM, so it is also going to LKML.
>
> But since you send it to the KVM list it should be suitable for KVM too,
> no?

I know not.

> > The point of this patch is to beef up the uio_pci_generic driver so that
> > a non-privileged user process can run a user level driver for most PCIe
> > devices. This can only be safe if there is an IOMMU in the system with
> > per-device domains. Privileged users (CAP_SYS_RAWIO) are allowed if
> > there is no IOMMU.
>
> If you rely on an IOMMU you can use the IOMMU-API instead of the DMA-API
> for dma mappings. This change makes this driver suitable for KVM use
> too. If the interface is designed clever enough we can even use it for
> IOMMU emulation for pass-through devices.

The use with privileged processes and no IOMMUs is still quite useful, so I'd
rather stick with the DMA interface.
From: Tom Lyon on 1 Apr 2010 12:10

On Thursday 01 April 2010 08:54:14 am Avi Kivity wrote:
> On 04/01/2010 06:39 PM, Tom Lyon wrote:
> >>> - support for MSI and MSI-X interrupts (the intel 82599 VFs support
> >>>   only MSI-X)
> >>
> >> How does a userspace program receive those interrupts?
> >
> > Same as other UIO drivers - by read()ing an event counter.
>
> IIRC the usual event counter is /dev/uioX, what's your event counter now?

Exact same mechanism.

> kvm really wants the event counter to be an eventfd, that allows hooking
> it directly to kvm (which can inject an interrupt on an eventfd_signal),
> can you adapt your patch to do this?

My patch does not currently go anywhere near the read/fd logic of /dev/uioX.
I think a separate patch would be appropriate.
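To make the eventfd request above concrete, the following sketch shows only the userspace semantics of an eventfd counter on Linux. It is not code from the patch, and the helper names event_signal and event_drain are hypothetical. The write() stands in for what eventfd_signal() would do on the kernel side. One difference from the /dev/uioX counter is visible here: reading an eventfd returns the pending count and resets it to zero, whereas the UIO counter is a running total.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Add one to the eventfd counter. In userspace this plays the role
 * that eventfd_signal() plays inside the kernel.
 * Returns 0 on success, -1 on failure.
 */
int event_signal(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Fetch the number of pending events and reset the counter to zero.
 * On a non-blocking eventfd with nothing pending, read() fails with
 * EAGAIN and this returns -1.
 */
int event_drain(int efd, uint64_t *pending)
{
    assert(pending != NULL);
    return read(efd, pending, sizeof(*pending)) == (ssize_t)sizeof(*pending)
               ? 0 : -1;
}
```

Created with eventfd(0, EFD_NONBLOCK), such a descriptor can also be handed to poll() or epoll alongside other descriptors.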
From: Tom Lyon on 1 Apr 2010 12:10

On Thursday 01 April 2010 07:25:04 am Michael S. Tsirkin wrote:
> On Wed, Mar 31, 2010 at 05:08:38PM -0700, Tom Lyon wrote:
> > uio_pci_generic has previously been discussed on the KVM list, but this
> > patch has nothing to do with KVM, so it is also going to LKML.
> >
> > The point of this patch is to beef up the uio_pci_generic driver so that
> > a non-privileged user process can run a user level driver for most PCIe
> > devices. This can only be safe if there is an IOMMU in the system with
> > per-device domains.
>
> Why? Per-guest domain should be safe enough.

I'm not sure what 'per-guest' means in an ordinary process context.

> > Privileged users (CAP_SYS_RAWIO) are allowed if there is
> > no IOMMU.
>
> qemu does not support it, I doubt this last option is worth having.

This is extremely useful in non IOMMU systems - again, we're talking ordinary
processes, nothing to do with VMs. As long as the program can be trusted, e.g.,
in embedded apps.

> > Specifically, I seek to allow low-latency user level network drivers (non
> > tcp/ip) which directly access SR-IOV style virtual network adapters, for
> > use with packages such as OpenMPI.
> >
> > Key areas of change:
> > - ioctl extensions to allow registration and dma mapping of memory
> >   regions, with lock accounting
> > - support for mmu notifier driven de-mapping
> > - support for MSI and MSI-X interrupts (the intel 82599 VFs support only
> >   MSI-X)
> > - allowing interrupt enabling and device register mapping all
> >   through /dev/uio* so that permissions may be granted just by chmod
> >   on /dev/uio*
>
> For non-privileged users, we need a way to enforce that
> device is bound to an iommu.

Right now I just use iommu_found - assuming that if we have one, it is in use.
Something better would be nice.

> Further, locking really needs to be scoped with iommu domain existence
> and with iommu mappings: as long as a page is mapped in iommu,
> it must be locked.
> This patch does not seem to enforce that.

Sure it does. The DMA API - get_user_pages and dma_map_sg lock pages into the
MMU and the IOMMU. The MMU notifier unlocks if the user forgets to do it
explicitly.

> Also note that what we really want is a single iommu domain per guest,
> not per device.

For my networking applications, I will need the ability to talk to multiple
devices on potentially separate IOMMUs. What would per-guest mean then?

> For this reason, I think we should address the problem somewhat
> differently:
> - Create a character device to represent the iommu
> - This device will handle memory locking etc
> - Allow binding this device to iommu
> - Allow other operations only after iommu is bound

There are still per-device issues with locking - in particular the size of the
device's DMA address space. The DMA API already handles this - why not use it?

It would be nice to have a way to test whether a device is truly covered by an
IOMMU, but today it appears that if an IOMMU exists, then it covers all devices
(at least as far as I can see for x86).
From: Tom Lyon on 1 Apr 2010 15:30

On Thursday 01 April 2010 09:10:57 am Avi Kivity wrote:
> On 04/01/2010 07:06 PM, Tom Lyon wrote:
> > On Thursday 01 April 2010 08:54:14 am Avi Kivity wrote:
> >> On 04/01/2010 06:39 PM, Tom Lyon wrote:
> >>>>> - support for MSI and MSI-X interrupts (the intel 82599 VFs support
> >>>>>   only MSI-X)
> >>>>
> >>>> How does a userspace program receive those interrupts?
> >>>
> >>> Same as other UIO drivers - by read()ing an event counter.
> >>
> >> IIRC the usual event counter is /dev/uioX, what's your event counter
> >> now?
> >
> > Exact same mechanism.
>
> But there are multiple msi-x interrupts, how do you know which one
> triggered?

You don't. This would suck for KVM, I guess, but we'd need major rework of the
generic UIO stuff to have a separate event channel for each MSI-X.

For my purposes, collapsing all the MSI-Xs into one MSI-look-alike is fine,
because I'd be using MSI anyways if I could. The weird Intel 82599 VF only
supports MSI-X.

So one big question is - do we expand the whole UIO framework for KVM
requirements, or do we split off either KVM or non-VM into a separate driver?
Hans or Greg - care to opine?
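As an illustration of what "a separate event channel for each MSI-X" could look like from userspace, here is a sketch that assumes one eventfd per vector. No such interface exists in the patch as described. The function vectors_fired, the MAX_VECTORS limit and the mapping of array index to vector number are all hypothetical. The point is only that with one descriptor per vector, poll() reports which vector triggered, which the single collapsed counter cannot do.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MAX_VECTORS 64  /* arbitrary limit for this sketch */

/*
 * Poll nvec eventfds (one per hypothetical MSI-X vector) and return a
 * bitmask with bit i set if efds[i] had pending events. Each ready
 * descriptor is drained. Returns 0 on timeout or poll() failure.
 */
uint64_t vectors_fired(const int *efds, int nvec, int timeout_ms)
{
    struct pollfd pfds[MAX_VECTORS];
    uint64_t mask = 0;

    assert(nvec > 0 && nvec <= MAX_VECTORS);
    for (int i = 0; i < nvec; i++) {
        pfds[i].fd = efds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t)nvec, timeout_ms) <= 0)
        return 0;

    for (int i = 0; i < nvec; i++) {
        uint64_t cnt;

        /* Reading resets the per-vector counter to zero. */
        if ((pfds[i].revents & POLLIN) &&
            read(efds[i], &cnt, sizeof(cnt)) == (ssize_t)sizeof(cnt))
            mask |= UINT64_C(1) << i;
    }
    return mask;
}
```

With the single-counter approach described in the message, the equivalent call can only say that some vector fired, which is sufficient when all vectors are treated as one MSI-style interrupt.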