From: Avi Kivity on 2 Jun 2010 07:30 On 06/02/2010 01:04 PM, Joerg Roedel wrote: > On Wed, Jun 02, 2010 at 12:49:28PM +0300, Avi Kivity wrote: > >> On 06/02/2010 12:45 PM, Joerg Roedel wrote: >> >>> IOMMU mapped memory can not be swapped out because we can't do demand >>> paging on io-page-faults with current devices. We have to pin _all_ >>> userspace memory that is mapped into an IOMMU domain. >>> >> vhost doesn't pin memory. >> >> What I proposed is to describe the memory map using an object (fd), and >> pass it around to clients that use it: kvm, vhost, vfio. That way you >> maintain the memory map in a central location and broadcast changes to >> clients. Only a vfio client would result in memory being pinned. >> > Ah ok, so its only about the database which keeps the mapping > information. > Yes. > >> It can still work, but the interface needs to be extended to include >> dirty bitmap logging. >> > Thats hard to do. I am not sure about VT-d but the AMD IOMMU has no > dirty-bits in the page-table. And without demand-paging we can't really > tell what pages a device has written to. The only choice is to mark all > IOMMU-mapped pages dirty as long as they are mapped. > > The interface would only work for clients which support it: kvm, vhost, and iommu/devices with restartable dma. Note dirty logging is not very interesting for vfio anyway, since you can't live migrate with assigned devices. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Joerg Roedel on 2 Jun 2010 08:20 On Wed, Jun 02, 2010 at 02:21:00PM +0300, Michael S. Tsirkin wrote: > On Wed, Jun 02, 2010 at 01:12:25PM +0200, Joerg Roedel wrote: > > Even if it is bound to a domain the userspace driver could program the > > device to do dma to unmapped regions causing io-page-faults. The kernel > > can't do anything about it. > > It can always corrupt its own memory directly as well :) > But that is not a reason not to detect errors if we can, > and not to make APIs hard to misuse. Changing the domain of a device while dma can happen is the same type of bug as unmapping potential dma target addresses. We can't catch this kind of misuse. > > > With 10 devices you have 10 extra ioctls. > > > > And this works implicitly with your proposal? > > Yes. so you do: > iommu = open > ioctl(dev1, BIND, iommu) > ioctl(dev2, BIND, iommu) > ioctl(dev3, BIND, iommu) > ioctl(dev4, BIND, iommu) > > No need to add a SHARE ioctl. In my proposal this looks like: dev1 = open(); ioctl(dev2, SHARE, dev1); ioctl(dev3, SHARE, dev1); ioctl(dev4, SHARE, dev1); So we actually save an ioctl. > > Remember that we still need to be able to provide seperate mappings > > for each device to support IOMMU emulation for the guest. > > Generally not true. E.g. guest can enable iommu passthrough > or have domain per a group of devices. What I meant was that there may me multiple io-addresses spaces necessary for one process. I didn't want to say that every device _needs_ to have its own address space. > > As I wrote the domain has a reference count and is destroyed only when > > it goes down to zero. This does not happen as long as a device is bound > > to it. > > > > Joerg > > We were talking about UNSHARE ioctl: > ioctl(dev1, UNSHARE, dev2) > Does it change the domain for dev1 or dev2? > If you make a mistake you get a hard to debug bug. As I already wrote we would have an UNBIND ioctl which just removes a device from its current domain. UNBIND is better than UNSHARE for exactly the reason you pointed out above. I thought I stated that already. Joerg -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Avi Kivity on 2 Jun 2010 08:30 On 06/02/2010 03:19 PM, Joerg Roedel wrote: > >> Yes. so you do: >> iommu = open >> ioctl(dev1, BIND, iommu) >> ioctl(dev2, BIND, iommu) >> ioctl(dev3, BIND, iommu) >> ioctl(dev4, BIND, iommu) >> >> No need to add a SHARE ioctl. >> > In my proposal this looks like: > > > dev1 = open(); > ioctl(dev2, SHARE, dev1); > ioctl(dev3, SHARE, dev1); > ioctl(dev4, SHARE, dev1); > > So we actually save an ioctl. > The problem with this is that it is assymetric, dev1 is treated differently from dev[234]. It's an unintuitive API. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Michael S. Tsirkin on 2 Jun 2010 08:40 On Wed, Jun 02, 2010 at 02:19:28PM +0200, Joerg Roedel wrote: > On Wed, Jun 02, 2010 at 02:21:00PM +0300, Michael S. Tsirkin wrote: > > On Wed, Jun 02, 2010 at 01:12:25PM +0200, Joerg Roedel wrote: > > > > Even if it is bound to a domain the userspace driver could program the > > > device to do dma to unmapped regions causing io-page-faults. The kernel > > > can't do anything about it. > > > > It can always corrupt its own memory directly as well :) > > But that is not a reason not to detect errors if we can, > > and not to make APIs hard to misuse. > > Changing the domain of a device while dma can happen is the same type of > bug as unmapping potential dma target addresses. We can't catch this > kind of misuse. you normally need device mapped to start DMA. SHARE makes this bug more likely as you allow switching domains: mmap could be done before switching. > > > > With 10 devices you have 10 extra ioctls. > > > > > > And this works implicitly with your proposal? > > > > Yes. so you do: > > iommu = open > > ioctl(dev1, BIND, iommu) > > ioctl(dev2, BIND, iommu) > > ioctl(dev3, BIND, iommu) > > ioctl(dev4, BIND, iommu) > > > > No need to add a SHARE ioctl. > > In my proposal this looks like: > > > dev1 = open(); > ioctl(dev2, SHARE, dev1); > ioctl(dev3, SHARE, dev1); > ioctl(dev4, SHARE, dev1); > > So we actually save an ioctl. I thought we had a BIND ioctl? > > > Remember that we still need to be able to provide seperate mappings > > > for each device to support IOMMU emulation for the guest. > > > > Generally not true. E.g. guest can enable iommu passthrough > > or have domain per a group of devices. > > What I meant was that there may me multiple io-addresses spaces > necessary for one process. I didn't want to say that every device > _needs_ to have its own address space. > > > > As I wrote the domain has a reference count and is destroyed only when > > > it goes down to zero. This does not happen as long as a device is bound > > > to it. > > > > > > Joerg > > > > We were talking about UNSHARE ioctl: > > ioctl(dev1, UNSHARE, dev2) > > Does it change the domain for dev1 or dev2? > > If you make a mistake you get a hard to debug bug. > > As I already wrote we would have an UNBIND ioctl which just removes a > device from its current domain. UNBIND is better than UNSHARE for > exactly the reason you pointed out above. I thought I stated that > already. > > Joerg You undo SHARE with UNBIND? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Joerg Roedel on 2 Jun 2010 09:00
On Wed, Jun 02, 2010 at 03:25:11PM +0300, Avi Kivity wrote: > On 06/02/2010 03:19 PM, Joerg Roedel wrote: >> >>> Yes. so you do: >>> iommu = open >>> ioctl(dev1, BIND, iommu) >>> ioctl(dev2, BIND, iommu) >>> ioctl(dev3, BIND, iommu) >>> ioctl(dev4, BIND, iommu) >>> >>> No need to add a SHARE ioctl. >>> >> In my proposal this looks like: >> >> >> dev1 = open(); >> ioctl(dev2, SHARE, dev1); >> ioctl(dev3, SHARE, dev1); >> ioctl(dev4, SHARE, dev1); >> >> So we actually save an ioctl. >> > > The problem with this is that it is assymetric, dev1 is treated > differently from dev[234]. It's an unintuitive API. Its by far more unintuitive that a process needs to explicitly bind a device to an iommu domain before it can do anything with it. If its required anyway the binding can happen implicitly. We could allow to do a nop 'ioctl(dev1, SHARE, dev1)' to remove the asymmetry. Note that this way of handling userspace iommu mappings is also a lot simpler for most use-cases outside of KVM. If a developer wants to write a userspace driver all it needs to do is: dev = open(); ioctl(dev, MAP, ...); /* use device with mappings */ close(dev); Which is much easier than the need to create a domain explicitly. Joerg -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |