From: Avi Kivity on 2 Jun 2010 01:50

On 06/02/2010 08:29 AM, Chris Wright wrote:
> * Avi Kivity (avi(a)redhat.com) wrote:
>> On 06/02/2010 12:26 AM, Tom Lyon wrote:
>>> I'm not really opposed to multiple devices per domain, but let me
>>> point out how I ended up here. First, the driver has two ways of
>>> mapping pages, one based on the iommu api and one based on the
>>> dma_map_sg api. With the latter, the system already allocates a
>>> domain per device and there's no way to control it. This was
>>> presumably done to help isolation between drivers. If there are
>>> multiple drivers in the user level, do we not want the same
>>> isolation to apply to them?
>>
>> In the case of kvm, we don't want isolation between devices, because
>> that doesn't happen on real hardware.
>
> Sure it does. That's exactly what happens when there's an iommu
> involved with bare metal.

But we are emulating a machine without an iommu. When we emulate a
machine with an iommu, then yes, we'll want to use as many domains as
the guest does.

>> So if the guest programs devices to dma to each other, we want that
>> to succeed.
>
> And it will as long as ATS is enabled (this is a basic requirement
> for PCIe peer-to-peer traffic to succeed with an iommu involved on
> bare metal).
>
> That's how things currently are, i.e. we put all devices belonging to a
> single guest in the same domain. However, it can be useful to put each
> device belonging to a guest in a unique domain. Especially as qemu
> grows support for iommu emulation, and guest OSes begin to understand
> how to use a hw iommu.

Right, we need to keep flexibility.

>>> And then there's the fact that it is possible to have multiple
>>> disjoint iommus on a system, so it may not even be possible to
>>> bring 2 devices under one domain.
>>
>> That's indeed a deficiency.
>
> Not sure it's a deficiency. Typically to share page table mappings
> across multiple iommu's you just have to do update/invalidate to each
> hw iommu that is sharing the mapping. Alternatively, you can use more
> memory and build/maintain identical mappings (as Tom alludes to below).

Sharing the page tables is just an optimization; I was worried about
devices in separate domains not talking to each other. If ATS fixes
that, great.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
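[For context: the in-kernel IOMMU API the thread refers to already lets
several devices share one domain, which is how KVM device assignment
keeps a guest's devices in a single address space. A minimal sketch,
assuming the 2.6.3x-era signatures from include/linux/iommu.h; they
differ in later kernels.]

/*
 * Sketch: placing two assigned devices in one IOMMU domain, so they
 * share IOVA->phys mappings and peer-to-peer DMA behaves as on bare
 * metal. Signatures assume the 2.6.3x-era include/linux/iommu.h.
 */
#include <linux/iommu.h>
#include <linux/pci.h>

static int assign_two_devices(struct pci_dev *a, struct pci_dev *b)
{
	struct iommu_domain *domain;
	int ret;

	domain = iommu_domain_alloc();	/* one domain for the whole guest */
	if (!domain)
		return -ENOMEM;

	ret = iommu_attach_device(domain, &a->dev);
	if (ret)
		goto out_free;
	ret = iommu_attach_device(domain, &b->dev);
	if (ret)
		goto out_detach;
	return 0;	/* both devices now share the domain's mappings */

out_detach:
	iommu_detach_device(domain, &a->dev);
out_free:
	iommu_domain_free(domain);
	return ret;
}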
From: Joerg Roedel on 2 Jun 2010 05:50

On Tue, Jun 01, 2010 at 12:55:32PM +0300, Michael S. Tsirkin wrote:
> There seems to be some misunderstanding. The userspace interface
> proposed forces a separate domain per device and forces userspace to
> repeat iommu programming for each device. We are better off sharing a
> domain between devices and programming the iommu once.
>
> The natural way to do this is to have an iommu driver for programming
> iommu.

IMO a separate iommu-userspace driver is a nightmare for a userspace
interface. It is just too complicated to use. We can solve the problem
of multiple devices-per-domain with an ioctl which allows binding one
uio-device to the address-space of another. That's much simpler.

	Joerg
From: Michael S. Tsirkin on 2 Jun 2010 06:00

On Wed, Jun 02, 2010 at 11:42:01AM +0200, Joerg Roedel wrote:
> On Tue, Jun 01, 2010 at 12:55:32PM +0300, Michael S. Tsirkin wrote:
>> There seems to be some misunderstanding. The userspace interface
>> proposed forces a separate domain per device and forces userspace to
>> repeat iommu programming for each device. We are better off sharing a
>> domain between devices and programming the iommu once.
>>
>> The natural way to do this is to have an iommu driver for programming
>> iommu.
>
> IMO a separate iommu-userspace driver is a nightmare for a userspace
> interface. It is just too complicated to use.

One advantage would be that we can reuse the uio framework for the
devices themselves. So an existing app can just program an iommu for
DMA and keep using uio for interrupts and access.

> We can solve the problem of multiple devices-per-domain with an ioctl
> which allows binding one uio-device to the address-space of another.

This would imply switching an iommu domain for a device while it could
potentially be doing DMA. No idea whether this can be done in a safe
manner. Forcing iommu assignment to be done as a first step seems much
saner.

> That's much simpler.
>
> 	Joerg

So instead of

	dev = open();
	ioctl(dev, ASSIGN, iommu);
	mmap

where, if we forget the ioctl, the mmap will fail, we have

	dev = open();
	if (ndevices > 0)
		ioctl(devices[0], ASSIGN, dev);
	mmap

And if we forget the ioctl, we get errors from the device. Seems more
complicated to me. There will also always exist the confusion: the
address space of which device are we modifying? With a separate driver
for the iommu, we can safely check that binding is done correctly.

--
MST
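[A fuller illustration of the flow Michael argues for: the iommu fd is
created first and each device must be bound to it before its resources
can be mapped. All names here (/dev/iommu, /dev/vfio0,
VFIO_ASSIGN_IOMMU) are placeholders from the discussion, not an
existing kernel interface — a sketch only.]

/*
 * Hypothetical userspace flow: dedicated iommu fd first, device bound
 * to it before mmap. If the binding step is forgotten, mmap fails
 * instead of silently using the wrong address space.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

#define VFIO_ASSIGN_IOMMU 0	/* placeholder ioctl number */

int main(void)
{
	int iommu = open("/dev/iommu", O_RDWR);	/* one domain, shared */
	int dev = open("/dev/vfio0", O_RDWR);
	void *bar;

	if (iommu < 0 || dev < 0)
		return 1;

	/* Bind the device to the iommu domain as the mandatory first step. */
	if (ioctl(dev, VFIO_ASSIGN_IOMMU, iommu) < 0) {
		perror("VFIO_ASSIGN_IOMMU");
		return 1;
	}

	bar = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, dev, 0);
	if (bar == MAP_FAILED) {
		perror("mmap");	/* would fail if binding were skipped */
		return 1;
	}
	/* ... program the device; all its DMA goes through the domain ... */
	return 0;
}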
From: Joerg Roedel on 2 Jun 2010 06:00

On Tue, Jun 01, 2010 at 09:59:40PM -0700, Tom Lyon wrote:
> This is just what I was thinking. But rather than a get/set, just use
> two fds:
>
> 	ioctl(vfio_fd1, VFIO_SET_DOMAIN, vfio_fd2);
>
> This may fail if there are really 2 different IOMMUs, so user code
> must be prepared for failure. In addition, this is strictly upwards
> compatible with what is there now, so maybe we can add it later.

How can this fail with multiple IOMMUs? This should be handled
transparently by the IOMMU driver.

	Joerg
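[The alternative being debated, sketched with the failure handling Tom
asks for. VFIO_SET_DOMAIN is the ioctl proposed in this thread, not a
merged interface; the device paths and ioctl number are illustrative.]

/*
 * Hypothetical device-to-device binding: share fd2's address space
 * with fd1, falling back to per-device domains if binding fails.
 */
#include <fcntl.h>
#include <sys/ioctl.h>

#define VFIO_SET_DOMAIN 0	/* placeholder ioctl number */

int main(void)
{
	int vfio_fd1 = open("/dev/vfio0", O_RDWR);
	int vfio_fd2 = open("/dev/vfio1", O_RDWR);

	if (vfio_fd1 < 0 || vfio_fd2 < 0)
		return 1;

	if (ioctl(vfio_fd1, VFIO_SET_DOMAIN, vfio_fd2) < 0) {
		/*
		 * Be prepared for failure, e.g. if the two devices sit
		 * behind disjoint IOMMUs: fall back to programming each
		 * device's own domain with identical mappings.
		 */
	}
	return 0;
}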
From: Avi Kivity on 2 Jun 2010 06:00
On 06/02/2010 12:45 PM, Joerg Roedel wrote:
> On Tue, Jun 01, 2010 at 03:41:55PM +0300, Avi Kivity wrote:
>> On 06/01/2010 01:46 PM, Michael S. Tsirkin wrote:
>>> Main difference is that vhost works fine with unlocked
>>> memory, paging it in on demand. iommu needs to unmap
>>> memory when it is swapped out or relocated.
>>
>> So you'd just take the memory map and not pin anything. This way you
>> can reuse the memory map.
>>
>> But no, it doesn't handle the dirty bitmap, so no go.
>
> IOMMU mapped memory cannot be swapped out because we can't do demand
> paging on io-page-faults with current devices. We have to pin _all_
> userspace memory that is mapped into an IOMMU domain.

vhost doesn't pin memory. What I proposed is to describe the memory map
using an object (fd), and pass it around to clients that use it: kvm,
vhost, vfio. That way you maintain the memory map in a central location
and broadcast changes to clients. Only a vfio client would result in
memory being pinned.

It can still work, but the interface needs to be extended to include
dirty bitmap logging.

--
error compiling committee.c: too many arguments to function
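[A hypothetical sketch of the memory-map object Avi describes: one
central map, clients subscribing to changes, plus the dirty-logging
hook he says the interface still needs. None of these structures exist
in the kernel; this only illustrates the proposed shape.]

/*
 * Hypothetical "memory map as an fd" interface: a central table of
 * guest-physical to userspace mappings, broadcast to clients (kvm,
 * vhost, vfio) on change. Only a vfio client would pin pages.
 */
#include <stdint.h>

struct memmap_slot {
	uint64_t guest_phys;	 /* guest-physical base address */
	uint64_t userspace_addr; /* backing userspace virtual address */
	uint64_t size;
};

struct memmap_client_ops {
	/* Called on every map change; a vfio client would pin and
	 * (re)program its IOMMU domain here, kvm/vhost would not pin. */
	void (*map_changed)(const struct memmap_slot *slots, int nslots);

	/* The extension Avi asks for: report pages a device dirtied,
	 * so live migration keeps working with assigned devices. */
	void (*log_dirty)(uint64_t guest_phys, uint64_t size);
};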