From: Anthony Liguori on 16 Mar 2010 13:40

On 03/16/2010 10:52 AM, Ingo Molnar wrote:
> You are quite mistaken: KVM isnt really a 'random unprivileged application' in
> this context, it is clearly an extension of system/kernel services.
>
> ( Which can be seen from the simple fact that what started the discussion was
>   'how do we get /proc/kallsyms from the guest'. I.e. an extension of the
>   existing host-space /proc/kallsyms was desired. )

Random tools (like perf) should not be able to do what you describe. It's a
security nightmare.

If it's desirable to have /proc/kallsyms available, we can expose an interface
in QEMU to provide that. That can then be plumbed through libvirt and QMP.

Then a management tool can use libvirt or QMP to obtain that information and
interact with the kernel appropriately.

> In that sense the most natural 'extension' would be the solution i mentioned a
> week or two ago: to have a (read only) mount of all guest filesystems, plus a
> channel for profiling/tracing data. That would make symbol parsing easier and
> it's what extends the existing 'host space' abstraction in the most natural
> way.
>
> ( It doesnt even have to be done via the kernel - Qemu could implement that
>   via FUSE for example. )

No way. The guest has sensitive data and exposing it widely on the host is a
bad thing to do. It's a bad interface. We can expose specific information
about guests but only through our existing channels which are validated
through a security infrastructure.

Ultimately, your goal is to keep perf a simple tool with little dependencies.
But practically speaking, if you want to add features to it, it's going to
have to interact with other subsystems in the appropriate way. That means,
it's going to need to interact with libvirt or QMP.

If you want all applications to expose their data via synthetic file systems,
then there's always plan9 :-)

Regards,

Anthony Liguori
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Ingo Molnar on 16 Mar 2010 13:50

* Anthony Liguori <anthony(a)codemonkey.ws> wrote:

> On 03/16/2010 08:08 AM, Ingo Molnar wrote:
> >* Avi Kivity<avi(a)redhat.com> wrote:
> >
> >>On 03/16/2010 02:29 PM, Ingo Molnar wrote:
> >>>I mean, i can trust a kernel service and i can trust /proc/kallsyms.
> >>>
> >>>Can perf trust a random process claiming to be Qemu? What's the trust
> >>>mechanism here?
> >>Obviously you can't trust anything you get from a guest, no matter how you
> >>get it.
> >I'm not talking about the symbol strings and addresses, and the object
> >contents for allocation (or debuginfo). I'm talking about the basic protocol
> >of establishing which guest is which.
> >
> >I.e. we really want to be able users to:
> >
> > 1) have it all working with a single guest, without having to specify 'which'
> >    guest (qemu PID) to work with. That is the dominant usecase both for
> >    developers and for a fair portion of testers.
>
> You're making too many assumptions.
>
> There is no list of guests anymore than there is a list of web browsers.
>
> You can have a multi-tenant scenario where you have distinct groups of
> virtual machines running as unprivileged users.

"multi-tenant" and groups is not a valid excuse at all for giving crappy
technology in the simplest case: when there's a single VM. Yes, eventually it
can be supported and any sane scheme will naturally support it too, but it's
by no means what we care about primarily when it comes to these tools.

I thought everyone learned the lesson behind SystemTap's failure (and to a
certain degree this was behind Oprofile's failure as well): when it comes to
tooling/instrumentation we dont want to concentrate on the fancy complex
setups and abstract requirements drawn up by CIOs, as development isnt being
done there. Concentrate on our developers today, and provide no-compromises
usability to those who contribute stuff.

If we dont help make the simplest (and most common) use-case convenient then
we are failing on a fundamental level.

> > 2) Have some reasonable symbolic identification for guests. For example a
> >    usable approach would be to have 'perf kvm list', which would list all
> >    currently active guests:
> >
> >      $ perf kvm list
> >        [1] Fedora
> >        [2] OpenSuse
> >        [3] Windows-XP
> >        [4] Windows-7
> >
> >    And from that point on 'perf kvm -g OpenSuse record' would do the obvious
> >    thing. Users will be able to just use the 'OpenSuse' symbolic name for
> >    that guest, even if the guest got restarted and switched its main PID.
>
> Does "perf kvm list" always run as root? What if two unprivileged users
> both have a VM named "Fedora"?

Again, the single-VM case is the most important case, by far. If you have
multiple VMs running and want to develop the kernel on multiple VMs (sounds
rather messy if you think it through ...), what would happen is similar to
what happens when we have two probes for example:

  # perf probe schedule
  Added new event:
    probe:schedule                           (on schedule+0)

  You can now use it on all perf tools, such as:

    perf record -e probe:schedule -a sleep 1

  # perf probe -f schedule
  Added new event:
    probe:schedule_1                         (on schedule+0)

  You can now use it on all perf tools, such as:

    perf record -e probe:schedule_1 -a sleep 1

  # perf probe -f schedule
  Added new event:
    probe:schedule_2                         (on schedule+0)

  You can now use it on all perf tools, such as:

    perf record -e probe:schedule_2 -a sleep 1

Something similar could be used for KVM/Qemu: whichever got created first is
named 'Fedora', the second is named 'Fedora-2'.

> If we look at the use-case, it's going to be something like, a user is
> creating virtual machines and wants to get performance information about
> them.
>
> Having to run a separate tool like perf is not going to be what they would
> expect they had to do. Instead, they would either use their existing GUI
> tool (like virt-manager) or they would use their management interface
> (either QMP or libvirt).
>
> The complexity of interaction is due to the fact that perf shouldn't be a
> stand alone tool. It should be a library or something with a programmatic
> interface that another tool can make use of.

But ... a GUI interface/integration is of course possible too, and it's being
worked on.

perf is mainly a kernel developer tool, and kernel developers generally dont
use GUIs to do their stuff: which is the (sole) reason why its first ~850
commits of tools/perf/ were done without a GUI. We go where our developers
are.

In any case it's not an excuse to have no proper command-line tooling. In fact
if you cannot get simpler, more atomic command-line tooling right then you'll
probably doubly suck at doing a GUI as well.

    Ingo
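
The naming rule proposed in the message above (the guest created first keeps
'Fedora', the next one becomes 'Fedora-2') could be sketched roughly as
follows. This is only an illustration of the rule as stated in the thread:
`assign_guest_names` is a hypothetical helper, not part of perf, KVM or Qemu,
and the continuation to 'Fedora-3' for a third guest is an assumption.

```python
def assign_guest_names(requested_names):
    """Give each guest a unique symbolic name, in creation order.

    Hypothetical helper: the first guest asking for a name keeps it, later
    guests asking for the same name get a numeric suffix starting at 2.
    """
    taken = set()
    assigned = []
    for name in requested_names:
        candidate = name
        suffix = 2
        # Keep bumping the suffix until the candidate is not in use.
        while candidate in taken:
            candidate = f"{name}-{suffix}"
            suffix += 1
        taken.add(candidate)
        assigned.append(candidate)
    return assigned


if __name__ == "__main__":
    print(assign_guest_names(["Fedora", "OpenSuse", "Fedora", "Fedora"]))
```

Note that this only resolves name clashes; it does not address the question
raised in the quoted text of which user is allowed to see which guest.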

From: Ingo Molnar on 16 Mar 2010 14:00

* Anthony Liguori <aliguori(a)linux.vnet.ibm.com> wrote:

> On 03/16/2010 10:52 AM, Ingo Molnar wrote:
> >You are quite mistaken: KVM isnt really a 'random unprivileged application' in
> >this context, it is clearly an extension of system/kernel services.
> >
> >( Which can be seen from the simple fact that what started the discussion was
> > 'how do we get /proc/kallsyms from the guest'. I.e. an extension of the
> > existing host-space /proc/kallsyms was desired. )
>
> Random tools (like perf) should not be able to do what you describe. It's a
> security nightmare.

A security nightmare exactly how? Mind to go into details as i dont understand
your point.

> If it's desirable to have /proc/kallsyms available, we can expose an
> interface in QEMU to provide that. That can then be plumbed through libvirt
> and QMP.
>
> Then a management tool can use libvirt or QMP to obtain that information and
> interact with the kernel appropriately.
>
> > In that sense the most natural 'extension' would be the solution i
> > mentioned a week or two ago: to have a (read only) mount of all guest
> > filesystems, plus a channel for profiling/tracing data. That would make
> > symbol parsing easier and it's what extends the existing 'host space'
> > abstraction in the most natural way.
> >
> > ( It doesnt even have to be done via the kernel - Qemu could implement that
> >   via FUSE for example. )
>
> No way. The guest has sensitive data and exposing it widely on the host is
> a bad thing to do. [...]

Firstly, you are putting words into my mouth, as i said nothing about
'exposing it widely'. I suggest exposing it under the privileges of whoever
has access to the guest image.

Secondly, regarding confidentiality, and this is guest security 101: whoever
can access the image on the host _already_ has access to all the guest data!

A Linux image can generally be loopback mounted straight away:

  losetup -o 32256 /dev/loop0 ./guest-image.img
  mount -o ro /dev/loop0 /mnt-guest

(Or, if you are an unprivileged user who cannot mount, it can be read via ext2
tools.)

There's nothing the guest can do about that. The host is in total control of
guest image data for heaven's sake!

All i'm suggesting is to make what is already possible more convenient.

    Ingo

From: Anthony Liguori on 16 Mar 2010 14:10

On 03/16/2010 12:52 PM, Ingo Molnar wrote:
> * Anthony Liguori<aliguori(a)linux.vnet.ibm.com> wrote:
>
>> On 03/16/2010 10:52 AM, Ingo Molnar wrote:
>>> You are quite mistaken: KVM isnt really a 'random unprivileged application' in
>>> this context, it is clearly an extension of system/kernel services.
>>>
>>> ( Which can be seen from the simple fact that what started the discussion was
>>> 'how do we get /proc/kallsyms from the guest'. I.e. an extension of the
>>> existing host-space /proc/kallsyms was desired. )
>>
>> Random tools (like perf) should not be able to do what you describe. It's a
>> security nightmare.
>
> A security nightmare exactly how? Mind to go into details as i dont understand
> your point.

Assume you're using SELinux to implement mandatory access control. How do you
label this file system?

Generally speaking, we don't know the difference between /proc/kallsyms vs.
/dev/mem if we do generic passthrough. While it might be safe to have a
relaxed label of kallsyms (since it's read only), it's clearly not safe to do
that for /dev/mem, /etc/shadow, or any file containing sensitive information.

Rather, we ought to expose a higher level interface that we have more
confidence in with respect to understanding the ramifications of exposing that
guest data.

>> No way. The guest has sensitive data and exposing it widely on the host is
>> a bad thing to do. [...]
>
> Firstly, you are putting words into my mouth, as i said nothing about
> 'exposing it widely'. I suggest exposing it under the privileges of whoever
> has access to the guest image.

That doesn't work as nicely with SELinux.

It's completely reasonable to have a user that can interact in a read only
mode with a VM via libvirt but cannot read the guest's disk images or the
guest's memory contents.

> Secondly, regarding confidentiality, and this is guest security 101: whoever
> can access the image on the host _already_ has access to all the guest data!
>
> A Linux image can generally be loopback mounted straight away:
>
>   losetup -o 32256 /dev/loop0 ./guest-image.img
>   mount -o ro /dev/loop0 /mnt-guest
>
> (Or, if you are an unprivileged user who cannot mount, it can be read via ext2
> tools.)
>
> There's nothing the guest can do about that. The host is in total control of
> guest image data for heaven's sake!

It's not that simple in a MAC environment.

Regards,

Anthony Liguori

> All i'm suggesting is to make what is already possible more convenient.
>
> Ingo

From: Ingo Molnar on 16 Mar 2010 14:30

* Anthony Liguori <aliguori(a)linux.vnet.ibm.com> wrote:

> On 03/16/2010 12:52 PM, Ingo Molnar wrote:
> >* Anthony Liguori<aliguori(a)linux.vnet.ibm.com> wrote:
> >
> >>On 03/16/2010 10:52 AM, Ingo Molnar wrote:
> >>>You are quite mistaken: KVM isnt really a 'random unprivileged application' in
> >>>this context, it is clearly an extension of system/kernel services.
> >>>
> >>>( Which can be seen from the simple fact that what started the discussion was
> >>> 'how do we get /proc/kallsyms from the guest'. I.e. an extension of the
> >>> existing host-space /proc/kallsyms was desired. )
> >>Random tools (like perf) should not be able to do what you describe. It's a
> >>security nightmare.
> >A security nightmare exactly how? Mind to go into details as i dont understand
> >your point.
>
> Assume you're using SELinux to implement mandatory access control.
> How do you label this file system?
>
> Generally speaking, we don't know the difference between /proc/kallsyms vs.
> /dev/mem if we do generic passthrough. While it might be safe to have a
> relaxed label of kallsyms (since it's read only), it's clearly not safe to
> do that for /dev/mem, /etc/shadow, or any file containing sensitive
> information.

What's your _point_? Please outline a threat model, a vector of attack,
_anything_ that substantiates your "it's a security nightmare" claim.

> Rather, we ought to expose a higher level interface that we have more
> confidence in with respect to understanding the ramifications of exposing
> that guest data.

Exactly, we want something that has a flexible namespace and works well with
Linux tools in general. Preferably that namespace should be human readable,
and it should be hierarchic, and it should have a well-known permission model.

This concept exists in Linux and is generally called a 'filesystem'.

> >> No way. The guest has sensitive data and exposing it widely on the host
> >> is a bad thing to do. [...]
> >
> > Firstly, you are putting words into my mouth, as i said nothing about
> > 'exposing it widely'. I suggest exposing it under the privileges of
> > whoever has access to the guest image.
>
> That doesn't work as nicely with SELinux.
>
> It's completely reasonable to have a user that can interact in a read only
> mode with a VM via libvirt but cannot read the guest's disk images or the
> guest's memory contents.

If a user cannot read the image file then the user has no access to its
contents via other namespaces either. That is, of course, a basic security
aspect.

( That is perfectly true with a non-SELinux Unix permission model as well, and
  is true in the SELinux case as well. )

> > Secondly, regarding confidentiality, and this is guest security 101: whoever
> > can access the image on the host _already_ has access to all the guest data!
> >
> > A Linux image can generally be loopback mounted straight away:
> >
> >   losetup -o 32256 /dev/loop0 ./guest-image.img
> >   mount -o ro /dev/loop0 /mnt-guest
> >
> >(Or, if you are an unprivileged user who cannot mount, it can be read via ext2
> >tools.)
> >
> > There's nothing the guest can do about that. The host is in total control of
> > guest image data for heaven's sake!
>
> It's not that simple in a MAC environment.

Erm. Please explain to me, what exactly is 'not that simple' in a MAC
environment?

Also, i'd like to note that the 'restrictive SELinux setups' usecases are
pretty secondary.

To demonstrate that, i'd like every KVM developer on this list who reads this
mail and who has their home development system where they produce their
patches set up in a restrictive MAC environment, in that you cannot even read
the images you are using, to chime in with a "I'm doing that" reply.

If there's just a _single_ KVM developer amongst dozens and dozens of
developers on this list who develops in an environment like that i'd be
surprised. That result should pretty much tell you where the weight of
instrumentation focus should lie - and it isnt on restrictive MAC environments
....

    Ingo