From: Eric Paris on 18 Sep 2009 17:11

On Thu, 2009-09-17 at 22:07 +0200, Andreas Gruenbacher wrote:
> From my point of view, "global" events make no sense, and fanotify listeners should register which directories they are interested in (e.g., include "/", exclude "/proc"). This takes care of chroots and namespaces as well.

While I completely agree that most users don't want global events, the antimalware vendors who today unprotect and hack the syscall table on their unsuspecting customers' machines to intercept every read, write, open, close, mmap, etc. syscall want EXACTLY that. They have been asking for a way to get this information for quite some time now. The largest vendors in this market have agreed that the interface (well, when it was a socket interface that I talked about for so long) should meet their needs.

Subtree watching / isn't any different or better, just harder and more complex to implement. You still have to exclude /proc and /sys and everything else, just like one must with a global listener.

Still though, this sounds like an issue for the f_type and f_fsid exclusion syscall I say I'm still not settled on, not an issue with the basis of fanotify or with the 3 proposed syscalls.

Jamie, do you see a problem with what I have been asking for review on, or see a problem with extending it moving forward?

Linus, do you see the value of 'yet another notification scheme'?

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Andreas Gruenbacher on 18 Sep 2009 18:10

On Friday, 18 September 2009 22:52:08 Eric Paris wrote:
> On Thu, 2009-09-17 at 22:07 +0200, Andreas Gruenbacher wrote:
> > From my point of view, "global" events make no sense, and fanotify listeners should register which directories they are interested in (e.g., include "/", exclude "/proc"). This takes care of chroots and namespaces as well.
>
> While I completely agree that most users don't want global events, the antimalware vendors who today unprotect and hack the syscall table on their unsuspecting customers' machines to intercept every read, write, open, close, mmap, etc. syscall want EXACTLY that.

I understand that "global" is what those guys get today for lack of a reasonable mechanism, but it's not what anybody can be given by fanotify: it conflicts with filesystem namespaces.

Consider running several "virtual machines" in separate namespaces on the same kernel. With "global" you are forced to run the same global fanotify listeners everywhere; with per-mount-point listeners, you can choose between "global" and something more fine-grained by identifying which vfsmounts you are interested in. (Filesystem namespaces correspond to vfsmount hierarchies.)

> [...] You still have to exclude /proc and /sys and everything else.

Those are mount points, and so are convenient to handle with a per-mount-point mechanism. No additional kernel code needed.

> [...] Still though, this sounds like an issue for the f_type and f_fsid exclusion syscall I say I'm still not settled on.

Those are also obsolete with a per-mount-point mechanism.

Thanks,
Andreas
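The include "/", exclude "/proc" selection described above can be modelled in userspace as a longest-prefix lookup over mount points. The following is only a sketch of that selection rule under stated assumptions: it is not kernel code and not part of any fanotify interface proposed in this thread, `struct mount_rule` and `path_is_watched` are hypothetical names, and a real per-vfsmount mechanism would match on the mount itself rather than on path strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: a mount point and whether a listener watches it. */
struct mount_rule {
    const char *mount_point; /* absolute path, no trailing slash except "/" */
    bool watch;
};

/* Length of mount_point if it is a whole-component prefix of path, else 0. */
static size_t prefix_len(const char *mount_point, const char *path)
{
    size_t n = strlen(mount_point);

    if (strncmp(mount_point, path, n) != 0)
        return 0;
    /* "/" is a prefix of every absolute path. */
    if (n == 1 && mount_point[0] == '/')
        return 1;
    /* Otherwise the match must end on a component boundary. */
    if (path[n] == '\0' || path[n] == '/')
        return n;
    return 0;
}

/* The longest matching mount point decides, so "/proc" overrides "/". */
bool path_is_watched(const struct mount_rule *rules, size_t nrules,
                     const char *path)
{
    size_t best = 0;
    bool watched = false;

    assert(path != NULL && path[0] == '/');
    for (size_t i = 0; i < nrules; i++) {
        size_t n = prefix_len(rules[i].mount_point, path);
        if (n > best) {
            best = n;
            watched = rules[i].watch;
        }
    }
    return watched; /* false when no rule matches */
}
```

With the two rules `{"/", true}` and `{"/proc", false}`, a path under /home is watched and a path under /proc is not, with no separate exclusion mechanism beyond the list of mount points.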
From: Eric Paris on 18 Sep 2009 23:10

On Sat, 2009-09-19 at 00:00 +0200, Andreas Gruenbacher wrote:
> On Friday, 18 September 2009 22:52:08 Eric Paris wrote:
> > On Thu, 2009-09-17 at 22:07 +0200, Andreas Gruenbacher wrote:
> > > From my point of view, "global" events make no sense, and fanotify listeners should register which directories they are interested in (e.g., include "/", exclude "/proc"). This takes care of chroots and namespaces as well.
> >
> > While I completely agree that most users don't want global events, the antimalware vendors who today unprotect and hack the syscall table on their unsuspecting customers' machines to intercept every read, write, open, close, mmap, etc. syscall want EXACTLY that.
>
> I understand that "global" is what those guys get today for lack of a reasonable mechanism, but it's not what anybody can be given by fanotify: it conflicts with filesystem namespaces.
>
> Consider running several "virtual machines" in separate namespaces on the same kernel. With "global" you are forced to run the same global fanotify listeners everywhere; with per-mount-point listeners, you can choose between "global" and something more fine-grained by identifying which vfsmounts you are interested in. (Filesystem namespaces correspond to vfsmount hierarchies.)

Let me start by saying I am agreeing I should pursue subtree notification. It's what I think everyone really wants. It's a great idea, and I think you might have a simple way to get close. Clearly these are avenues I'm willing and hoping to pursue. Also I say it again, I believe the interface as proposed (except maybe some of my exclusion stuff) is flexible enough to implement any of these ideas. Does anyone disagree?

BUT to solve one of the main problems fanotify is intending to solve it needs a way to be the 'fscking all notifier.' It needs to be the whole damn system.
I totally agree that what I have in my tree today (yet unposted), which restricts global notification to CAP_SYS_ADMIN, is highly inadequate. If any root task in any namespace could easily hop on out of its namespace using fanotify, that's a problem. No arguments with me.

But there must be a way for fanotify to globally get everything. That's one of the main points of fanotify. It needs to be a fscking all notifier, even of things in a completely detached namespace. AV vendors are going to get it. Their customers, our users, are going to load kernel modules that do horrible things. These are the realities of the world in which we live. Do we really throw tens or hundreds of thousands of our users under the bus because we don't like the software they are using on philosophical grounds?

I'm sure namespace people are calling me an idiot and telling me to stay in my namespace. I want to stay in my namespace for 'most' root users, but I need a way to get a global scanner. I want to know what the sanest way is. And for people who feel it's insane, just don't compile it in. I'll make global listeners a build option. But global listeners are an absolute requirement.

I was considering saying you needed CAP_SYS_ADMIN and you needed current->nsproxy->mnt_ns == the original init task's mnt_ns. Maybe this isn't a great way to determine if a task should be allowed to use global listeners. Is there a better way to restrict it?

Think about your web hosting company. They sell 'cheap' VMs to customers, each in a private namespace. The web hosting company wants to run an AV scanner that scans every file on the computer: their files, their customers' files, everything. Certainly we don't want the customer to break out of their namespace.

So, what is the sanest way, even if you hate the idea so much you compile it out, to let the hosting company get information about files in their customers' detached namespaces while not letting their customers get information about each other?
-Eric
From: Andreas Gruenbacher on 21 Sep 2009 16:10

On Saturday, 19 September 2009 5:04:31 Eric Paris wrote:
> Let me start by saying I am agreeing I should pursue subtree notification. It's what I think everyone really wants. It's a great idea, and I think you might have a simple way to get close. Clearly these are avenues I'm willing and hoping to pursue. Also I say it again, I believe the interface as proposed (except maybe some of my exclusion stuff) is flexible enough to implement any of these ideas. Does anyone disagree?

It does seem flexible enough. However, the current interface assumes "global" listeners (the mask argument of fanotify_init):

    int fanotify_init(int flags, int f_flags, __u64 mask,
                      unsigned int priority);

Once subtree support is added, this parameter becomes obsolete. That's pretty broken for a syscall yet to be introduced.

> BUT to solve one of the main problems fanotify is intending to solve it needs a way to be the 'fscking all notifier.' It needs to be the whole damn system.

Think of a system after boot, with a single global namespace. Whatever you access by filename is reachable from the namespace root. At this point, nothing more global exists. A listener can watch the mount points of interest, and everything's fine.

What's a bit more tricky is to ensure that this listener will continue to receive all events from whatever else is mounted anywhere, irrespective of namespaces. I think we can get there.

By the way, Documentation/filesystems/sharedsubtree.txt describes how filesystem namespaces work.

Thanks,
Andreas
From: Jamie Lokier on 21 Sep 2009 16:30
Andreas Gruenbacher wrote:
> On Saturday, 19 September 2009 5:04:31 Eric Paris wrote:
> > Let me start by saying I am agreeing I should pursue subtree notification. It's what I think everyone really wants. It's a great idea, and I think you might have a simple way to get close. Clearly these are avenues I'm willing and hoping to pursue. Also I say it again, I believe the interface as proposed (except maybe some of my exclusion stuff) is flexible enough to implement any of these ideas. Does anyone disagree?
>
> It does seem flexible enough. However, the current interface assumes "global" listeners (the mask argument of fanotify_init):
>
>     int fanotify_init(int flags, int f_flags, __u64 mask,
>                       unsigned int priority);
>
> Once subtree support is added, this parameter becomes obsolete. That's pretty broken for a syscall yet to be introduced.
>
> > BUT to solve one of the main problems fanotify is intending to solve it needs a way to be the 'fscking all notifier.' It needs to be the whole damn system.
>
> Think of a system after boot, with a single global namespace. Whatever you access by filename is reachable from the namespace root. At this point, nothing more global exists. A listener can watch the mount points of interest, and everything's fine.
>
> What's a bit more tricky is to ensure that this listener will continue to receive all events from whatever else is mounted anywhere, irrespective of namespaces. I think we can get there.

I think so too, and that'd be a great all-round solution.

We _have_ to receive mount & umount events to do this. But even inotify-style tracking needs those if it's to be accurate, so it's not an additional burden.

It would be logical if fanotify could block and ack those in the same way as it can block and ack other accesses (with the usual filtering rules on which inodes trigger events, and which don't or are cached).
As in, to prevent: mount --bind innocent .bash_login, but also to ensure it always knows what's mounted when another event occurs.

> By the way, Documentation/filesystems/sharedsubtree.txt describes how filesystem namespaces work.

Fortunately, after making a new namespace you can read the mounts in the new namespace from /proc/self/mount* (I think) without having to know anything about the shared subtree rules.

So to follow monitoring/checking across all namespaces, it would (I think) be enough to receive a fanotify "new namespace" event, and ack that event to allow the CLONE_NEWNS to proceed.

It's still tricky stuff though.

-- Jamie
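Reading the mounts of a namespace, as suggested above, comes down to parsing the fstab-like lines the kernel exposes in /proc/self/mounts (source, target, filesystem type, options, then two numeric fields). A minimal sketch of parsing one such line follows; `struct mount_entry` and `parse_mount_line` are hypothetical names chosen for illustration, the buffer sizes are arbitrary assumptions, and the octal escapes the kernel uses for spaces in paths are left undecoded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the first three fields of a mount line. */
struct mount_entry {
    char source[256]; /* device or pseudo-filesystem name */
    char target[256]; /* mount point */
    char fstype[64];  /* filesystem type, e.g. "proc" or "ext4" */
};

/*
 * Parse one line laid out as:
 *   source target fstype options dump pass
 * Only the first three fields are kept. Returns 0 on success and -1 if
 * the line does not contain at least three whitespace-separated fields.
 */
int parse_mount_line(const char *line, struct mount_entry *out)
{
    assert(line != NULL && out != NULL);
    if (sscanf(line, "%255s %255s %63s",
               out->source, out->target, out->fstype) != 3)
        return -1;
    return 0;
}
```

A listener could feed each line read from /proc/self/mounts through `parse_mount_line` and skip entries whose `fstype` is, say, "proc" or "sysfs". On glibc, getmntent(3) parses the same layout and is probably the better choice in real code.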