From: Eric Anholt on 5 Mar 2010 14:40

On Fri, 5 Mar 2010 12:21:29 +0000, Alan Cox <alan(a)lxorguk.ukuu.org.uk> wrote:
> Serious discussion point perhaps should be: is the libdrm so close to the
> kernel it ought to be in the same git tree ? Alternatively does it need
> to be easier to have multiple Nouveau libdrms autoselected according to
> the kernel side versioning. ELF library versioning is not rocket science
> and both the old and new libraries exist and can be installed so all the
> bits are present except for the wrapper to load the right sublibrary yes ?

That *would* make versioning impossible.

To make the difficulty of improving ABI at the moment concrete, I just got
done merging the patches for execbuf2 in userland and enabling i915 texture
tiling. This was a 3% performance win in one test I was looking at, and 1%
in another -- less than hoped, but important nonetheless (there are other
cases that should see 30% or so wins hopefully).

The patches got written back in July, and revved several times as they broke
various combinations of compatibility. At the point that I merged eb2 to the
kernel for 2.6.33, it wasn't *really* tested -- the userland side was broken
all to hell it looked like, but at least it wasn't regressing execbuf1 any
more, right? I spent this week getting userland working, including a new
libdrm release (already obsolete because a bug in the libdrm violated what
the ABI between libdrm <-> mesa was supposed to be).

So overall, I'd say that we spent about a month of developer time at least
between jbarnes, ickle, and myself, on extending the execbuf interface to
add a flag saying "dear kernel, please don't do this bit of work on this
buffer, because I don't need it and it makes things slow."

This is not that bad for Intel folks. We're paid to hack on it, and can
justify spending ridiculous amounts of time for small wins. I actually enjoy
this.
Right now all the userland -- whether it's Mesa, xf86-video-intel, libva,
cairo-drm, stand-alone DRM testcases, etc. -- gets to move to the new libdrm
API, declare its dependency in PKG_CHECK_MODULES, and hand that new flag to
libdrm as if the kernel supported the new interface. Inside libdrm, it looks
at the kernel version and uses the new interface or old as appropriate. The
ugly versioning stuff stays in one easy-to-review 5kloc component, and the
complicated 50kloc driver components get to pretend they have a fancy new
kernel.

But if libdrm's in the kernel, all those userland components no longer get
to rely on the version of libdrm, because distros will ship whatever's with
the kernel they're using and our userland does have to work on (relatively
recent) distros. Each of those userland components would have to grow a
compatibility layer to work with whatever kernel libdrm is available,
passing the flag in the new API or using the old API. Userland would be even
buggier for having to replicate all that logic everywhere, and we probably
wouldn't have execbuf2 landed yet.

Well, OK. What I'd really do instead is make the kernel libdrm be a thin
ioctl wrapper, and build a librealdrm that does what libdrm does today. But
I don't think that's what you were suggesting.
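The "one small component hides the versioning" arrangement described above can be sketched roughly as follows. This is only an illustrative model in Python, not libdrm code: the names `Submission`, `parse_kernel_version`, `submit` and `skip_extra_work` are made up for the example, and the 2.6.33 threshold is taken from the statement above that execbuf2 was merged for that kernel.

```python
from typing import NamedTuple, Tuple

# Assumption for illustration: execbuf2 is available from 2.6.33 onwards,
# as stated in the message above.
EXECBUF2_MIN_KERNEL = (2, 6, 33)


class Submission(NamedTuple):
    interface: str          # "execbuf" (old) or "execbuf2" (new)
    skip_extra_work: bool   # the hypothetical "please don't do this work" flag


def parse_kernel_version(release: str) -> Tuple[int, ...]:
    """Turn a release string such as '2.6.34-rc1' into (2, 6, 34)."""
    core = release.split("-", 1)[0]
    return tuple(int(part) for part in core.split(".")[:3])


def submit(release: str, skip_extra_work: bool) -> Submission:
    """Callers always use the new API; the shim picks the kernel interface."""
    if parse_kernel_version(release) >= EXECBUF2_MIN_KERNEL:
        return Submission("execbuf2", skip_extra_work)
    # Old kernels cannot honour the flag, so it is silently dropped.
    return Submission("execbuf", False)
```

Every driver component calls `submit()` the same way regardless of the running kernel; only this one function knows that two interfaces exist.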
From: Corbin Simpson on 5 Mar 2010 14:50

On Fri, Mar 5, 2010 at 8:46 AM, <tytso(a)mit.edu> wrote:
> On Fri, Mar 05, 2010 at 06:04:34PM +0200, Daniel Stone wrote:
>>
>> So you're saying that there's no way to develop any reasonable body of
>> code for the Linux kernel without committing to keeping your ABI
>> absolutely rock-solid stable for eternity, no exceptions, ever? Cool,
>> that worked really well for Xlib.
>
> No, that's not what people are saying. What people are saying is,
> "avoid flag days". Deprecate things over a 6-12 month time period.
> We have lots of really good interfaces for doing that.
>
> You say you don't want to do that? Then keep it to yourself and
> don't get it dropped into popular distributions like Fedora or Ubuntu.
> You want a larger pool of testers? Great! The price you need to pay
> for that is to be able to do some kind of ABI versioning so that
> you don't have "drop dead flag days".
>
> If you don't want to be a good citizen, then be prepared to have people
> call you out for, well, not being a good OSS citizen.

I was trying my hardest to not say anything, but...

Nouveau isn't an official Xorg project. It hasn't been added to the jhbuild
list for auto-checkout, it doesn't get tinderbox time (admittedly a function
of being part of the jhbuild) and I don't think it's on the katamari list,
so it's never been shipped as part of an Xorg release. It is only in
mainline under the staging rules; drivers come and go from staging under
fairly lax rules.

Fedora ships this stuff because they're actively developing it and enjoy
deploying half-broken things to users in the vain hope that it magically
won't break. I can't count the number of kittens eaten by Fedora systems
I've used. (It is kind of sad that Fedora's still the best distro about not
deploying broken stuff but still remaining up-to-date.) Tellingly, it
doesn't look like this interface change has been deployed to stable Fedora,
just Rawhide.
The Ubuntu people don't talk to us as much as they should. Seeing how badly
they biffed Radeon and Intel KMS deployment, it's hard for me to believe
that deploying Nouveau went smoothly. I don't have much more personal
experience; my work computer has an HD 3450 in it now instead of the old
GeForce, and that's my only Ubuntu box.

If distros want to run weird experiments on their users, let them! Sure,
sometimes bad things happen, but sometimes good things happen too.
ConsoleKit, DeviceKit, HAL, NetworkManager, KMS, yaird, dracut, Plymouth,
the list goes on and on. If the problem here is actually that a distro is
deploying a staging driver and picking up the pieces themselves, then just
say it. This whole thing with flag days, deprecation, interface changes,
etc. hinges on the idea that the code being deprecated was stable, usable,
and widely deployed, but it wasn't and isn't.

That said...

Code probably is moving too fast inside nouveau. There is a bit of a wall to
go through to get new patches upstream, which one would hope would inspire
some developer restraint. intel and radeon both still have most (if not all)
of the legacy code needed by ancient userspaces, and both DDX drivers are
doing multiple-branch releases to keep old userspace interfaces alive for
people unable to update their kernels.

It might be useful for the nouveau guys to really seriously consider code
before it leaves their trees and enters mainline; writing code that you
won't commit to is quite lame for the obvious reasons, but also for some
unobvious reasons, e.g. it makes you look like you don't actually know what
you're doing and would rather just keep reinventing wheels without
justifying and testing your design choices. (This is also why I was not
exactly pleased with the suggestion of retooling all of the r600 userspace
over a change to the CS system; we just spent the better part of a year
moving everything over to CS!)

~ C.
--
Only fools are easily impressed by what is only barely beyond their reach. ~ Unknown

Corbin Simpson
<MostAwesomeDude(a)gmail.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Felipe Contreras on 5 Mar 2010 15:40

On Fri, Mar 5, 2010 at 2:41 AM, Linus Torvalds <torvalds(a)linux-foundation.org> wrote:
> On Fri, 5 Mar 2010, Ben Skeggs wrote:
>> The F13 packages *will* work, so long as you're not bisecting back and
>> forth.
>
> How do I install just the F13 libdrm thing, without changing everything
> else? I'm willing to try. We can make it part of the 2.6.34 release notes.
>
> And if we end up having people bisecting back and forth, I will hate that
> f*cking nouveau driver even more.

I believe Dave has already explained this to you, but nobody has mentioned
it here.

What you are supposed to do is install the new nouveau driver, which
requires a new libdrm. So, just compile both libdrm, and nouveau, to a
sandbox, say /opt/new-nouveau, and then in /etc/X11/xorg.conf:

    Section "Files"
        ModulePath "/opt/new-nouveau/lib/xorg/modules"
        ModulePath "/usr/lib/xorg/modules"
    EndSection

That should do it. No frankensteinian F13 packaging stuff, and no mess with
the system's /usr/lib/.

Cheers.

--
Felipe Contreras
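For anyone scripting the sandbox setup, the "Files" section above can be generated rather than typed. The sketch below is only a convenience illustration; `sandbox_module_paths` and `render_files_section` are hypothetical helper names, and it assumes the sandbox installs its X modules under `<prefix>/lib/xorg/modules` as in the example. The sandbox directory is listed first so that it is searched before the system one.

```python
def sandbox_module_paths(prefix, system_dir="/usr/lib/xorg/modules"):
    """Module search order: sandbox build first, system modules second."""
    return [f"{prefix}/lib/xorg/modules", system_dir]


def render_files_section(module_dirs):
    """Render an xorg.conf "Files" section with one ModulePath per directory."""
    lines = ['Section "Files"']
    lines += [f'    ModulePath "{d}"' for d in module_dirs]
    lines.append("EndSection")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_files_section(sandbox_module_paths("/opt/new-nouveau")))
```

Removing the sandbox later is then just a matter of deleting the prefix and the first ModulePath line.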
From: Luca Barbieri on 5 Mar 2010 15:40

> So overall, I'd say that we spent about a month of developer time
> at least between jbarnes, ickle, and myself, on extending the execbuf
> interface to add a flag saying "dear kernel, please don't do this bit of
> work on this buffer, because I don't need it and it makes things slow."

Perhaps then, we should break ABI compatibility _more_ often to speed up
development, but also have awesome mechanisms to make it painless for the
user.

Such as:
1. Automatic side by side userspace installation, as Linus proposed
2. Kernel "make install" refusing to proceed if it finds that userspace is
   not updated, and giving instructions on how to update userspace
3. Distributions packaging the new ABI X/Mesa drivers and libdrm even for
   stable distributions
4. Kernel "make install" offering to automatically install said
   distribution packages if it detects a supported distribution
5. Ability to drop new versions of drivers/gpu/drm in an older kernel tree
   and have it compile (within reasonable limits)

In particular, for people with (slightly) old kernels, it should be much
easier to make updated DRM trees still work with older kernels, than
attempting to make updated userspace work with older kernel modules. This
also actually gives them the benefits of the new code. And for people with
really old kernels, it's not different from any other hardware device, which
requires a kernel upgrade to have better support.

Then, for instance, Linus would just have seen the following upon running
make install:

    This kernel requires the Nouveau userspace version 0.0.16, which you
    don't have installed.

    Fedora 12 has been detected.
    Invoke yum to install the <rpmnames> RPMs required for it? [y/n]

_or_

    Ubuntu 9.10 has been detected
    Invoke apt-get to install the <debnames> packages required for it? [y/n]

If the user says no, or the distribution is unknown, instructions on how to
download and compile the source would be presented.
Once you set up this system, you can freely break the ABI with no
significant user discomfort by just raising the version number.

This also potentially applies to stuff other than DRM (e.g. perf, kvm,
iptables, udev, filesystem-specific tools/APIs, various device configuration
systems, and so on).

The really stable APIs/ABIs should not be the (undocumented) kernel ones,
but Xlib and OpenGL, which actually have formal specifications. Perhaps
eventually Gallium could join them as a stable API closer to the hardware,
but that's a long way off.
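The decision logic behind the proposed "make install" check could look something like the sketch below. No such mechanism exists in the kernel build system; `install_advice`, `parse_version` and the `PACKAGE_MANAGERS` table are hypothetical names, and the distribution-to-tool mapping is taken only from the example prompts in the message above.

```python
# Mapping taken from the example prompts above; anything else is "unknown".
PACKAGE_MANAGERS = {"Fedora": "yum", "Ubuntu": "apt-get"}


def parse_version(text):
    """Turn '0.0.16' into (0, 0, 16) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def install_advice(required, installed, distro=None):
    """Return None if userspace is new enough, otherwise the text to show.

    `installed` is None when no matching userspace was found at all.
    """
    if installed is not None and parse_version(installed) >= parse_version(required):
        return None
    msg = (f"This kernel requires the Nouveau userspace version {required}, "
           "which you don't have installed.")
    tool = PACKAGE_MANAGERS.get(distro)
    if tool:
        return f"{msg} Invoke {tool} to install the required packages? [y/n]"
    # Unknown distribution: fall back to build-from-source instructions.
    return f"{msg} See the instructions for downloading and compiling the source."
```

Raising the required version in the kernel tree is then the only change needed to trigger the prompt on a machine with older userspace.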
From: Felipe Contreras on 5 Mar 2010 16:00

On Fri, Mar 5, 2010 at 6:19 PM, Linus Torvalds <torvalds(a)linux-foundation.org> wrote:
> The thing I objected to, in the VERY BEGINNING in this thread, is the fact
> that the thing was done in such a way that it's basically impossible to
> support the old/new ABI at all!

[...]

> The way this was done, it's apparently basically impossible for the Fedora
> people to push out packages that support both the old and the new kernel.

The reason why the nouveau people wanted to leave the driver in staging is
because they wanted to leave open the option of reshuffling the API. The
Fedora guys integrated this stuff at their own risk, and Linux (because of
your pressure) also did. At no point in time did the nouveau guys agree to
freeze the API.

Now they have done precisely what was expected; there's no surprise there.
The surprise seems to be that you thought (for some reason) that reshuffling
the API wouldn't break the old (or current in F12) user-space code. Now, how
exactly do you think that could have been achieved?

Even if you have both nouveau_drv-0.0.15.so and nouveau_drv-0.0.16.so...
What piece of code would choose one rather than the other? There has never
been such a piece of code.

If there was no compatibility code for API re-shuffling, and API
re-shuffling was expected, the resulting breakage was doomed to happen.

Finally, at least it's possible to compile the radeon driver without support
for DRM, so perhaps nouveau (and other drivers) should check the kernel drm
version at run-time instead, and fall back to non-drm mode when the version
is not compatible. I think that's a sensible approach, although that might
require a considerable amount of code.

However, that's something to consider for the future, as your current
libdrm/nouveau is not prepared to handle the DRM API re-shuffle that _must_
happen.

Cheers.
--
Felipe Contreras
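The missing "piece of code that would choose one rather than the other", combined with the run-time fallback suggested above, might be sketched like this. It is purely illustrative: `select_driver` and the `DRIVERS_BY_ABI` table are hypothetical, and the pairing of file names with kernel-side version strings is an assumption made for the example, not something the drivers actually implement.

```python
# Hypothetical table: which installed driver build speaks which kernel-side
# DRM interface version. File names follow the ones mentioned above.
DRIVERS_BY_ABI = {
    "0.0.15": "nouveau_drv-0.0.15.so",
    "0.0.16": "nouveau_drv-0.0.16.so",
}


def select_driver(kernel_abi, installed=DRIVERS_BY_ABI):
    """Pick the driver build matching the running kernel, else go non-DRM.

    Returns a (mode, driver_file) pair; driver_file is None in non-DRM mode.
    """
    driver = installed.get(kernel_abi)
    if driver is not None:
        return ("drm", driver)
    # No build matches this kernel: degrade instead of failing outright.
    return ("non-drm", None)
```

With both builds installed side by side, bisecting across the interface change would switch drivers automatically; with only one installed, the mismatched kernel would get the non-DRM path rather than a broken X server.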