From: Stephen Hemminger on 3 Jun 2010 19:00

On Thu, 3 Jun 2010 15:33:23 -0700 (PDT) Linus Torvalds <torvalds(a)linux-foundation.org> wrote:

> On Thu, 3 Jun 2010, Linus Torvalds wrote:
> >
> > > So still a race that shows up with KVM (fast floppy?) and manifests
> > > as floppy_ready or reset_interrupt OOPS.
> >
> > Yes, it's quite possible that the Linux floppy driver is simply broken by
> > any floppy device that basically responds immediately to a command with an
> > interrupt. And considering how few people use floppies, I do expect that
> > driver to get _worse_ rather than better in the future.
>
> Having looked at that driver some more, I can in fact pretty much
> guarantee it. The locking is rather baroque. It has a "floppy_lock", but
> that only protects certain small parts. In particular, it looks like the
> irq handler and the timers do _not_ take it, and that's where most of the
> real work is done.
>
> And in fact, that does look broken. The interrupt handler really does a
> "schedule_work()" to schedule the actual handler outside of irq context,
> and I don't see any serialization between the timers that fire and the
> handler running.
>
> That driver used to be this state machine that ran entirely from interrupt
> context, where one interrupt handler would set the state for the next one
> (that's what the "do_floppy" thing is for). But then it became bottom
> halves, and now it's using schedule_work() instead - and at the same time,
> the _timers_ haven't really changed. Those run in timer context, and can
> thus interrupt the work thing.
>
> It always was a disgusting driver. Now it's just even more so. And yes,
> I'm sure it's full of races that are largely hidden by the fact that real
> floppy hardware is so slow that you can never hit them.
>
> Looking too much at that driver will cause PTSD. I have to look away.

Thank you for confirming my suspicions. Given the state of destruction
there, bug fixing is like playing Jenga.
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 3 Jun 2010 19:20

On Thu, 3 Jun 2010, Stephen Hemminger wrote:
>
> Thank you for confirming my suspicions. Given the state of destruction
> there, bug fixing is like playing Jenga.

I suspect it's fixable, but it would probably involve a lot of careful moving around of that "floppy_lock" spinlock. Add various asserts to make sure that it's held in all cases, and then for each warning you get, you add the proper spinlock until it's all properly protected.

The _original_ protection was just from irqs being atomic (UP, remember), and the block layer queueing happening from irq-safe context. You're still running it on UP, but we've even lost the irq-handler protection (and then later, the bottom-half mutual exclusion).

Linus
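The assert-then-lock procedure described in the message above can be sketched in userspace C. This is only an illustration under stated assumptions: `checked_lock`, `fdc_state`, `warn_unless_held()` and the two timeout functions are hypothetical names, and the "lock" is a flag that records ownership rather than a real spinlock. In the kernel, the corresponding checks would presumably be `assert_spin_locked()` or `lockdep_assert_held()` on the driver's "floppy_lock".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the driver's floppy_lock.
 * It only records whether the lock is held; it is not a real spinlock. */
struct checked_lock {
    bool held;
    unsigned int warnings; /* unlocked accesses caught so far */
};

static void lock_acquire(struct checked_lock *l)
{
    assert(!l->held); /* no recursive locking in this model */
    l->held = true;
}

static void lock_release(struct checked_lock *l)
{
    assert(l->held); /* releasing an unheld lock is a bug */
    l->held = false;
}

/* Warn instead of aborting, so one run reports every unprotected path. */
static void warn_unless_held(struct checked_lock *l, const char *where)
{
    if (!l->held) {
        l->warnings++;
        fprintf(stderr, "warning: %s touched state without the lock\n", where);
    }
}

/* Hypothetical shared driver state guarded by the lock. */
struct fdc_state {
    struct checked_lock lock;
    int pending_cmd;
};

/* Before the fix: a timer path that clears state with no locking. */
static void timeout_unprotected(struct fdc_state *s)
{
    warn_unless_held(&s->lock, "timeout_unprotected");
    s->pending_cmd = 0;
}

/* After the fix: same work, wrapped in the lock the warning asked for. */
static void timeout_protected(struct fdc_state *s)
{
    lock_acquire(&s->lock);
    warn_unless_held(&s->lock, "timeout_protected");
    s->pending_cmd = 0;
    lock_release(&s->lock);
}

/* Runs both paths and returns how many warnings were raised. */
static unsigned int run_lock_audit_demo(void)
{
    struct fdc_state s = {
        .lock = { .held = false, .warnings = 0 },
        .pending_cmd = 1,
    };

    timeout_unprotected(&s); /* raises one warning */
    s.pending_cmd = 1;
    timeout_protected(&s);   /* raises none */
    return s.lock.warnings;
}
```

Calling `run_lock_audit_demo()` yields one warning, from the unprotected path. Warning rather than aborting is a deliberate choice here: a single run can then list every path that still needs the lock.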
From: Stephen Hemminger on 3 Jun 2010 19:20

On Thu, 3 Jun 2010 15:33:23 -0700 (PDT) Linus Torvalds <torvalds(a)linux-foundation.org> wrote:

> On Thu, 3 Jun 2010, Linus Torvalds wrote:
> >
> > > So still a race that shows up with KVM (fast floppy?) and manifests
> > > as floppy_ready or reset_interrupt OOPS.
> >
> > Yes, it's quite possible that the Linux floppy driver is simply broken by
> > any floppy device that basically responds immediately to a command with an
> > interrupt. And considering how few people use floppies, I do expect that
> > driver to get _worse_ rather than better in the future.
>
> Having looked at that driver some more, I can in fact pretty much
> guarantee it. The locking is rather baroque. It has a "floppy_lock", but
> that only protects certain small parts. In particular, it looks like the
> irq handler and the timers do _not_ take it, and that's where most of the
> real work is done.
>
> And in fact, that does look broken. The interrupt handler really does a
> "schedule_work()" to schedule the actual handler outside of irq context,
> and I don't see any serialization between the timers that fire and the
> handler running.
>
> That driver used to be this state machine that ran entirely from interrupt
> context, where one interrupt handler would set the state for the next one
> (that's what the "do_floppy" thing is for). But then it became bottom
> halves, and now it's using schedule_work() instead - and at the same time,
> the _timers_ haven't really changed. Those run in timer context, and can
> thus interrupt the work thing.
>
> It always was a disgusting driver. Now it's just even more so. And yes,
> I'm sure it's full of races that are largely hidden by the fact that real
> floppy hardware is so slow that you can never hit them.
>
> Looking too much at that driver will cause PTSD. I have to look away.
>
> Linus

Maybe putting all back together in a threaded_irq would be safest.
From: Linus Torvalds on 3 Jun 2010 19:30

On Thu, 3 Jun 2010, Stephen Hemminger wrote:
>
> Maybe putting all back together in a threaded_irq would be safest.

Yes. That floppy driver could easily be a good case for using those threaded irq's.

The problem, of course, is to find somebody motivated enough. The code-base really is pretty dang ugly, and it might be hard to do it incrementally, I think.

(And starting from scratch is likely not a great idea either - while _some_ of the ugliness comes from the odd irq-driven state machine code, a lot of it also comes from trying to handle all those floppy formats etc)

Linus
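The idea behind moving everything into one threaded handler is that every event source posts to a single context, and only that context advances the state machine. The following is a minimal, single-threaded userspace model of that serialization, not kernel code: it does not call `request_threaded_irq()`, and `event_queue`, `fdc_event`, `fdc_phase` and `handle_event()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event sources: the completion interrupt and the timeout. */
enum fdc_event { EV_IRQ, EV_TIMEOUT };

/* Hypothetical phases of one command in the state machine. */
enum fdc_phase { PH_IDLE, PH_SEEKING, PH_DONE, PH_FAILED };

#define QUEUE_CAP 8

/* Fixed-size FIFO that both event sources post into. */
struct event_queue {
    enum fdc_event buf[QUEUE_CAP];
    size_t head, tail, count;
};

static int queue_post(struct event_queue *q, enum fdc_event ev)
{
    if (q->count == QUEUE_CAP)
        return -1; /* queue full */
    q->buf[q->tail] = ev;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    return 0;
}

static int queue_take(struct event_queue *q, enum fdc_event *ev)
{
    assert(q->count <= QUEUE_CAP);
    if (q->count == 0)
        return -1; /* queue empty */
    *ev = q->buf[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return 0;
}

/* The only function that advances the phase. Events that arrive after
 * the command has finished are stale and are ignored. */
static enum fdc_phase handle_event(enum fdc_phase phase, enum fdc_event ev)
{
    if (phase != PH_SEEKING)
        return phase;
    return ev == EV_IRQ ? PH_DONE : PH_FAILED;
}

/* Models a "fast floppy": the interrupt is posted before the timeout. */
static enum fdc_phase run_serialized_demo(void)
{
    struct event_queue q = { .head = 0, .tail = 0, .count = 0 };
    enum fdc_phase phase = PH_SEEKING;
    enum fdc_event ev;

    if (queue_post(&q, EV_IRQ) != 0 || queue_post(&q, EV_TIMEOUT) != 0)
        return PH_FAILED;

    /* Single consumer: events are handled one at a time, in order. */
    while (queue_take(&q, &ev) == 0)
        phase = handle_event(phase, ev);

    return phase;
}
```

Here `run_serialized_demo()` ends in `PH_DONE`: the late timeout is seen only after the interrupt has completed the command, so it is dropped instead of racing with the completion path.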
From: Nick Bowler on 4 Jun 2010 11:20

On 15:40 Thu 03 Jun, Linus Torvalds wrote:
> Although one comment says it all:
>
> Cons: I ordered 5. After 45 days 3 of them have failed. Too late to return.
>
> so apparently you do need to order a lot of them to keep them going ;)

I actually still have a real floppy drive in my primary desktop. Bought it new in 2001, and it still worked when I used it (once!) in fall 2008.

That being said, it would have been quite frustrating if Linux oopsed when I tried to use this piece of hardware. It was frustrating enough to even find a single disk to put in it :).

--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)