From: Andi Kleen on 17 Feb 2010 05:40

On Wed, Feb 17, 2010 at 01:16:48PM +0300, Nikita V. Youshchenko wrote:
> > "Nikita V. Youshchenko" <yoush(a)cs.msu.su> writes:
> > > I'm developing a device driver that, in its ioctl()s, accepts a
> > > complex data structure. Before doing its operation, it performs large
> > > number of checks if data is valid. If one of those checks fail, driver
> > > returns -EINVAL.
> > >
> > > Unfortunately this -EINVAL is not really useful. E.g. if a developer,
> > > sitting in his IDE and debugging his code, will see ioctl()
> > > returning -EINVAL, and will have hard times finding what exactly is
> > > wrong.
> > >
> > > Before inventing driver-specific extended error reporting, I'd like to
> > > ask if there is anything more or less generic for this.
> > > I believe situation when -Exxx is too weak interface for error
> > > reporting is common.
> >
> > This is a very common problem in Linux unfortunately. I always
> > describe that as the "ed approach to error handling". Instead
> > of giving an error message you just give ?. Just ? happens
> > to be EINVAL in Linux.
> >
> > My favourite example of this is the configuration of the networking
> > queueing disciplines, which configure complicated data structures and
> > algorithms and in many cases have tens of different error conditions
> > based on the input parameters -- and they all just report EINVAL.
> >
> > The standard way (standard kludge or standard workaround would be a
> > better description) is to use printk; often guarded by a special
> > kernel tunable or ifdef to avoid flooding the log in the normal case.
> >
> > IMHO it would be best to simply add a way to return strings directly
> > in this case (a la plan9). This would be probably not too hard to
> > implement. It's not there unfortunately.
> >
> > This could be done with one of the message oriented protocols,
> > e.g. netlink or read/write on a special minor.
>
> Why not create a generic solution for this, if one does not exist yet?

Someone would need to do it. Yes I think it would be a worthy project.

The trick is also to get around the objections of the "but we always
did it this way" Unix traditionalists.

> For example, have a "last error" string associated with task_struct, that:
> - will clean on each syscall entry,
> - while syscall is running, may be filled with printf-style routines,
> - may be accessible from userspace with additional syscall [that obviously
>   should not reset error]?
>
> This will give driver writers a common interface for extended error
> reporting...

You would need a way to save/restore that string too (like it works
with errno) otherwise libraries cannot use it safely.

Also it would be good to have something that does not impact
the system call fast path for a non error call.

From the basic semantics I think I would prefer a way associated
with each syscall. It could be probably fit into many syscall ABIs,
but that would need architecture specific changes, which
are difficult to coordinate (Linux has too many architectures
and many of them with inactive maintainers).

One way to do that would be an "extended ioctl" syscall that
supports this in a generic way (and perhaps could fix
some of the other problems of ioctl too, like better type safety).

Designing such a thing might end up being a rat-hole
(and you would probably need to be very careful to avoid
the second system effect).

Of course the qdiscs and other code that uses netlink instead
would also need something equivalent.

Also I expect someone would come up with localization issues,
although the classical "translation database" approach
would probably work anyways.

-Andi

-- 
ak(a)linux.intel.com -- Speaking for myself only.
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
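The "last error" string proposed above, together with the save/restore requirement Andi raises, can be sketched in userspace C. This is only an illustration under stated assumptions: a thread-local buffer stands in for a field in task_struct, and every name here (LAST_ERR_MAX, last_err_clear, last_err_printf, last_err_get, last_err_save, last_err_restore, fake_ioctl) is hypothetical, not an existing kernel or libc interface.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LAST_ERR_MAX 128 /* hypothetical buffer size */

/* Hypothetical per-thread error text; stands in for a task_struct field. */
static _Thread_local char last_err[LAST_ERR_MAX];

/* Would run on entry to every call, so stale text never leaks through. */
static void last_err_clear(void)
{
    last_err[0] = '\0';
}

/* printf-style fill, truncating silently if the text is too long. */
static void last_err_printf(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(last_err, sizeof(last_err), fmt, ap);
    va_end(ap);
}

/* Read accessor; reading does not reset the text. */
static const char *last_err_get(void)
{
    return last_err;
}

/* Save/restore pair so a library can make its own calls in between. */
struct last_err_saved {
    char text[LAST_ERR_MAX];
};

static void last_err_save(struct last_err_saved *s)
{
    memcpy(s->text, last_err, sizeof(last_err));
}

static void last_err_restore(const struct last_err_saved *s)
{
    memcpy(last_err, s->text, sizeof(last_err));
}

/* Simulated handler: clears on entry, explains itself on failure. */
static int fake_ioctl(int wombats)
{
    last_err_clear();
    if (wombats < 5) {
        last_err_printf("wombats=%d is below the minimum of 5", wombats);
        return -EINVAL;
    }
    return 0;
}
```

A caller would check the return value as usual and only consult last_err_get() after a failure. A library that needs to make further calls before reporting would wrap them in last_err_save()/last_err_restore(), in the same way code preserves errno today.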
From: Alan Cox on 17 Feb 2010 05:50

> For example, have a "last error" string associated with task_struct, that:
> - will clean on each syscall entry,
> - while syscall is running, may be filled with printf-style routines,
> - may be accessible from userspace with additional syscall [that obviously
>   should not reset error]?
>
> This will give driver writers a common interface for extended error
> reporting...

That's probably overkill. For almost any ioctl type interface the only
thing you *need* to make more sense is the address of the field that was
deemed invalid.

So in your ioctl handler you'd do something like

    get_user(v, &foo->wombats);
    if (v < 5) {
        error_addr(&foo->wombats);
        return -EINVAL;
    }

Returning text is all very well, and printk can help debug, but neither
actually helps application code or particularly helps interpreters to dig
into the detail and act themselves to fix a problem or understand it. It
also costs material amounts of unswappable memory and also disk storage
for the kernel image on embedded devices.

Two other problems text returns bring up are ambiguity and translations -
it's almost impossible to keep them unique even within a big module. It's
also possible to get things like typos in the returned text or
mis-spellings that you then can't fix because some other app now has

    if (strcmp(returned_err, "No such wombat evalueted")==0) {
        ...
    }

in it. (HTTP 'referer' being a dark warning from history ...)

A lot of other systems keep message catalogues often indexed by
module:error. Text lookups in userspace (easy to do with existing
interfaces), and the OS providing generic, specific, and identifying
module info.

I guess the Linux extension to that would end up as

    extended_error(&foo->wombats, E_NOT_A_VALID_BREEDING_POPULATION);

and internally expand to include THIS_MODULE and extract the module name.

There's another related problem here too - Unix style errors lack the
ability of some OS systems to report "It worked but ....." which leads to
interface oddities like termios where it reports "Ok" but you have to
inspect the returned structure to see if you got what you requested.

Doesn't look too hard to add some of this or something similar as you
suggest and while it would take a long time to get coverage you have to
start somewhere.

Alan
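Alan's idea of reporting which field was rejected, rather than a text message, can be sketched in userspace C. The sketch records the field's offset within the argument structure as a stand-in for the user-space address passed to error_addr(), on the assumption that the caller knows the base address it passed in. The names foo_args, burrows, bad_field_off, error_off, foo_validate and foo_field_name are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ioctl argument structure. */
struct foo_args {
    int wombats;
    int burrows;
};

/* Offset of the field that failed validation, or -1 if none did. */
static _Thread_local ptrdiff_t bad_field_off = -1;

/* Stand-in for error_addr(): remember which field was rejected. */
static void error_off(size_t off)
{
    bad_field_off = (ptrdiff_t)off;
}

/* Validate each field and record the first one that fails. */
static int foo_validate(const struct foo_args *a)
{
    bad_field_off = -1;
    if (a->wombats < 5) {
        error_off(offsetof(struct foo_args, wombats));
        return -EINVAL;
    }
    if (a->burrows < 1) {
        error_off(offsetof(struct foo_args, burrows));
        return -EINVAL;
    }
    return 0;
}

/* Caller side: map the recorded offset back to a field name. */
static const char *foo_field_name(ptrdiff_t off)
{
    if (off == (ptrdiff_t)offsetof(struct foo_args, wombats))
        return "wombats";
    if (off == (ptrdiff_t)offsetof(struct foo_args, burrows))
        return "burrows";
    return "(unknown)";
}
```

The result is machine-readable: an application can react to the specific field without parsing text, which is the property Alan argues for. As Andi notes in his reply, it does not say why the value was rejected.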
From: Andi Kleen on 17 Feb 2010 07:00

Hi Alan,

> That's probably overkill. For almost any ioctl type interface the only
> thing you *need* to make more sense is the address of the field that was
> deemed invalid.

Take a look at all the return -EINVALs in net/sched/sch_cbq.c
and then tell me if you really still believe just knowing the field
is enough to diagnose those.

A common issue for example is if it depends on the current state
somehow.

> actually help application code or particularly help interpreters to dig
> into the detail and act themselves to fix a problem or understand it. It
> also costs material amounts of unswappable memory and also disk storage
> for the kernel image on embedded devices.

Trading developer time for a few bytes saved is exactly the wrong
tradeoff, even on a small system.

In principle it could be CONFIGed of course, but I suspect
it wouldn't be worth it (especially compared to all the other bloat).

>
> Two other problems text returns bring up are ambiguity and translations -
> it's almost impossible to keep them unique even within a big module. It's

For translations the "pragmatic text database" works reasonably well
I think. Also you don't necessarily need them to be unique
(if the English string is not unique, why would the translation need to be?)

Sure text won't solve all problems either, but it's infinitely
better than EINVAL.

> also possible to get things like typos in the returned text or
> mis-spellings that you then can't fix because some other app now has
>
>     if (strcmp(returned_err, "No such wombat evalueted")==0) {
>         ...
>     }
>
> in it. (HTTP 'referer' being a dark warning from history ...)

You could get numbers wrong too. There's really no cure against that.

But yes it's a good point -- would need to make sure that the
spelling police would direct their efforts elsewhere as much as possible.

>
> A lot of other systems keep message catalogues often indexed by
> module:error. Text lookups in userspace (easy to do with existing
> interfaces), and the OS providing generic, specific, and identifying
> module info.

That's the IBM approach. I have some doubts it would really work
for a distributed environment like Linux. I believe it has been even
tried already (e.g. there's a Japanese project for such a catalog).
I don't think it works that well.

I think I would prefer just text strings. In principle one could
still develop a convention inside them though.

>
> I guess the Linux extension to that would end up as
>
>     extended_error(&foo->wombats, E_NOT_A_VALID_BREEDING_POPULATION);
>
> and internally expand to include THIS_MODULE and extract the module name.

Hmm, yes including the module might be reasonable.

> There's another related problem here too - Unix style errors lack the
> ability of some OS systems to report "It worked but ....." which leads to
> interface oddities like termios where it reports "Ok" but you have to
> inspect the returned structure to see if you got what you requested.

I suspect that's better solved in some way specific to that call.
I don't think it's all that common anyways.

-Andi

-- 
ak(a)linux.intel.com -- Speaking for myself only.
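The message-catalogue approach debated above, where the kernel hands back only a module name and a numeric code and userspace looks up the text, might look roughly like this. It is a sketch only: the table contents, the module names and the catalogue_lookup function are made up for illustration and do not correspond to any existing catalogue.

```c
#include <stddef.h>
#include <string.h>

/* One catalogue row, keyed by module name plus per-module error code. */
struct catalogue_entry {
    const char *module;
    int code;
    const char *text;
};

/* Hypothetical entries; a real catalogue could be loaded from a file. */
static const struct catalogue_entry catalogue[] = {
    { "wombat_drv",  1, "not a valid breeding population" },
    { "wombat_drv",  2, "no such wombat" },
    { "sch_example", 1, "rate exceeds parent class" },
};

/* Return the message for module:code, or NULL if it is not catalogued. */
static const char *catalogue_lookup(const char *module, int code)
{
    size_t i;

    for (i = 0; i < sizeof(catalogue) / sizeof(catalogue[0]); i++) {
        if (catalogue[i].code == code &&
            strcmp(catalogue[i].module, module) == 0)
            return catalogue[i].text;
    }
    return NULL;
}
```

Because the text lives in userspace, a misspelled message can be corrected or translated without changing the kernel-visible module:code pair. The cost is the one Andi points to: somebody has to keep the catalogue in step with every module.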
From: Dr. David Alan Gilbert on 17 Feb 2010 15:00

* Andi Kleen (andi(a)firstfloor.org) wrote:
> On Wed, Feb 17, 2010 at 01:16:48PM +0300, Nikita V. Youshchenko wrote:
> > > "Nikita V. Youshchenko" <yoush(a)cs.msu.su> writes:
> > > > I'm developing a device driver that, in its ioctl()s, accepts a
> > > > complex data structure. Before doing its operation, it performs large
> > > > number of checks if data is valid. If one of those checks fail, driver
> > > > returns -EINVAL.
> > > >
> > > > Unfortunately this -EINVAL is not really useful. E.g. if a developer,
> > > > sitting in his IDE and debugging his code, will see ioctl()
> > > > returning -EINVAL, and will have hard times finding what exactly is
> > > > wrong.
> > > >
> > > > Before inventing driver-specific extended error reporting, I'd like to
> > > > ask if there is anything more or less generic for this.
> > > > I believe situation when -Exxx is too weak interface for error
> > > > reporting is common.
> > >
> > > This is a very common problem in Linux unfortunately. I always
> > > describe that as the "ed approach to error handling". Instead
> > > of giving an error message you just give ?. Just ? happens
> > > to be EINVAL in Linux.
> > >
> > > My favourite example of this is the configuration of the networking
> > > queueing disciplines, which configure complicated data structures and
> > > algorithms and in many cases have tens of different error conditions
> > > based on the input parameters -- and they all just report EINVAL.
> > >
> > > The standard way (standard kludge or standard workaround would be a
> > > better description) is to use printk; often guarded by a special
> > > kernel tunable or ifdef to avoid flooding the log in the normal case.
> > >
> > > IMHO it would be best to simply add a way to return strings directly
> > > in this case (a la plan9). This would be probably not too hard to
> > > implement. It's not there unfortunately.
> > >
> > > This could be done with one of the message oriented protocols,
> > > e.g. netlink or read/write on a special minor.
> >
> > Why not create a generic solution for this, if one does not exist yet?
>
> Someone would need to do it. Yes I think it would be a worthy project.
>
> The trick is also get around the objections of the "but we always
> did it this way" Unix traditionalists.

I'd wondered about some form of halfway house where the error
value is expanded but could be truncated for compatibility - i.e.
if at the moment we had:

    return -EINVAL;

it would become:

    return ERRORNUM(EINVAL, BADLENGTH);

and that would expand to something like:

    return -(EINVAL + (BADLENGTH << ESHIFT));

Existing syscall handlers could mask the extended error bits out
on the way back, and a new entry could pass the whole error value
back where user space could separate out the other part of the error.

This still feels quite like stretching the traditional way; but
at the cost of it still having the same problems (e.g. having
to define a list of error values).

One hard problem is that often the thing that actually returns
the error has actually just got a failure from something that
called it which didn't return any diagnostics, so to do this properly
errors have to be passed around in a lot of places; you'll also
have to figure out just how far down you want to pass it - if
a read() fails due to a SCSI error there is a whole load of different
levels of information that you have to choose what to return.

<snip>

Dave (who has stared at mmap's that have returned EINVAL for way too long)

-- 
 -----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert    | Running GNU/Linux on Alpha,68K| Happy  \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/
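Dave's ERRORNUM idea can be made concrete with a small sketch. The values chosen for ESHIFT and BADLENGTH are assumptions for illustration, as are the helper names err_base, err_sub and err_truncate. The sketch also assumes every base errno fits below 1 << ESHIFT. Note that the sub-code shift needs its own parentheses, because in C `+` binds more tightly than `<<`.

```c
#include <errno.h>

#define ESHIFT     8                     /* hypothetical: low 8 bits hold the errno */
#define EBASE_MASK ((1 << ESHIFT) - 1)   /* mask selecting the classic errno */
#define BADLENGTH  3                     /* hypothetical sub-code */

/* Pack a classic errno and a sub-code into one negative return value. */
#define ERRORNUM(base, sub) (-((base) + ((sub) << ESHIFT)))

/* Classic errno part of a packed (negative) return value. */
static int err_base(int ret)
{
    return (-ret) & EBASE_MASK;
}

/* Extended sub-code part of a packed (negative) return value. */
static int err_sub(int ret)
{
    return (-ret) >> ESHIFT;
}

/* What an old-style syscall exit path would hand back: errno only. */
static int err_truncate(int ret)
{
    return ret < 0 ? -err_base(ret) : ret;
}
```

With these definitions, ERRORNUM(EINVAL, BADLENGTH) splits back into EINVAL via err_base() and BADLENGTH via err_sub(), while err_truncate() yields plain -EINVAL for callers that know nothing about the extension. A return value that carries no sub-code passes through err_truncate() unchanged.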
From: Andi Kleen on 17 Feb 2010 15:30

> I'd wondered about some form of halfway house where the error
> value is expanded but could be truncated for compatibility - i.e.

Who would do the truncation?

> if at the moment we had:
>
>     return -EINVAL;
>
> it would become:
>
>     return ERRORNUM(EINVAL, BADLENGTH);

x86 only has about 12 bits in the current ABI btw.

-Andi
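Andi's 12-bit remark limits how much extra information a packed return value can carry. A rough sketch of the arithmetic, assuming that return values from -4095 to -1 are the ones treated as errors (4095 is 2^12 - 1 and matches the kernel's MAX_ERRNO): the names ERR_WINDOW_MAX, is_error_return and packed_error_fits are hypothetical.

```c
#include <stdbool.h>

/* Assumption: only -4095..-1 is read as an error, per the 12-bit figure. */
#define ERR_WINDOW_MAX 4095

/* True if a raw syscall return value falls inside the error window. */
static bool is_error_return(long ret)
{
    return ret < 0 && ret >= -ERR_WINDOW_MAX;
}

/* True if base + (sub << shift) still fits inside the error window. */
static bool packed_error_fits(int base, int sub, int shift)
{
    long packed = (long)base + ((long)sub << shift);

    return packed > 0 && packed <= ERR_WINDOW_MAX;
}
```

With an 8-bit shift and a base code of 22, sub-codes up to 15 fit (22 + 15 * 256 = 3862), but 16 does not (22 + 16 * 256 = 4118). That leaves about four bits for the extension, which is why the packing scheme cannot carry much beyond a small sub-code without an ABI change.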