From: Valdis.Kletnieks on
On Fri, 13 Nov 2009 22:26:20 +0100, Julia Lawall said:
> On Fri, 13 Nov 2009, Valdis.Kletnieks(a)vt.edu wrote:
> > Julia, is there a way to use coccinelle to detect unsafe changes like that? Or
> > is expressing those semantics too difficult?
>
> Could you give a concrete example of something that would be a problem?
> If something like alias analysis is required, to know what strings a
> variable might be bound to, that might be difficult. Coccinelle works
> better when there is some concrete codeto match against.

Here's a concrete example of how a previously audited strcmp() can go bad...

struct foo {
char[16] a; /* old code allows 15 chars and 1 more for the \0 */
int b;
int c;
}

bzero(foo,sizeof(foo));

Now code can pretty safely mess with the first 15 bytes of foo->a and
we know we're OK if we call strcmp(foo->a,....) because that bzero()
nuked a[15] for us. It's safe to strncpy(foo->a,bar,15); and not worry
about the fact that if bar is 15 chars long, a trailing \0 won't be put in.

Now somebody comes along and does:

struct foo {
char *a; /* we need more than 15 chars for some oddball hardware */
int b;
int c;
}

bzero(foo,sizeof(foo));
foo->a = kmalloc(32); /* whoops should have been kzmalloc */

Now suddenly, strncpy(foo->a,bar,31); *isn't* safe....

(Yes, I know there's plenty of blame to go around in this example - the failure
to use kzmalloc, the use of strncpy() without an explicit \0 being assigned
someplace, the use of strcmp() rather than strncmp()... But our tendency to
intentionally omit several steps of this to produce more efficient code means
it's easier to shoot ourselves in the foot...)

From: David Wagner on
> The biggest problem with strcmp() is that even if it got audited when
> that code went in, it's prone to unaudited breakage when somebody changes
> something in some other piece of code, quite often in some other .c file
> in some other directory.

I don't understand what concern you are ferring to. Could you explain?
What is special about strcmp() that requires auditing? What kind of
breakage are you talking about?

Are you just referring to the fact that strcmp() assumes its strings
are '\0'-terminated? Do you have the same concern about every library
function that handles '\0'-terminated strings? Does your concern apply
to this particular code snippet, where the call is (or would be) of the
form strcmp(s, "string constant")? Does your concern apply equally to
strncmp(s, "string constant", sizeof("string constant"))?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: David Wagner on
>Here's a concrete example of how a previously audited strcmp() can go bad...
>
>struct foo {
> char[16] a; /* old code allows 15 chars and 1 more for the \0 */
> int b;
> int c;
>}
>
>bzero(foo,sizeof(foo));
>
>Now code can pretty safely mess with the first 15 bytes of foo->a and
>we know we're OK if we call strcmp(foo->a,....) because that bzero()
>nuked a[15] for us. It's safe to strncpy(foo->a,bar,15); and not worry
>about the fact that if bar is 15 chars long, a trailing \0 won't be put in.
>
>Now somebody comes along and does:
>
>struct foo {
> char *a; /* we need more than 15 chars for some oddball hardware */
> int b;
> int c;
>}
>
>bzero(foo,sizeof(foo));
>foo->a = kmalloc(32); /* whoops should have been kzmalloc */
>
>Now suddenly, strncpy(foo->a,bar,31); *isn't* safe....
>
>(Yes, I know there's plenty of blame to go around in this example - the failure
>to use kzmalloc, the use of strncpy() without an explicit \0 being assigned
>someplace, the use of strcmp() rather than strncmp()... But our tendency to
>intentionally omit several steps of this to produce more efficient code means
>it's easier to shoot ourselves in the foot...)

I'm not persuaded by your arguments against strcmp(): not even
a little bit.

1) The particular code snippets under discussion here were of the
form
strncmp(foo, "constant", sizeof("constant"))
And the proposal is to replace this code with
strcmp(foo, "constant")
Do you really mean to object to this proposal?

Please note that the '\0'-termination issues you mention do not
come up when comparing to a string constant. strcmp(, "constant")
is not going to run past the end of your 16-byte buffer. Moreover,
even if there was an issue with '\0'-termination, it would be present
to exactly the same degree with strncmp(), since the two code snippets
are behaviorally identical (in this case, it is not an issue for either
one, but if there was some context where it was an issue for strcmp(),
it would be an issue for the strncmp() equivalent, too). So none
of your arguments are a good reason to prefer
strncmp(foo, "constant", sizeof("constant"))
to
strcmp(foo, "constant")

2) More generally, you seem to be concerned about bugs where one
piece of code fails to '\0'-terminate a string, and another piece of
code invokes some library function that relies upon '\0'-termination.
That is not specifically a strcmp() issue; this is an issue with using
any string library that assumes '\0'-termination, where the string is not
'\0'-terminated. That's a much broader issue that should be addressed by
other means. Saying "strcmp() is bad" is a poor response to this risk.
For every string, you should decide whether it must be '\0'-terminated
or not, and then act appropriately.

If the string is intended to be '\0'-terminated, then you have
an obligation to ensure that all code consistently maintains this
invariant, and in return you may call string libraries that assume
'\0'-termination. Or if the string isn't intended to be '\0'-terminated,
then you should never be calling any string library function that
assumes '\0'-termination. This is pretty elementary stuff: if you do
much C programming, you're hopefully used to this. There's nothing
wrong with using strcmp(), if it is used appropriately. A kneejerk
"never use strcmp()" seems like poor policy to me.

In summary, I think your arguments against strcmp() are unpersuasive in
this context. Perhaps in some other context they are relevant, but not
to this thread. Focusing on strcmp() is misplaced.

If you want to do something useful, I think it would be more useful
to focus on checking that strings are properly '\0'-terminated where
this is necessary -- but realize that this is a much more challenging
property, and it's going to require something much more sophisticated
than just "don't use strcmp()". Most likely, this is not something that
Cocinelle will be sophisticated enough to handle usefully, at least in
the general case.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Joe Perches on
On Sat, 2009-11-14 at 03:44 +0000, David Wagner wrote:
> I personally don't find
> strncmp(foo, "constant", sizeof("constant")) // first snippet
> to be more readable, auditable, or obviously correct than
> strcmp(foo, "constant"). // second snippet
> Is there a technical basis for arguing that the first
> snippet is better than the second snippet?

I don't think there is.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Valdis.Kletnieks on
On Sat, 14 Nov 2009 00:41:15 GMT, David Wagner said:

> 1) The particular code snippets under discussion here were of the
> form
> strncmp(foo, "constant", sizeof("constant"))
> And the proposal is to replace this code with
> strcmp(foo, "constant")
> Do you really mean to object to this proposal?

No, in that particular case strcmp() is Obviously Safe.

> 2) More generally, you seem to be concerned about bugs where one
> piece of code fails to '\0'-terminate a string, and another piece of
> code invokes some library function that relies upon '\0'-termination.

Exactly.

> That is not specifically a strcmp() issue; this is an issue with using
> any string library that assumes '\0'-termination, where the string is not
> '\0'-terminated. That's a much broader issue that should be addressed by
> other means.

I know that - which is why I asked Julia if Cocinelle is able to do anything
in this area. An awful lot of our \0-terminated strings end up that way
implicitly because somebody does a kzalloc() or bzero() of an entire
structure, which can be fragile if code is refactored.

By the same token, that implicit behavior means that it's probably quite
difficult for any programmatic correctness checkers to follow the behavior.

> Saying "strcmp() is bad" is a poor response to this risk.

I didn't say "strcmp() is bad". I said it needs auditing. The strn-
versions of functions have a guaranteed termination condition right there
in the call. For strcmp() and strcpy(), the termination guarantee is
often elsewhere, which is why code using them tends to be brittle and
is harder to audit.