From: Thomas Pornin on
According to Paul Rubin <no.email(a)nospam.invalid>:
> Yes, I'm thinking of large files. The question came up because
> I just downloaded a 30GB Wikipedia dump and checking the md5 took
> a while.

Chances are that MD5 was only a small part of the delay. A "normal" PC
such as mine (Intel Q6600, 2.4 GHz) can hash data with MD5 at more than
400 MB/s, with a single core. Even assuming two disks in RAID1, disk
reading bandwidth is not over 200 MB/s, so half a core is sufficient;
that's onlt 1/8th of the machine CPU power. Also, in a well-ordered data
process path, hashing would have been performed as part of the download,
and even a gigabit link remains below 100 MB/s.

In other words, _if_ MD5 was a significant part of the delay, _then_
I am jealous, because it means that your machine has at least 30 GB
of RAM.


I am not saying that performance is unimportant; only that it takes a
very specific situation, on a PC, for performance of a hash function to
become significant. Unless you use a very slow hash function (e.g.
Whirlpool), most hash functions will be more than fast enough. The
"slow" SHA-256 still runs at more than 140 MB/s on my PC.

However, on smaller architectures, things change quite a bit. I have a
Linksys router, which is quite typical of cheap routers and WiFi access
points, in that it uses as main CPU a Mips-derivative, clocked at 200 MHz.
This is not a superscalar architecture, it cannot perform more than one
instruction per clock cycle. And there is one core only. On such a
system, MD5 runs at 10 MB/s, barely enough to follow the bandwidth of a
mere 100baseT ethernet link. SHA-1 (5.8 MB/s) and SHA-256 (2.7 MB/s) are
"slow" and would become bottlenecks in your big-Wikipedia-download
scenario.


As for your question: in a typical signature scenario, resistance to
collisions is unimportant. Collisions with signatures are a problem only
if the signer accepts data from the attacker, and most of the problems
are avoided if the signer injects some alea at the beginning (e.g. with
X.509 certificates, CA which use random "serial numbers" are not
attacked). If you just need (2nd-)preimage resistance then it seems
plausible that you can have something faster than a usual hash function.
As an approximation, use MD4, which is quite faster than MD5 (14 MB/s on
my router, 680 MB/s on my PC).


--Thomas Pornin
From: Kristian Gj�steen on
Greg Rose <ggr(a)nope.ucsd.edu> wrote:
>In article <hor1ns$a9b$1(a)orkan.itea.ntnu.no>,
>Kristian Gj�steen <kristiag+news(a)math.ntnu.no> wrote:
>>Paul Rubin <no.email(a)nospam.invalid> wrote:
>>>Is there a way to digitally sign a large file without using an SHA-like
>>>hash function?
>>
>>I would not be surprised if it is possible to create some sort of digital
>>signature scheme that does not essentially contain a collision resistant
>>hash function.
>
>It's rare for me to disagree with Kristian, but I
>think collision resistance is pretty much the most
>important requirement for a hash function in a
>digital signature scheme. If collisions are easy,
>producing alternative signed documents is also
>easy.

I don't disagree with you, though. :-) I believe I was (as usual)
clumsy in my writing.

For all of our current schemes (which are all of the form "hash the
message, then apply some number theory to the hash"), we usually need
collision resistance.

But consider an alternative RSA-FDH scheme, randomized RSA-FDH. Instead
of hashing the message, we hash the message and some randomness, then
apply the RSA signing key to the hash. The signature is the randomness
and the e'th root.

We can now use a hash function that is not collision resistant, it won't
be a problem.

More speculatively, one can imagine a signature scheme where there is
no (explicit) hash function, and therefore no collision resistant hash
function.

--
Kristian Gj�steen
From: unruh on
On 2010-03-30, Kristian Gj?steen <kristiag+news(a)math.ntnu.no> wrote:
> Greg Rose <ggr(a)nope.ucsd.edu> wrote:
>>In article <hor1ns$a9b$1(a)orkan.itea.ntnu.no>,
>>Kristian Gj?steen <kristiag+news(a)math.ntnu.no> wrote:
>>>Paul Rubin <no.email(a)nospam.invalid> wrote:
>>>>Is there a way to digitally sign a large file without using an SHA-like
>>>>hash function?
>>>
>>>I would not be surprised if it is possible to create some sort of digital
>>>signature scheme that does not essentially contain a collision resistant
>>>hash function.
>>
>>It's rare for me to disagree with Kristian, but I
>>think collision resistance is pretty much the most
>>important requirement for a hash function in a
>>digital signature scheme. If collisions are easy,
>>producing alternative signed documents is also
>>easy.
>
> I don't disagree with you, though. :-) I believe I was (as usual)
> clumsy in my writing.
>
> For all of our current schemes (which are all of the form "hash the
> message, then apply some number theory to the hash"), we usually need
> collision resistance.
>
> But consider an alternative RSA-FDH scheme, randomized RSA-FDH. Instead
> of hashing the message, we hash the message and some randomness, then
> apply the RSA signing key to the hash. The signature is the randomness
> and the e'th root.
>
> We can now use a hash function that is not collision resistant, it won't
> be a problem.

One can of course encrypt the entire message with your private key. Now
only you could have done that, and thus the message as a whole is
signed. The problem is that the opposition could remove part of the
message. To get around that one could always do overlapping encryptions.
Ie, if the message is N words long, and if the RSA chunk is 1024 (64
words) long, you encrypt W_(N-31) W_(N-30)...W_N W_0 W_1..W_31 then
W_0...W_63 then W_32...W_127 etc. That way if a chunck was missing,
the ends in the included stuff would indicate the missing stuff. (If the
message had huge chuncks of the same words of course this would not
work). Anyway, this would be an authentication scheme which would not
need a hash.

I am not at all sure how your randomess would help, since you could
always claim that you had a different randomness, and disown the
signature, and noone could prove you wrong. Ie, part of a signature is
to bind the signer to the document, must just to prove the document is
unaltered.

>
> More speculatively, one can imagine a signature scheme where there is
> no (explicit) hash function, and therefore no collision resistant hash
> function.
>
From: Greg Rose on
In article <hotqpq$577$1(a)orkan.itea.ntnu.no>,
Kristian Gj�steen <kristiag+news(a)math.ntnu.no> wrote:
>Greg Rose <ggr(a)nope.ucsd.edu> wrote:
>>In article <hor1ns$a9b$1(a)orkan.itea.ntnu.no>,
>>Kristian Gj�steen <kristiag+news(a)math.ntnu.no> wrote:
>>>Paul Rubin <no.email(a)nospam.invalid> wrote:
>>>>Is there a way to digitally sign a large file without using an SHA-like
>>>>hash function?
>>>
>>>I would not be surprised if it is possible to create some sort of digital
>>>signature scheme that does not essentially contain a collision resistant
>>>hash function.
>>
>>It's rare for me to disagree with Kristian, but I
>>think collision resistance is pretty much the most
>>important requirement for a hash function in a
>>digital signature scheme. If collisions are easy,
>>producing alternative signed documents is also
>>easy.
>
>I don't disagree with you, though. :-) I believe I was (as usual)
>clumsy in my writing.
>
>For all of our current schemes (which are all of the form "hash the
>message, then apply some number theory to the hash"), we usually need
>collision resistance.
>
>But consider an alternative RSA-FDH scheme, randomized RSA-FDH. Instead
>of hashing the message, we hash the message and some randomness, then
>apply the RSA signing key to the hash. The signature is the randomness
>and the e'th root.
>
>We can now use a hash function that is not collision resistant, it won't
>be a problem.

Ah, yes, the randomness (assuming it is not
somehow under the control of the attacker) moves
the requirement from collision resistance to
2nd-preimage restistance. Now I understand what I
think you meant. :-)

>More speculatively, one can imagine a signature scheme where there is
>no (explicit) hash function, and therefore no collision resistant hash
>function.

But coming full circle, back to efficiency
arguments, the public key operation must always
apply to some smallish amount of data, and if you
have more data than that, you either need lots of
PK operations (ugh!) or something that looks like
a hash function, smells like a hash function, and
quacks like a hash function, no?

So, yeah, I can imagine such a signature scheme,
but I can also imagine not wanting to use it to
verify a boot image on a mobile device...

Greg.
--
From: Mehdi Tibouchi on
Greg Rose wrote in message <houfru$2r9$1(a)ihnp4.ucsd.edu>:
>
> Ah, yes, the randomness (assuming it is not
> somehow under the control of the attacker) moves
> the requirement from collision resistance to
> 2nd-preimage restistance. Now I understand what I
> think you meant. :-)

Not if you want the signature scheme to be secure against existential
forgeries: if you find a collision in the hash function of a PFDH scheme,
there's an obvious existential forgery with one signature query.