what is probability to create two equal hashes for md5 algorithm [Cryptography]

Prev: cracking DES ciphertext with known plaintext using custom character set.
Next: Large prime for easily finding discreet logs

From: Carlos Moreno on 6 Dec 2006 21:34

Mike Amling wrote:

>> MD5 and SHA-1 are functions --- it makes sense to talk about the
>> probability of collision for two randomly generated input messages.
>> If these two input messages are independent instances of a uniformly
>> distributed random variable, then the probabilities of collision are
>> 2^64 and 2^80 in the ideal case
>
> Those are pretty high probabilities.

D'oh !!! Try they're *illegal* probability values (fundamental
probability axioms establish that 0 <= P{E} <= 1 for every possible
random event E).

> I presume you intended to type
> 2^-64 and 2^-80, but the in the scenario you describe, the probability
> of a pair of random inputs producing the same output, the probabilities
> are 2^-128 and 2^-160.

Errr... D'oh! again!! :-(

You're absolutely right --- I got the numbers for two related, yet
completely different problems, mixed up; how many randomly generated
input strings to have a probability of 0.5 that there will be a
collision in those, vs. the probability of a collision in a single
pair.

> You can expect approximately one collision among 2^64 (resp. 2^80)
> different inputs, because that many messages generates 2^128 (resp.
> 2^160) pairs.

Now my turn to nitpick!! :-)

They generate 2^127 (resp. 2^159) pairs (actually, approximately;
the exact number should be 2^N * (2^N + 1) / 2 (because we're
talking unordered pairs --- the pair (M2, M1) does not count if we
already counted the pair (M1, M2), given that the order does not
make any difference to whether or not there's a collision.

But that's ok, because we want the number of inputs to have a *0.5*
probability of collision!

Carlos
--

From: Joseph Ashwood on 7 Dec 2006 00:14

"Sergei" <silentser(a)gmail.com> wrote in message
news:1165441615.135737.121220(a)f1g2000cwa.googlegroups.com...

> But if SHA-1 is a uniformly distributed function, how the Chinese guys
> could claim that it is possible to find a collision in 2^63 hash
> operations?

I'll show how by using something we can prove if uniformly distributed, but
is cryptographically weak enough for the process to be seen; modular
division by 2^64, let's call it F(x). It is trivial to prove that the output
of F is uniformly distributed across the integers [0,2^64-1]. However, we
can also trivially compute x's that collide by simply adding 2^64 to the
previous x. F is perfectly uniform across all integers x, but is trivially
weak.

Uniform distribution among the output set is not enough for cryptographic
strength.
Joe

From: Sergei on 7 Dec 2006 04:55

Joseph Ashwood wrote:
> "Sergei" <silentser(a)gmail.com> wrote in message
> news:1165441615.135737.121220(a)f1g2000cwa.googlegroups.com...
>
> > But if SHA-1 is a uniformly distributed function, how the Chinese guys
> > could claim that it is possible to find a collision in 2^63 hash
> > operations?
>
> I'll show how by using something we can prove if uniformly distributed, but
> is cryptographically weak enough for the process to be seen; modular
> division by 2^64, let's call it F(x). It is trivial to prove that the output
> of F is uniformly distributed across the integers [0,2^64-1]. However, we
> can also trivially compute x's that collide by simply adding 2^64 to the
> previous x. F is perfectly uniform across all integers x, but is trivially
> weak.
>
> Uniform distribution among the output set is not enough for cryptographic
> strength.
> Joe

Sorry for another dilettantish question, but what is "a uniformly
distributed function"? I thought this is simply another name for a
random oracle function. If this is a function that simpy gives
uniformly distributed outputs when given a uniformly distributed inputs
then, for example, F(x)=x is also uniformly distributed. But what does
it have to do with the collision probabilities in the case described
here?

Sergei

From: Volker Hetzer on 7 Dec 2006 08:57

Sergei schrieb:
> But if SHA-1 is a uniformly distributed function, how the Chinese guys
> could claim that it is possible to find a collision in 2^63 hash
> operations?
Because it's pseudorandom only. This has nothing to do with the collision
probability of two randomly chosen sets of data.

Lots of Greetings!
Volker
--
For email replies, please substitute the obvious.

From: Sergei on 7 Dec 2006 09:43

Volker Hetzer wrote:
> Sergei schrieb:
> > But if SHA-1 is a uniformly distributed function, how the Chinese guys
> > could claim that it is possible to find a collision in 2^63 hash
> > operations?
> Because it's pseudorandom only. This has nothing to do with the collision
> probability of two randomly chosen sets of data.
>
> Lots of Greetings!
> Volker
> --
> For email replies, please substitute the obvious.

Sure. But what random data has to do with "few doesns of __terabytes__
of data", which should be partishioned in blocks and hashed? As I
understood, all the estimations here were made by assuming that the
data was random. But why do you think this "few doesns of __terabytes__
of data" contain a random data?

Sergei

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7
Prev: cracking DES ciphertext with known plaintext using custom character set.
Next: Large prime for easily finding discreet logs