Salsa20 hashing [Cryptography]

Prev: How long to trust 3DES
Next: M.peg biss key finder

From: xmath on 28 Sep 2006 03:43

Scott Contini wrote:
> Just a few comments: According to my colleague who knows all this
> foundations of cryptographry stuff, there is no known construction
> which converts a one-way function into a collision resistant hash function.

I know the underlying function will definitely more than one-wayness.
But Salsa20 with appropriately chosen diagonal constants isn't just
one-way but also collision-resistant (indeed, likely to be
collision-free as it maps 48-byte strings to 64-byte strings). It also
doesn't have any fixed point or other deviations from randomness I'm
aware of. I also probably need that for given w it must be hard to
find m and m' such that H(m) - H(m') = w, which is another thing I
think I'd trust Salsa20 for.

So, my current idea is basically splitting the message m into k =
ceil(len/48) blocks of 48 bytes each (last block zero-padded, len in
bytes < 2^52), and saying:

Hash(m) = IV !+ H(48, m_1) !+ H(96, m_2) !+ ... !+ H(len, m_k)

Here H(len, m_i) is Salsa20 on an input which puts the two-word len
along with two fixed fixed words on the diagonal and m_i into the
remaining 12 words, and !+ is an operator which I construct to be
non-commutative and non-associative: a !+ b = F(a) + b for some
function F. This to avoid anything like the attack in David Wagner's
paper "A Generalized Birthday Problem".

As long as F is bijective and doesn't harmfully interact with H this is
obviously no worse than simply summing, which means some analysis done
on such schemes can be reused. This leaves the question, does anyone
know of collision attacks on the scheme above other than the one in
that paper?

Another question is how to choose F such that:
1. it's as simple as possible
2. it's bijective (to avoid losing information in the left operand)
3. it prevents (2k)-sum collision attacks
4. it doesn't harmfully interact with H like the choice F=H does

While Salsa20/8 without final round addition is simple and probably
does a great job at avoiding any "nice" properties in !+ there is still
some worry on point 4 which makes me feel I should probably construct F
in a way that's totally unrelated to Salsa20.

Any suggestions?

(I realize that the result would still be "heuristic", but ultimately
all current collision-resistant functions are)

- xmath

From: xmath on 28 Sep 2006 06:38

(since my message hasn't shown up hours after posting it, I'm assuming
it got lost somehow so I'm posting one with roughly the same content...
sorry if it ends up being a dup)

Scott Contini wrote:
> According to my colleague who knows all this foundations of cryptographry
> stuff, there is no known construction which converts a one-way function
> into a collision resistant hash function.

Salsa20 isn't just one-way but also collision-resistant. When using it
as 48-to-64-byte hash with appropriately chosen diagonal constants I
also know of no fixed point or other deviation from a random function.

Here is the most up-to-date version of my idea:

Let m be an l-byte message (l < 2^52) and n = ceil(l/48), split it into
n 48-byte blocks, m_1 .. m_n, with the last one zero-padded if
necessary. Let l_i = i * 48 for i < n and l_n = l.

Define H(l_i, m_i) as Salsa20 applied to an input whose diagonal
consists of two appropriately chosen constant words and two words
encoding l_i, and whose remaining 12 words are filled by m_i.

Define the left-assoc operator a !+ b as F(a) + b for some
appropriately chosen bijective function F (discussed later), where +
adds all 16 words elementwise (mod 2^32).

Now: Hash(m) = IV !+ H(l_1, m_1) !+ H(l_2, m_2) !+ .. !+ H(l_n, m_n)

The first thing to notice is that if H were a random oracle, no choice
of F would obviously result in a *worse* hash than picking the identity
function for F. This means that the only "harmful" choices for F are
those which badly interact with the specific choice of H. Choosing F
to be Salsa20 (ignoring that it's not bijective) is such an example.

Now using the identity function for F makes this similar to AdHash, and
opens it up to the attack in David Wagner's "A Generalized Birthday
Problem". The attack in that paper doesn't work if !+ is sufficiently
inconvenient, so that's the open issue... What's a good choice of F
such that:

1. F is bijective (to not discard information from !+'s left argument)
2. F is simple and convenient to implement
3. generalized birthday attacks are thwarted
4. F doesn't harmfully interact with H

I'm pretty sure that my most recent pick of using Salsa20/8 without
final addition as F satisfies the first three criteria, however there's
still some worry left that it might interact with H. Perhaps I should
pick F to be something quite different from Salsa20, though preferably
still working on the matrix in a similar way, to avoid messing up the
pipeline in vectorized implementations.

Any suggestions on that?

- xmath

From: D. J. Bernstein on 28 Sep 2006 16:46

xmath <xmath.news(a)gmail.com> wrote:
> But Salsa20 with appropriately chosen diagonal constants isn't just
> one-way but also collision-resistant (indeed, likely to be
> collision-free as it maps 48-byte strings to 64-byte strings).

There are 2^384 outputs, so there are about 2^767 pairs of outputs, out
of which one would expect about 2^767/2^512 = 2^255 colliding pairs.
Finding those colliding pairs does seem difficult, thanks to the
diagonal constants; but the function doesn't compress, so there's no
apparent application of this difficulty.

To summarize: Collision-freeness and collision-resistance are useless
concepts for expansion functions such as Salsa20. Collision-freeness is
too strong to be satisfied, and collision-resistance is too weak to have
any helpful consequences.

> Hash(m) = IV !+ H(48, m_1) !+ H(96, m_2) !+ ... !+ H(len, m_k)

You should apply a final Salsa20 to the 512-bit output to eliminate
extension attacks, to allow safe truncation, etc.

> Here H(len, m_i) is Salsa20 on an input which puts the two-word len
> along with two fixed fixed words on the diagonal and m_i into the
> remaining 12 words, and !+ is an operator which I construct to be
> non-commutative and non-associative: a !+ b = F(a) + b for some
> function F. This to avoid anything like the attack in David Wagner's
> paper "A Generalized Birthday Problem".

The second-order attack applies no matter what F is. Wagner

* writes down 2^172 choices of F(a);
* sorts by the bottom 172 bits, finding about 2^171 pairs a,a' with
F(a) - F(a') mod 2^172 = 0;
* sorts the list of F(a) - F(a') for those pairs;
* writes down 2^172 choices of b;
* sorts by the bottom 172 bits, finding about 2^171 pairs b,b' with
b' - b mod 2^172 = 0;
* sorts the list of b' - b for those pairs; and
* merges the lists of differences, finding a few collisions
F(a) - F(a') = b' - b, i.e., F(a) + b = F(a') + b'.

The number of bit operations here is on the scale of 2^172, certainly
far smaller than 2^256.

On the other hand, the communication costs here are huge, making the
attack no more cost-effective than a standard parallel collision search.
Furthermore, on an absolute scale, 2^172 is a ludicrous number of bit
operations for the attacker, certainly not something that concerns us.

---D. J. Bernstein, Professor, Mathematics, Statistics,
and Computer Science, University of Illinois at Chicago

From: xmath on 29 Sep 2006 03:54

D. J. Bernstein wrote:
> There are 2^384 outputs, so there are about 2^767 pairs of outputs, out
> of which one would expect about 2^767/2^512 = 2^255 colliding pairs.

Yea, not long after I posted it I realized collision-freeness was
actually extremely unlikely, hence I omitted that remark from the
second version of my post. Dunno what I was thinking initially :-)

> collision-resistance is too weak to have any helpful consequences.

Well, even though it doesn't compress, collisions probably do exist as
you pointed out, and they must be hard to find or you'd have a
collision in the overall hash. So I wouldn't call it entirely
unhelpful, but it's also not enough.

> You should apply a final Salsa20 to the 512-bit output to eliminate
> extension attacks, to allow safe truncation, etc.

Good call! I kind of forgot about it, originally having F(.. + H(..))
where truncation is safe instead of F(..) + H(..) where it isn't.
Dunno why I ever switched though, so I'll switch back. (The chain ends
up being the same but with a final application of F() instead of a
useless initial one on the IV)

As for extension, I don't really care about it. With F being
invertible a final application of it isn't enough obviously, though if
you also truncate to 256 bits or so, extension becomes very hard.

> The number of bit operations here is on the scale of 2^172, certainly
> far smaller than 2^256.
>
> On the other hand, the communication costs here are huge, making the
> attack no more cost-effective than a standard parallel collision search.
> Furthermore, on an absolute scale, 2^172 is a ludicrous number of bit
> operations for the attacker, certainly not something that concerns us.

Right :-)

I'll end up truncating to 256 bits anyway, so attacks slower than 2^128
don't worry me.

Thanks for all the feedback so far.

- xmath

First | Prev |
Pages: 1 2 3
Prev: How long to trust 3DES
Next: M.peg biss key finder