From: Isliguezze on 19 Nov 2009 13:44 A classic string hashing algorithm can be simply described by the following code (C++ like pseudo code) int64 hash(string s) { int64 hash = 5381; for (i = 0; i < s.length; i++) hash = (hash * 33) XOR s[i]; return hash; } Is there any known approach to "unhash" a string (consisting no more than of 20 elements) if its hash code ran out of bounds of 2<sup>64</ sup> and was "cut". Thanks in advance.
From: Tim Little on 19 Nov 2009 18:34 On 2009-11-19, Isliguezze <isliguezze(a)gmail.com> wrote: > Is there any known approach to "unhash" a string (consisting no more > than of 20 elements) if its hash code ran out of bounds of 2<sup>64</ > sup> and was "cut". There are only 2^64 possible hash values, but about 2^160 possible original strings. The hash value can only tell you which set of about 2^96 strings the original string came from. That said, you may have some information about the original string that would allow you to distinguish it from the other 2^96 or so theoretical possibilities. For example, it may be much more likely to be natural-language text, or usually shorter than 20 characters. This hash algorithm is cryptographically very weak. For example, multiplication by 33 does nothing at all to the bits 0-4 of the hash, and so these bits are all just the XOR of the corresponding bits in the input (and the initial hash value). The upper bits depend only on the earlier characters. There are also plenty of more subtle weaknesses. If the string were known to be a fragment of natural language text, it might be totally reconstructed. Otherwise it would probably not be possible, but you could quite rapidly construct strings with the same hash. - Tim
|
Pages: 1 Prev: Coins on chess-board puzzle Next: Math/CompSci Interview Question - Thoughts? |