From: Simon Johnson on
On Jan 20, 8:42 pm, Peter Fairbrother <zenadsl6...(a)zen.co.uk> wrote:
> In general there is the idea that random permutations, or sequences, are
> not compressible.
>

What is the shortest program that can produce sequence n? This seems
like a straight-forward enough question: if the program is shorter
than simply listing the elements of the sequence, then we've achieved
some compression.

However, even after we've defined what language it is and what
environment that language runs in, it is impossible to prove that the
program is in fact the shortest even though, logically, a shortest
program must exist. [1]

Many permutations will have a program that generates them that is
smaller than just listing the sequence. The problem is finding them is
extremely expensive.

A related question: are there sequences of integers for which *no*
program exists that can generate them? Again, the answer is yes. As
you might expect, there is a deep connection between this sequence and
the halting problem: it's called the busy beaver sequence.[2]

The interesting thing about this specific sequence is that if we can
determine the busy beaver value for a large enough machine, then we
could in fact prove many theorems by brute-force alone. [3]

For example, you could prove the Collatz Conjecture [4] by simply
running a program that counts up through each integer, proving that it
does tend to 1, and stop it when it does more steps than the busy
beaver did for that machine. If it does, it's guaranteed to run
forever which proves the theorem.

Cheers,

Simon

[1] - http://www.flownet.com/gat/chaitin.html
[2] - http://en.wikipedia.org/wiki/Busy_beaver
[3] - Of course, running the brute-force attempt would take more
energy than exists in the universe, since the number of steps would be
huge.
[4] - http://en.wikipedia.org/wiki/Collatz_conjecture



From: unruh on
On 2010-01-21, Peter Fairbrother <zenadsl6186(a)zen.co.uk> wrote:
> David Eather wrote:
>> unruh wrote:
>>> On 2010-01-21, David Eather <eather(a)tpg.com.au> wrote:
>>>> Peter Fairbrother wrote:
>>>>> In general there is the idea that random permutations, or sequences,
>>>>> are not compressible.
>>>> Random permutations are always compressible.
>>>
>>> Since there are fewer numbers less than the size, and compression means
>>> a one to one mapping, random permutations are not compressible in
>>> general. There are not enough compressed outputs.
>>
>> I am assuming a permutation with a defined range eg elements in the
>> range of 0 - 255 etc. They are compressible unless I have missed
>> something about the context in the thread (not impossible)
>
>
> You are correct, and it's my bad for not being more precise.
>
> I wasn't talking about the compressibility of a permutation to a string
> of numbers (as eg the last entry is uniquely fixed, the second-to-last
> entry is a choice of two etc), what I was talking about was the
> compressibility of a string of random numbers.

If you have a string of L bits, you have 2^L random numbers. to compress
that you have to map those 2^L numbers into strings of length less than
L. But there are far fewer strings of length less than L than there are
of strings of length L. Thus you cannot in general compress them. You
can compress a few of them ( eg you can compress 2^N of those strings to
compressed strings of length N but if N<L then 2^N<<2^L Ie most strings
cannot be compressed. If you choose a random string, then it will most
probably be one that cannot be compressed, since they are more numerous.
Note that if you use various compressors, some of the "compressed
string" also needs to be some identifier of which compressor you used.
That identifier has to be a part of the compressed string or else you
cannot uncompress it, and it is a hash not a compression. The length
that that identifier adds precisely ensures that the compressed string
is on average as long as the uncompressed string.


>
> But I was also talking about permutations, and confused the issue. Sorry.
>
> -- Peter Fairbrother
From: unruh on
On 2010-01-21, Simon Johnson <simon.johnson(a)gmail.com> wrote:
> On Jan 20, 8:42?pm, Peter Fairbrother <zenadsl6...(a)zen.co.uk> wrote:
>> In general there is the idea that random permutations, or sequences, are
>> not compressible.
>>
>
> What is the shortest program that can produce sequence n? This seems
> like a straight-forward enough question: if the program is shorter
> than simply listing the elements of the sequence, then we've achieved
> some compression.
>
> However, even after we've defined what language it is and what
> environment that language runs in, it is impossible to prove that the
> program is in fact the shortest even though, logically, a shortest
> program must exist. [1]

It is "easy" to find. Just run each of the finite number of shorter
programs and see if it produces the string. While a bit tedious, this is
defines the shortest program.
>
> Many permutations will have a program that generates them that is
> smaller than just listing the sequence. The problem is finding them is

But most will not. The counting argument shows that most have a longer
program.


> extremely expensive.
>
>
>
>
From: Greg Rose on
In article <TaY5n.7308$ZN1.5050(a)newsfe05.iad>,
Joseph Ashwood <ashwood(a)msn.com> wrote:
>"Peter Fairbrother" <zenadsl6186(a)zen.co.uk> wrote in message
>news:4b576ab4$0$2532$da0feed9(a)news.zen.co.uk...
>> In general there is the idea that random permutations, or sequences, are
>> not compressible.
>
>Actually permutations are compressible. Labeling each permutation with an
>integer, significantly abbreviated the permutation. As an example take AES,
>the key can be trivially converted to an integer, and serves as such an
>index. Using that index as the key to AES in CTR mode will restore the
>permutation. For a complete permutation set of all 128-bit blocks would
>require (2^128)! as an upper limit, not a small number.

Indeed, it is not a small number, but it is significantly less than
(2^128)^(2^128), which is how many 128->128 bit mappings there are.
So yes, permutations are compressible.

Greg.
--
Greg Rose
232B EC8F 44C6 C853 D68F E107 E6BF CD2F 1081 A37C
From: Simon Johnson on
> It is "easy" to find. Just run each of the finite number of shorter
> programs and see if it produces the string. While a bit tedious, this is
> defines the shortest program.
>

You didn't read the proof attached in the footnote did you?

The point is, such a "stepper" programmer would generate the string
would it not, since it constructs and runs each program in passing? So
the string it identified is not the shortist program because the
stepper executes it in its operation.

Then you're left with the problem of what is the shortest stepper
program? Come to think of it, wouldn't the stepper program generate
*itself* while it tried to find the string. If the stepper program is
shorter than the program that generates the sequence, this definitely
must be so.

Look up Chaitin Elegance in Google for more information.

Cheers,

Simon