From: Heelium on 27 Jun 2010 22:55

Topological identicalness of computer programs

The following is the theory behind my 1D cellular automata solver.

We can order the possible states of any program into a finite or (on a Turing machine) infinite set. Call the set of all program states S. With combinatorics we can build a superset U whose members are the possible groupings of S: each member X of U is a set of non-empty subsets of S such that no two subsets in X overlap.

For each state in S there is a state which follows it when the program runs - call this connection Produces(X, Y), where X and Y are members of S and X produces Y when the program is running. Let G be any member of U; then G contains sets of members of S, each with one or more elements. When every element of G_x produces some element of G_y, we write Produces(G_x, G_y) for the groups as well. U also contains the member consisting of the single group of all states; it follows the rule Produces(G_x, G_x) - it produces itself. Call G an abstraction of a state machine; it is one aspect of it.

An example, where A=>B is shorthand for Produces(A, B). There is a state machine such that:

States: A, B, C
Rules: A=>C, B=>C, C=>A

and another state machine such that:

States: D, E
Rules: D=>E, E=>D

If we form the following abstraction of the first machine:

States: (A, B), C
Rules: (A, B)=>C, C=>(A, B)

C does not produce B, but it produces the group (A, B). So there is an abstraction of the first state machine which is topologically identical to the second state machine (a code sketch of this test follows below). We now have formal rules for abstracting state machines. With abstractions come communication channels - we can abstract one state machine into two state machines which communicate with each other.

For infinite state machines, we must note that all states are countable - a state's index is the order of its first appearance - and that all interesting sets of states are countable as well, for example by enumerating possible iterator programs. State sets with no order expressible on a Turing machine are not interesting, because they could make no sense on any machine we could build algorithmically. For an infinite state machine we consider members G of U with a finite number of state groups, at least one of which must contain infinitely many states if the machine grows indefinitely. By abstracting infinite state machines we obtain finite state machines, which we can analyze fully.

State series

The topological identicalness discussed so far could be called "strict identicalness". We can also form states which combine several states following each other: if A=>B and B=>C, we can form the combined state ABC. If we have the state machine

Rules: A=>B, B=>C, C=>D, D=>A

then it also implies the state machine

Rules: ABC=>D, D=>A

Allowing such combined states (I use this term only for states which produce each other in series) in our state groups gives a new layer of abstractions. Elements of such sets must carry their length as a second component.
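To make the abstraction test concrete, here is a minimal Python sketch (the names are mine, not part of the theory): a machine is a mapping from each state to the state it Produces, abstract() builds the quotient machine for a given grouping, and topologically_identical() brute-forces a renaming between two machines. It reproduces the A/B/C versus D/E example above.

    from itertools import permutations

    # A machine maps each state to the state it Produces.

    def abstract(machine, partition):
        """Quotient machine induced by `partition` (a list of frozensets),
        or None if some group's members do not all produce into a single
        group - i.e. the grouping is not a valid abstraction."""
        group_of = {s: g for g in partition for s in g}
        rules = {}
        for g in partition:
            targets = {group_of[machine[s]] for s in g}
            if len(targets) != 1:
                return None
            rules[g] = targets.pop()
        return rules

    def topologically_identical(m1, m2):
        """True if some renaming of states turns m1 into m2 (brute force)."""
        if len(m1) != len(m2):
            return False
        s1 = list(m1)
        for perm in permutations(m2):
            ren = dict(zip(s1, perm))
            if all(ren[m1[s]] == m2[ren[s]] for s in s1):
                return True
        return False

    first  = {'A': 'C', 'B': 'C', 'C': 'A'}
    second = {'D': 'E', 'E': 'D'}
    quotient = abstract(first, [frozenset('AB'), frozenset('C')])
    print(topologically_identical(quotient, second))   # True

The brute-force renaming is exponential; it only serves to pin down what "topologically identical" means here.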
For simplicity, we do not allow the same state group (an element of some G) to contain elements of different lengths, though strict rules could be formed for that case as well. This gives another kind of topological identicalness and a new kind of abstraction - in the following text, context should make clear which kind of abstraction is meant, as I am not going to coin separate terms for each; it is easy to show how to handle each case.

State machine abstractions combined

When we abstract a state machine, we get several state machines which might interact with each other in several ways, and we can work out rules about how they interact. The most interesting interaction rules are restrictions on interactions, which can only be found by looking at several machines in combination. For example, one abstract machine can be proven not to intervene in the memory requests of another, or not to send any signal before it has received a specific signal. These interactions involve time: a machine might, for example, accept signals only on every second tick and ignore signals at other times, or act differently on a signal depending on its timing.

To analyze any abstract machine, we must build its set of interaction rules. For interesting abstractions we usually find a state group such that as long as the abstract machine does not leave it, it will not be interrupted. If a machine receives signals so systematically that they might as well be included in its own state groups, we have probably chosen a wrong abstraction. It is normal that during a program run abstract machines are created and destroyed; we can find rules covering specific machines which are often created, to pin down the logic of the whole system.

Indefinite growth of computer programs as topologically equivalent subsequent state machines

Any computer program which cannot be proven to hang or not hang by enumerating a finite number of states and the relations between them will stop growing at some point in time; for such programs, full FSM analysis theoretically covers everything knowable (unless they interact with the outside world). We are interested in detecting hanging in the more general case - and to detect, in the context of this text, means to prove. If we have a logic for proving, we can in principle try all combinations of that logic, and when we reach the correct one, we have proven what we needed to prove.

If a program grows indefinitely, it must satisfy the following condition: there is an abstract state machine which produces an identical abstract state machine, in a context identical in certain aspects. Those aspects are, specifically:

1. The context of the machine must not intervene in its normal functioning. That is, we must prove that the context will not alter those activities of the machine important to the proof of growth; if we have chosen a correct abstraction, we can show that it does not intervene at all - the result of any intervention is already counted as part of the same state group of the machine.

2. It must be proven that the relevant aspects of the child machine's context are a direct result of the same logic by which the machine produces a topologically identical child of itself.

With those two things met, we know the program will grow indefinitely; a toy sketch of this growth test follows.
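A toy sketch of the growth criterion, with hypothetical step() and summarize() functions standing in for a real machine and a chosen abstraction: if the abstract state recurs while the concrete state has strictly grown in between, the same logic that produced this child keeps producing identical children - provided the abstraction really satisfies conditions 1 and 2 above, which the code does not check.

    def step(tape):
        return tape + [0]                  # toy machine: appends a cell per tick

    def summarize(tape):
        return tape[-1] if tape else None  # toy abstraction: ignores length

    def detect_growth(tape, steps=1000):
        seen = {}                          # abstract state -> size at first visit
        for _ in range(steps):
            key, size = summarize(tape), len(tape)
            if key in seen and size > seen[key]:
                return True                # identical abstract child, larger context
            seen.setdefault(key, size)
            tape = step(tape)
        return False

    print(detect_growth([0]))              # True: the toy machine grows forever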
Note that this criterion also covers the case where the parent is merely waiting for the result of a child: the two conditions alone are not enough to show that the child's abstraction does not overlap with the parent's, so we are looking only at control flow and could detect non-growing cycles with the same logic.

Interaction with neighbors and growth in memory

I will call two state machines neighbors if there is an interaction between them. As we look at a specific state machine only as long as it lives (as long as we can maintain the abstraction), destroying a state machine is part of this interaction, but its creation is not. Our goal is to reduce the number of state machines so that we can fold their total logic into their parents; by trying all the different classes of reductions, we will find the stone-hard logic which rules our world.

To prove that a child abstraction will not share space with the parent abstractions it is part of, we need to prove, from top to bottom, that on each level the states mentioned do not cover the same underlying reality. This is achieved by indexing all state groups and showing that the state groups of the child have nothing to do with the state groups of the parent. In practical cases - when every part of a program can access any part of memory - this is hard. We must make all memory accesses relative, so that we can express them in relation to specific program parts; calculations with absolute memory addresses might throw us into huge complexity.

Children and their children might all interact with some abstract machine common to all of them. The logic of such interaction can be reduced to simple implications: the interaction between that shared neighbor and all the machines is reduced to basic implication rules - if the neighbor gets this signal from any abstract machine in our area of interest, it will send that signal to all of them. Then, by showing that none of them will send the first signal, we know that none of them will receive the second. In this way we can, in some cases, reduce the common neighbor out, or else include it in our logic. We can also treat the state sets of all children as one state machine, reducing work which would otherwise grow without bound as we analyze their potential states.

We can show that several state machines together form one state machine which is identical to them in how it interacts with its neighbors - a bigger state machine can form two smaller state machines identical to it in their logic (even if they look quite different in memory). These can form an array of state machines which each create two nodes of themselves; if they do not spread some integer counter among themselves, they are identical, and if they do, we abstract the counter out by finding a new abstraction which captures all possible interactions with it. Once the counter has been abstracted out - each state of the counter produces two such counters, so clear implication logic applies (the doubling goes on indefinitely) together with interaction rules (the counter does not intervene in our other abstractions) - we can treat our abstractions as separate and conclude that each produces two machines identical to itself. A small sketch of this counter abstraction follows.
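A minimal sketch of abstracting the counter out (the names are illustrative): each node spawns two children that differ from it only by a depth counter; once abstract() erases the counter, every node produces two copies of itself, and the indefinite doubling follows by pure implication.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Node:
        depth: int                         # the integer counter to abstract away

    def spawn(node):
        return [Node(node.depth + 1), Node(node.depth + 1)]

    def abstract(node):
        return "NODE"                      # counter erased: all nodes identical

    frontier = [Node(0)]
    for _ in range(3):
        # each abstracted parent produces two abstracted copies of itself
        assert all(abstract(c) == abstract(n)
                   for n in frontier for c in spawn(n))
        frontier = [c for n in frontier for c in spawn(n)]
    print(len(frontier))                   # 8: the array doubles each generation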
So we can show that certain state machines will produce a world which takes some of their abstractions together: a pointer to a common ancestor, and thus to some specific child of that common ancestor; or some state machine they both interact with, which might be flagged as interacting with an indefinite number of state machines of class X - its interaction rules will be calculated out.

Global analysis

To connect all the children we need a global analysis: knowing all relations and state rules, we can logically reduce the scenarios so that there are fewer possible states and rules for all the abstract machines. Looking at the interaction rules we obtained by analyzing each machine separately, we make all global reductions and apply the results back to each separate state machine. After that, we can once again analyze each machine separately. We can grow in both directions, reaching more and more final conclusions.

What about Turing's unsolvability proof for the halting problem?

In the halting problem we can imagine two ways for the program to exit - it detects the cycle, or it does not. The problem with unsolvability is that when we check our program, it mirrors exactly what the detector does; to resolve its cycle, the detector would actually have to break it, which is not possible. Therefore we need a mathematical formulation of a certain kind of potential to detect - the potential of such a cycle. Even if we are able to detect it, it remains only a potential: the program itself could be perfectly correct, if it is meant to do something a bit insane. In any case, we have a large group of programs to which one or another uncertainty problem applies, as they result from Turing's proof.

Now, consider the following conditions (a pattern-matching sketch appears at the end of this section):

1. We have an abstraction which produces a child of its own kind that does not share memory with it.

2. The logic of the parent depends on the results given by the child.

3. Those results depend on results given by the children of that child, so that it forms a connected graph - each child produces a result depending on a result of the same kind from its own child.

4. The activity of the children does not depend on some other, potentially halting process which could stop all those children, unless that process itself depends on the states of those children.

Whether these conditions hold is perfectly provable for a general algorithm. Point four needs some discussion. In the halting problem specifically, every parent is allowed to halt its children when some specific condition is met. We can and will reduce this to the fact that if a child sends some signal to a connected machine, it will at some point in the future get a specific signal back; the halting signal (destroy) is one of the signals which might be propagated. With that, I think, we have matched the pattern of the Halting Problem. It could be that the chain finishes its work by itself, but that kind of interdependence should be visible in the problematic code. We cannot trust any spatial differentiation (a child could in fact share memory space with its parent), but we can trust differentiation in time, since the child gives its result after it is called.

This is, by the way, perfectly compatible with my disproof of Turing's proof (which, anyway, I have started to doubt). It means that only the topmost parent can detect the problem - there is an infinite number of hypothetical children who will simply never reach that point.
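A hedged sketch of matching conditions 1-4 against a description of a spawn site. The record fields are hypothetical; a real analyzer would have to derive them from the abstractions and slots described below, and each field stands for a proof, not a boolean we can simply read off.

    from dataclasses import dataclass

    @dataclass
    class ChildSpawn:
        identical_to_parent: bool     # condition 1: a child of its own kind
        shares_memory: bool           # condition 1: do memory spaces overlap?
        parent_waits_on_result: bool  # conditions 2-3: results chain downward
        external_halter: bool         # condition 4: an independent stopper exists

    def matches_halting_pattern(s: ChildSpawn) -> bool:
        """The infinite-descent pattern: the chain of identical children
        can never deliver the result the parent is waiting for."""
        return (s.identical_to_parent
                and not s.shares_memory
                and s.parent_waits_on_result
                and not s.external_halter)

    print(matches_halting_pattern(ChildSpawn(True, False, True, False)))  # True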
It does not matter that those children would give some result - basically the same result their parent gives; the first one wins the battle here, and the parent is always first to detect the interdependency.

Now - isn't it so that our algorithm will throw out many programs which actually work perfectly? That problem is rather complex. One thing seems sure: children meeting these conditions are not normal in a common program. I think so intuitively, but a proof is needed.

How to represent those abstractions and their relations?

To get a good scheme of abstractions and relations, we use an array of substates: A1, A2, A3, ... Each of them contains the state rules which the global analyzer has found in relation to the parents. Examples: if we have one abstract state machine with no interaction with others, this array contains one element. If we have one abstract state machine with limited interactions - some possible signal (the "possible" is resolved by global analysis) can change it from state A to state B, but not the other way - then there is a slot for this signal. Such a slot holds a null state unless the machine is in state A, in which case it holds the substate showing that A could be transformed into B.

Slots are named, and their importance is that there can also be empty slots not connected to the outside world, holding states for creating and maintaining children. This slot system is a good representation: only through slots can things be given to others. A state machine can give its children a specific slot or slots, or other machines connected to a slot. The slot system lets us draw implications in a formal manner. We can also specify the types of the machines behind the slots, and when some machine is proven topologically identical to a child connected to some slot, we can show how the children of one machine are connected with the children of the other. Whenever we match topological identicalness - some machine produces another, or is connected to another, which is topologically identical - the slot system gives us pointers that take all of this into account.

Handling time

If we have composed states like AB, ABC or E, we can have composed state groups of those too. As signals come and go in time, our state machine must respond in a timely way - for example, if it can send a signal on every third time moment, its state groups must somehow reflect this timing. Combined states are a good way to do that: the internal logic can be folded into combined states so that for two time moments the internal states are formulated as state groups of two states following each other, while on every third moment the same internal states get another state name.

For example, say there are internal states A, B and C with rules C=>A, A=>B and B=>C, and that on every third time moment the outside world can send a signal transforming B into A. We can then use nine states, where D, E, F are A, B, C on the second time moment and G, H, I are A, B, C on the third:

A, B, C -> C=>D, A=>E, B=>F
D, E, F -> F=>G, D=>H, E=>I
G, H, I -> I=>A, G=>B, H=>C or H=>A

Or, alternatively, with two-tick combined states:

AE, BF, CD -> AE=>I, BF=>G, CD=>H
G, H, I -> I=>AE, G=>BF, H=>CD or H=>AE

This group of state changes is identical to the rules described first, but it takes changes in time into account. H=>C or H=>A is left open because it depends on a signal coming into a specific slot from the outside world - and we know the signal can come only on every third time moment.
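This timed expansion can be generated mechanically. A small sketch (the names are mine): each timed state is a (state, phase) pair, and the signalled alternative B=>A is available only on the last phase of the period. With phase 0 named A, B, C, phase 1 named D, E, F and phase 2 named G, H, I, the output matches the nine-state rules above.

    rules  = {'A': 'B', 'B': 'C', 'C': 'A'}     # C=>A, A=>B, B=>C
    PERIOD = 3                                  # signal possible every third tick

    def timed_rules(rules, signal_edges, period):
        """Rules over (state, phase) pairs; on the last phase the signalled
        alternative is listed as well."""
        out = {}
        for s, t in rules.items():
            for ph in range(period):
                succ = [(t, (ph + 1) % period)]
                if ph == period - 1 and s in signal_edges:
                    succ.append((signal_edges[s], 0))
                out[(s, ph)] = succ
        return out

    for k, v in sorted(timed_rules(rules, {'B': 'A'}, PERIOD).items()):
        print(k, '=>', v)
    # ('B', 2) => [('C', 0), ('A', 0)]   -- the "H=>C or H=>A" case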
We transform our rules to reflect this fact so that we can handle the whole. This becomes extremely important, as every child abstraction can shift some timing - for example, combine things with A (so that A becomes A or AA, B becomes B or BA, C becomes C or CA) or do other resizings in time while other things are left intact; it can also combine rules in different ways, including with things behind other slots. This is the nature of how we form abstractions: we take new levels of abstraction and combine them. For example, an encrypted virtual machine will, after abstraction, become identical to the machine it is part of; totally unrelated things might turn out to be topologically identical. We must be able to handle the whole composition of different cases - primarily, find the rules for how one abstraction becomes another, so that we can close the cycle and say whether something forms a cycle such that what we have will produce a new variation of itself indefinitely.

Conclusion

The pieces we start from can be chosen arbitrarily, but they must be some kind of state machines able to allocate more memory than they currently have. The 1D cellular automaton I chose is a good way to achieve that - the whole world is a state system, but also a string which can be viewed linearly. I have worked out a method to analyze the state system and find all possible abstractions; this text adds a few more, as it covers the logic of the Halting Problem and of time.
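For completeness, a minimal sketch of the kind of world the conclusion refers to: a 1D cellular automaton whose tape is padded on both sides each step, so the state system can allocate more memory than it currently has. Rule 110 is used here only as an arbitrary, familiar example rule.

    RULE = 110                                   # an arbitrary example rule

    def step(cells):
        cells = [0, 0] + cells + [0, 0]          # let the world grow at the edges
        return [(RULE >> (cells[i-1]*4 + cells[i]*2 + cells[i+1])) & 1
                for i in range(1, len(cells) - 1)]

    world = [1]
    for _ in range(3):
        world = step(world)
    print(world)                                 # the pattern widens as it runs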