From: Dmitry A. Kazakov on
On Fri, 30 Oct 2009 12:19:46 +0100, Pascal J. Bourguignon wrote:

> Nick Keighley <nick_keighley_nospam(a)hotmail.com> writes:
>
>> It would be nice if there were
>> a graphical differences tool - that is, one that would display the
>> picture and highlight the differences. Not undoable, I'm sure.
>
> Indeed. Have a look at:
> http://www.informatimago.com/develop/pic-merge-diff3/index.html
> A trivial patch to this script would highlight the differences.
>
> (Unfortunately, it's only a pixel diff/merge operation; we would
> have to use more sophisticated algorithms to detect block
> transformations in a concise form. But even in this case, this
> gimp script is rather effective, visually.)

Yes, as someone with an image processing, pattern recognition and AI
background, I can tell you that there are no such algorithms. You would
need complex scene analysis to get to the level of a diagram, and even
that does not work, because image segmentation and line detection do
not really work either. Such a trivial task, yet no algorithm can
compete with a human at image segmentation. And even if you got through
that and obtained geometric regions, lines, etc., analysing their
connections and shapes (i.e. the scene) still does not work. Diagrams
generated by GUI tools skip these steps, because the tool places this
information into the intermediate file.

The point is that the task is immense, and though we solved it during
our evolution, my guess is that this leaves no free "computational"
resources to analyse the picture at a more detailed level. There was no
evolutionary need for that either.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Richard Harter on
On Tue, 27 Oct 2009 10:09:25 -0400, Joshua Cranmer
<Pidgeot18(a)verizon.invalid> wrote:

>On 10/26/2009 06:22 PM, Richard Harter wrote:
>> The antithesis is that it isn't turtles all the way up, or at
>> least it shouldn't be turtles all the way up. That is, the kinds
>> of languages and programming technologies that we use in
>> large-scale programming should be quite different from the kinds
>> of languages used for programming in the small.
>
>I think most people would agree that methodologies for small-scale
>programs are different from those for large-scale programs, although
>there might be considerable disagreement over the dividing line. I
>certainly am not going to use a full-blown OOP paradigm for writing
>even something like an automatic code generator; on the other hand, I
>shudder at the thought of writing something so complex as an operating
>system in a functional paradigm.

IIANM people have written operating systems in functional
languages without difficulty.
>
>An interesting project would be to take a large corpus of various mature
>open-source programs of varying size and see if there is a correlation
>between size and the use of certain language features or patterns.

Perhaps this would be an appropriate project for CS PhD
candidates.

>
>> One difference between traditional programs and data flow
>> programs is that traditional programs use LIFO semantics whereas
>> data flow programs use FIFO semantics. That is, a traditional
>> program puts data on a stack and gets data back on the same
>> stack. In data flow programs each element gets data from a queue
>> and puts data to other queues.
>
>One nit to point out: at this stage, many programs don't follow a
>procedural model of programming. OOP is the dominant paradigm, and I
>don't see the sequence of data flow as being LIFO. Yet you continually
>refer to procedural programs as "traditional programs."

To nit a nit - the word "procedural" doesn't appear at all in the
article; I used the term "traditional imperative". More to the
point, data flow in traditional programs is LIFO. By this I don't
mean all data flow; rather I am referring to the data passed to
functions/methods via calling sequences and the data returned from
them via return statements. Likewise the flow of control is LIFO:
when a function/method exits, it returns control to the place it
was called from.
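
To make the contrast concrete, here is a minimal C sketch - every
name in it is invented - of the same computation done call-style
(LIFO) and queue-style (FIFO):

#include <stdio.h>

/* LIFO: the argument goes onto the call stack, the result comes
   back on the same stack, and control returns to the caller. */
static int square(int x) { return x * x; }

/* FIFO: an element consumes values from an input queue and
   produces values on an output queue; fixed arrays stand in
   for the queues here. */
#define QLEN 4
static int in_q[QLEN] = {1, 2, 3, 4};
static int out_q[QLEN];

static void square_element(void)
{
    for (int i = 0; i < QLEN; i++)
        out_q[i] = in_q[i] * in_q[i];   /* read FIFO, write FIFO */
}

int main(void)
{
    printf("call-style: %d\n", square(3));  /* push, pop, return */
    square_element();
    for (int i = 0; i < QLEN; i++)
        printf("queue-style: %d\n", out_q[i]);
    return 0;
}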


>> Another difference is that the connectivity of traditional
>> programs is deeply embedded in the code. To pass data from A to
>> B, A calls B. That is, the caller has to specify where the data
>> goes. In data flow programs the connectivity can be separate
>> from the code. A doesn't pass data directly to B; instead it
>> passes data to the run time system which in turn passes the data
>> to B. The caller does not have to specify where the data goes.
>
>Two words: function pointers. You can also go with virtual or dynamic
>method dispatch, but function pointers is shorter and sounds better.

Well, no. Function pointers et al. are still specifications of the
destination. More than that, A actually sends the data
immediately to B via a call. Consider the following two lines of
code which look very similar:

func(x); /* func is a function pointer */
send(x); /* send is a messaging primitive */

In the first line, "func" is bound (directly or indirectly) to the
actual function that will act on x. In the second, "send" is not
bound to the function that will act on x.
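
A toy sketch of that difference (the "runtime" below is a single
routing slot, and every name is invented): the sender never names
its destination; a configurator wires the channel up separately.

#include <stdio.h>

typedef void (*handler_t)(int);

static handler_t route;                       /* wiring lives in the runtime */

static void wire(handler_t h) { route = h; }  /* done by the configurator */

static void send(int x)                       /* A names no destination */
{
    if (route)
        route(x);                             /* the runtime forwards it */
}

static void consumer_b(int x) { printf("B got %d\n", x); }

int main(void)
{
    wire(consumer_b);   /* connectivity is separate from A's code */
    send(42);           /* A's code never mentions B */
    return 0;
}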

>
>> * Concurrency and parallelism are natural. Code can be
>> distributed between cores and even across networks. Many of the
>> problems associated with threads disappear.
>
>They don't disappear, they're pushed into the runtime system. From
>practical experience, that means they probably bubble up and annoy you
>in the programming language.

What kind of practical experience have you had with data flow
languages?

>
>> * Data flow networks are natural for representing process. This
>> is particularly true for transaction processing.
>
>Maybe I'm just being a stupid idiot here, but I don't see how data flow
>is natural for representing some common processes. For example, the
>HTML5 tokenization process. I suppose you can flow current-state output
>back around and into itself as next-state input, but that's not exactly
>natural.

It sounds as though you're asking how one would implement a state
machine in a dataflow language. It would be straightforward
enough, but there's not much point in doing it unless it gives
you cheap thrills.
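
For what it's worth, here is a sketch of the shape it would take
(a toy tokenizer; printf stands in for putting tokens on an output
queue, and everything here is invented). The state lives inside
the element, so nothing has to be fed back around the network:

#include <stdio.h>

enum state { OUTSIDE, INSIDE_TAG };

static void tokenizer_element(const char *in)
{
    enum state s = OUTSIDE;          /* state is local to the element */

    for (; *in; in++) {
        switch (s) {
        case OUTSIDE:
            if (*in == '<') { s = INSIDE_TAG; printf("TAG-START\n"); }
            else            printf("CHAR %c\n", *in);
            break;
        case INSIDE_TAG:
            if (*in == '>') { s = OUTSIDE; printf("TAG-END\n"); }
            break;
        }
    }
}

int main(void)
{
    tokenizer_element("<p>hi</p>");  /* stand-in for reading a queue */
    return 0;
}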
>
>This also brings up a question of how the language deals with loops in
>the dependency graph. The solutions I see either bring back the problems
>with threading or could create unreliability.
>
>> * Message passing gets rid of the problems associated with shared
>> memory and locks.
>
>Just as credit default swaps and other financial machinations got rid of
>risk :-).

It's the other way around. Threading, shared memory, mutexes,
and locks are the equivalent of default swaps. :-)

The argument really is straightforward. Data flow languages
typically require that shared memory be immutable. If it's not
immutable then it's not shared. This works. There are prices.
Mostly they are performance prices.
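
A sketch of what "not shared" means in practice (invented names):
the sender copies the payload into the message, so sender and
receiver never touch the same mutable memory and no lock is
needed. The copy is exactly the performance price mentioned above.

#include <stdio.h>
#include <string.h>

struct msg { char payload[64]; };

static struct msg make_msg(const char *data)
{
    struct msg m;
    strncpy(m.payload, data, sizeof m.payload - 1);
    m.payload[sizeof m.payload - 1] = '\0';  /* private copy */
    return m;
}

int main(void)
{
    char buf[64] = "hello";
    struct msg m = make_msg(buf);  /* payload copied, not shared */
    buf[0] = 'H';                  /* sender mutates its own copy... */
    printf("%s\n", m.payload);     /* ...receiver still sees "hello" */
    return 0;
}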

>From what I can tell, similar problems arise, but they're in a
>different form (data flow dependency graphs, particularly the fact that
>they're typically not acyclic).

Perhaps you could give an example of what you are thinking of.
It is not quite clear to me why you think this is a problem.

>
>> * Data flow programs are more extensible than traditional
>> programs. Elements can be readily composed into composite
>> elements from the bottom up rather than top down.
>
>Even if you consider good old procedural programming, I would say that
>the exact same thing is easily doable in traditional programs. I can
>take code that does asynchronous socket reads and turn it into a MIME
>decoder by creating a module that calls the asynchronous socket read
>module appropriately. It's also an example of your "bottom-up composition."
>
>> Some significant disadvantages:
>>
>> * Using data flow programming in the large pretty much requires
>> that it be used from the start. That is, converting traditional
>> programs into data flow programs is difficult because the
>> structuring is so different.
>
>I don't see it that way. Ultimately, most libraries are merely "pass X
>in as input, get Y out as output"--it shouldn't be too hard to make that
>an atomic data-flow program block.

The problem doesn't lie in the libraries; it lies in the
superstructure and program organization.

Good comments, thank you.


Richard Harter, cri(a)tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Kafka wasn't an author;
Kafka was a prophet!
From: Nick Keighley on
On 30 Oct, 15:23, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
wrote:
> On Fri, 30 Oct 2009 12:19:46 +0100, Pascal J. Bourguignon wrote:
> > Nick Keighley <nick_keighley_nos...(a)hotmail.com> writes:

> >> It would be nice if there were
> >> a graphical differences tool - that is, one that would display the
> >> picture and highlight the differences. Not undoable, I'm sure.
>
> > Indeed. Have a look at:
> >http://www.informatimago.com/develop/pic-merge-diff3/index.html
> > A trivial patch to this script would highlight the differences.
>
> > (Unfortunately, it's only a pixel diff/merge operation; we would
> > have to use more sophisticated algorithms to detect block
> > transformations in a concise form. But even in this case, this
> > gimp script is rather effective, visually.)
>
> Yes, as someone with an image processing, pattern recognition and AI
> background, I can tell you that there are no such algorithms. You would
> need complex scene analysis to get to the level of a diagram, and even
> that does not work, because image segmentation and line detection do
> not really work either. Such a trivial task, yet no algorithm can
> compete with a human at image segmentation. And even if you got through
> that and obtained geometric regions, lines, etc., analysing their
> connections and shapes (i.e. the scene) still does not work. Diagrams
> generated by GUI tools skip these steps, because the tool places this
> information into the intermediate file.

I had in mind an easier problem than comparing two pixel collections.
I was assuming the diagram was a collection of shapes (typical sort of
thing graphical design tools mess around with). I was hoping such
diagrams could be compared fairly easily, though in another post you
dash this hope! :-(
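
Something like this C sketch is what I had in mind (the shape type
and everything else is invented): sort both shape lists into a
canonical order, then walk them in step and report shapes present
on one side only. A moved shape shows up as a remove plus an add.

#include <stdio.h>
#include <stdlib.h>

struct shape { int kind, x, y, w, h; };

static int cmp(const void *pa, const void *pb)
{
    const struct shape *a = pa, *b = pb;

    if (a->kind != b->kind) return a->kind - b->kind;
    if (a->x != b->x)       return a->x - b->x;
    if (a->y != b->y)       return a->y - b->y;
    if (a->w != b->w)       return a->w - b->w;
    return a->h - b->h;
}

static void diff_shapes(struct shape *a, int na, struct shape *b, int nb)
{
    int i = 0, j = 0;

    /* canonical order, so drawing order no longer matters */
    qsort(a, na, sizeof *a, cmp);
    qsort(b, nb, sizeof *b, cmp);

    while (i < na || j < nb) {
        int c = (i == na) ? 1 : (j == nb) ? -1 : cmp(&a[i], &b[j]);
        if (c < 0)      { printf("- kind %d at (%d,%d)\n", a[i].kind, a[i].x, a[i].y); i++; }
        else if (c > 0) { printf("+ kind %d at (%d,%d)\n", b[j].kind, b[j].x, b[j].y); j++; }
        else            { i++; j++; }
    }
}

int main(void)
{
    struct shape old_d[] = { {0, 1, 1, 5, 5}, {1, 10, 1, 3, 3} };
    struct shape new_d[] = { {0, 1, 1, 5, 5}, {1, 12, 1, 3, 3} };

    diff_shapes(old_d, 2, new_d, 2);
    return 0;
}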

> The point is that the task is immense, and though we solved it during
> our evolution, my guess is that this leaves no free "computational"
> resources to analyse the picture at a more detailed level. There was no
> evolutionary need for that either.


--
Nick Keighley
From: Nick Keighley on
On 30 Oct, 15:21, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
wrote:
> On Fri, 30 Oct 2009 03:32:30 -0700 (PDT), Nick Keighley wrote:
> > On 29 Oct, 18:15, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
> >> On Thu, 29 Oct 2009 05:10:37 -0700 (PDT), Nick Keighley wrote:
> >>> On 27 Oct, 08:55, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>


> >>>> * Unmaintainable code. Look at large data flow programs (e.g. in DiaDem,
> >>>> Simulink, LabView). It is impossible to use reasonable software processes
> >>>> on them. How to compare two diagrams?
>
> >>> convert it to a textual representation then run diff on it. I'm not
> >>> saying it's trivial but I don't think it's intractable either.
>
> >> Are pixel positions and sizes of the blocks relevant to the comparison?
> >> (:-))
>
> > I presume the smiley means you can answer that yourself...
>
> No, equivalence of two directed graphs is not a simple task.

Oh, shame. I was rather hoping it was doable if the two diagrams were
"similar".


> >> The very argument that a text form is somewhat better raises the
> >> suspicion: why not use it from the start?
>
> > where did I say better? They have different strengths and weaknesses.
> > Programmers I know spend a fair amount of time drawing strange
> > diagrams on whiteboards. Personally, I think they should then throw
> > the diagrams away and code it up but a lot of people like diagrams.
> > You can have both.
>
> Yes, physicists draw lots of diagrams too. Nevertheless the language of
> physics is differential equations. The role of diagrams is always
> supplementary; they cannot serve as a language.

Some of the UML people seem to think it can (perhaps UML + something
else, as I'm not sure UML itself has any semantics).

I know I'm posting from ignorance here, but I'd be interested in
learning something.


> > yes, but I fail to see why the graphical form presents any different
> > problems from the textual form.
>
> It is easier to analyse textual form for both humans and computers.
>
> >> Ah, that is maybe because there were only alphanumeric display
> >> devices back then? (:-))
>
> > blinking lights and switches. If you can't read the machine code in
> > binary you aren't a Real programmer. Octal is for quiche eaters.
>
> Yep, that was fun. You could immediately see if the program ran in a
> cycle: the pattern repeated itself. Blue screen? That's for suckers!
> Real men read the program counter of the crash location off the front
> panel LEDs! (:-))

LEDs?! Foo, use the real thing: Nixie tubes.

From: mike on
In article <hcbp8d$7pv$02$1(a)news.t-online.com>, mok-kong.shen(a)t-
online.de says...
> mike wrote:
>
> > I think that most of the problems inherent in any large-scale
> > programming project result from the inherent 'fragility' of all
> > programming languages.
> >
> > If you compare a large computing project with a large engineering
> > project there are clear similarities, but one very significant
> > difference is that almost any undetected error in code has the potential
> > to result, somewhere down the line, in catastrophic outcomes; whereas if
> > a nail is not quite hammered in as far as specified or if a light
> > fitting (or even a window) is put in the wrong place then the building
> > usually remains safe, functional, and fit for purpose.
> >
> > If someone has some ideas about a programming language/paradigm that is
> > fault-resistant, not only in the sense that it reduces the number of
> > actual bugs and/or errors produced, but also in the sense that many bugs
> > have no significant effect on function and behaviour then large-scale
> > projects may be a lot easier to manage.
> >
> > Whether this will ever be a possibility remains, in my opinion, unknown
> > to date.
>
> In engineering works there are "factors of safety" to take account
> of the variability of materials, the unpredictability of actual
> loadings, inaccuracies in construction, etc. In computing,
> analogous things have been done. To control critical (in
> particular, potentially dangerous) processes, one employs double
> or triple hardware to take care of the possibility of hardware
> malfunction. As far as I know, in order to better detect
> programmer errors, one similarly employs different and independent
> teams of programmers to do the same job and then uses test cases
> to compare the results of the resulting programs. However, if I am
> not mistaken, this comparison is only done in the design phase of
> the software, and later only the work of one of the teams is
> selected for practical application. A safer way, I think, would be
> to have these presumably equivalent programs (produced by
> different teams, preferably using different programming languages
> and environments) always run in parallel on multiple hardware
> (possibly of different types) in actual production, so as to
> further reduce the risk of errors, since testing in the design
> phase might not be thorough enough to uncover all the errors that
> may be present. Of course, errors can never be "absolutely"
> eradicated, in accordance with Murphy's Law.
>
Yes - it is true that paralleling the hardware and software with result
comparison might help reduce software errors, but I think it may be a
while before I can load up a [Windows/Linux/Chrome] operating system on
my new [AMD/Intel/Via(?)] chipset...
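
For what it's worth, the voting scheme described in the quoted text
is easy to sketch in C: run three independently written versions of
the same computation and accept a result only when at least two
agree. The three "versions" below are trivial stand-ins, and all
names are invented.

#include <stdio.h>

static int version_a(int x) { return x + x; }   /* team A */
static int version_b(int x) { return 2 * x; }   /* team B */
static int version_c(int x) { return x << 1; }  /* team C */

/* 2-out-of-3 majority vote; returns 1 on agreement, 0 on failure */
static int vote(int a, int b, int c, int *out)
{
    if (a == b || a == c) { *out = a; return 1; }
    if (b == c)           { *out = b; return 1; }
    return 0;
}

int main(void)
{
    int r;

    if (vote(version_a(21), version_b(21), version_c(21), &r))
        printf("agreed result: %d\n", r);
    else
        printf("versions disagree - fail safe\n");
    return 0;
}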