From: Lurker on 9 Jul 2005 03:36

"Chris Sonnack" <Chris(a)Sonnack.com> wrote in message
news:pa33c1d9e499utf78qcssrljgg8m2qmofi(a)4ax.com...
> topmind writes:
>
> > Copy-and-paste actually *reduces* coupling because it lets things be
> > independent, for example. Thus, if reducing coupling is always good,
> > then copy-and-paste is always good.

Sorry, but cut-and-paste is probably the worst thing that has ever
happened to programming skills.

Consider: if you have to type the same thing more than twice, you would
probably think more than twice before doing it yet again and find a way
to refactor it. That's more or less a given, considering human nature.

However, if it takes just a few keystrokes to copy that block of code,
that's probably what will happen, and damn the torpedoes (err, I mean
consequences). That's human nature too.
From: topmind on 9 Jul 2005 14:43

Chris Sonnack wrote:
> topmind writes:
>
> >>> I have agreed that *people* naturally relate to trees.
> >>
> >> First, why do you think that is?
> >
> > That is a very good question that I cannot fully answer. Ask the chief
> > engineer(s) of the brain.
>
> [grin] They don't seem available for comment. We'll just have to guess.
>
> > I suspect in part it is because trees can be well-represented visually
> > such that the primary pattern is easy to discern. I say "primary"
> > because trees often visually over-emphasis one or few factors at the
> > expense of others.
>
> You make a lot of statements you don't support with examples. Can you
> provide an example of this happening?

Here is an example borrowed from c2.com. We taxonomize drinks by making
each orthogonal factor a level:

DRINKS

Coke
--Diet
----Caffeinated
----No Caffeine
--Regular
----Caffeinated
----No Caffeine
Lemon-Lime soda
--Diet
----Caffeinated
----No Caffeine
--Regular
----Caffeinated
----No Caffeine
Iced Tea
--Diet
----Caffeinated
----No Caffeine
--Regular
----Caffeinated
----No Caffeine

Any one of the three factors (flavor, dietness, caffeiness) could be
put at the top level, but we cannot put them all there. Thus, we must
make one the "king" arbitrarily and demote one arbitrarily to the
leaves.

Also note what happens if you add a 4th factor to this tree.

>
> I think the reason is fairly simple. But I think we need to draw a
> distinction between representing DATA in a tree and representing some
> taxonomy as a tree. I (and others) have agreed that large taxonomies
> (unless the environment is very stable and clearcut) can be a problem.
> (I'm not sure that tables necessarily fix those problems, however, the
> problems come more from the difficulty of classifying things, IMO.)

I agree that classification is difficult. However, sets are more
forgiving of getting it "wrong" up front.
Trees make one sweat to "get it right" up front because the penalty for
design mistakes is big and hard to undo.

>
> However, when it comes to representing certain *types* of datasets, I
> think that trees are the most appropriate form.
>
> Here we seem to be talking about tree-shaped *data*, each datum being
> a step in a larger task. It was entirely natural for you to break it
> down as an outline (aka tree) because that's exactly the shape of the
> data.

Yes, but that is only on a small scale. The flaws of trees are less of
a problem at a small scale. If you need to refactor code in trees, it is
often fairly easy because it can fit on one page/screen.

>
> People find this useful and natural for the simple reason that it allows
> you to focus on the level of detail desired at the moment. That's the
> whole point of an outline.

Well, not everything divides nicely into "levels". Plus, one can put
levels into sets if one wants, but usually one finds less of a need to.

>
> File systems are hierachical for the same reason. It allows you to
> partition your data (files) into useful categories. It also allows you
> to perform operations on a sub-set easily without needing to access or
> filter the rest.

I already railed against hierarchical file systems somewhere around
here. I would like to see them replaced or augmented with relational
techniques.

>
> Companies and the military are hierarchical for a similar reason: the
> "higher ups" deal with the big picture, the "low downs" deal with the
> details.

I also gave an actual situation around here where what initially looked
like an org taxonomy for budgeting was not. The accountant (user) agreed
that we had to toss trees and move to sets. I also know people who have
two bosses.

>
> > Sets can be used for visual inspection also, but it takes some training
> > and experience to get comfortable with them. Part of this is because
> > one must learn how to "query" sets from different perspectives.
> > There generally is no one right view (at least not on 2D displays) that
> > readily shows the pattern(s). One has to get comfortable with query
> > tools and techniques.
>
> Of course. Sets are raw, unstructured data.

I disagree with those labels, but realize they don't have a clear
definition in this case. Trees are often a "pseudo-structure": they lead
you to believe information is structured, but in reality things may
still be a mess. Many file hierarchies at big companies are like this:
they are a disaster with loooooong tangled directory paths that are
nearly unfixable because of all the paths used by other apps to
reference stuff.

>
> And reading raw data can fail to carry real meaning, hence the usefulness
> of charts and graphs. Very often, when dealing with a tabular dataset,
> I find myself quite surprised to see what the data is really saying once
> I start mining it with graphs and queries.

Right! That is why we have query and viewing tools. In the real world
we need to view things from different angles for different needs.

Now, you may argue that trees can be used without fancy viewers and
query tools, which I generally agree with. But to get *real* power you
need real tools. Trees are for boys, sets are for men!

>
> >> Second, *people* create programs. Therefore, obviously, the tree must
> >> be one of the natural models.
> >
> > Do you mean natural to the human mind or natural to the universe such
> > that all intelligent space aliens will also dig them?
>
> Tree structures are a (universally) natural data type, so I do believe
> all intelligent minds will discover and use them.

But an intelligent mind also knows when they have reached their limit.
Trees are a poor relativism tool: they generally can't change to
emphasize different viewpoints. You get one view and that is it. Reminds
me of the Model-T paint story.

>
> > Being natural does not necessarily mean "good".
> > For example, it is natural for humans to like the taste of sugar,
> > fats, and fiber-free starch. However, too much of these often make us
> > unhealthy.
>
> Too much, yes. But without them, we perish.

Same with trees. Some/small is good, lots is bad.

>
> > Similarly, I agree that trees provide a kind of *instant gratification*.
> > It is the longer-term or up-scaling where they tend to bite back. Thus,
> > sets are for the more educated, enlightened mind in my opinion.
>
> I understand that's your opinion. It's unsubstantiated in mine.
>
>
> >> First, as I've already pointed out, cross-branch links don't make a
> >> tree not a tree. They just make it not certain types of trees.
> >
> > "Type of tree"? Please clarify.
>
> There are binary trees and n-ary trees. There are trees that are
> balanced and unbalanced. There are trees that require all nodes to
> be unique and trees that allow nodes to be repeated. There are
> trees that allow nodes to reference each other (consider links in
> a Unix filesystem). There are trees where the branches are allowed
> to reconnect.

Those are not technically trees. Links in the Unix file system break
trees.

>
> A tree is really any structure with a "root" and child nodes.

Cross links can cause ambiguous roots and children.

>
> > If you print out the code on a big sheet of paper and draw the lines
> > between the subroutine references, you will see that it is *not* a
> > tree.
>
> In most programs of my experience (30 years, dozens of languages), there
> is a fairly strong tree shape to the relationship between functions.

Only at the highest level. When you get into libraries and frameworks,
the distinction gets fuzzy. Plus, event-driven UI frameworks are
generally not like that. One is dealing with a pretty flat view of
event modules.

> You have your root--the code entry point--and a set of nodes that, in
> a well-written program--tend to proceed from high level nodes to low
> level nodes.
>
> A fellow I used to work with wrote a program that, given the source
> files for any C program CREATED the tree of routines and calls. The
> only thing you really have to be aware of is recursion, and that pretty
> much just requires not repeating an existing node (or in some call
> graphs I've seen, repeating it only once as a leaf).

Interactive software usually has a lot of indirect recursion in my
experience.

>
> > Now, *any* graph can indeed be represented as a tree if you are willing
> > to display inter-branch references as duplicate nodes, which is what
> > the multiple subroutine calls essentially are.
>
> Or (as this fellow did) as branches that refer back to existing nodes.

Well, it is not a true tree then. I don't dispute that one can force a
tree out of almost any graph, but it requires duplication in leaves. If
you don't give a flying sh8t about duplication, yes, you can get a
bigass tree out of it. But it is often of limited use.

>
> And consider this: the call path of a program executing (in a single
> thread environment) is 100% tree-shaped (with recursion caveat). That
> is, if you graphed it from start to finish, you'd end up with a huge,
> perfect tree.

No! Repeat subroutine calls bust pure tree-ness.

>
> And, yes, some nodes might be repeated, but that doesn't make it not
> a perfect tree (by perfect I mean no nodes cross-link). The parse tree
> created by a parser repeats nodes--for example, in most parse trees,
> there are probably a LOT of "if/else" nodes.
>

It is all a matter of viewpoint. It *can* be represented as a big tree
with dup nodes. We both agree on that. But a pure tree has no duplicate
nodes. If you have a lot of duplicate nodes, then that is a sign that a
tree is not the best representation. The tree becomes far bigger than
the actual thing it is representing because of the duplication. That is
probably why your friend's C-to-Tree tool did not catch on and sell like
Mustangs.
I have (semi) parsed code into databases before and could have used
that info to also print a giant tree (with a max level quota) using
recursion, but felt it had insufficient use. A better use of such info
is to query for routines, get a list of calls to other routines, and
click on those links as needed. In other words, as-needed navigation of
the "call graph".

>
> >> Second, you don't really algorithm, you mean program. Algorithms
> >> typically ARE very hierarchal ever since structured programming was
> >> invented. Smaller blocks within larger blocks, just about every
> >> algorithm is structured that way.
> >
> > On a small scale, yes. On a bigger scale it is not hierarchical, per
> > above. Many programs are generally hundreds of small trees joined by
> > lots of references that bust tree-ness.
>
> Well-written programs ARE usually tree-ish in my experience.

Not in my experience. But it does depend on the nature of the app.
Batch processes tend to be more tree-ish than good interactive software
in my experience.

> High level
> routines call low level routines. I'm sure you've heard the terms
> "top down" and "bottom up". These refer to the tree-ish-ness of program
> analysis and design.

Yes, and their over-use has been attacked by various gurus, including
Bertrand Meyer. (Meyer suggests abandoning procedural as the solution,
but toning down the tree-ness is an alternative he ignores.)

>
> > Picture drawing hundreds of small trees randomly scattered on a large
> > piece of paper. Now randomly draw thousands of lines between various
> > nodes of the different trees....
>
> But it's NOT random. Low level routines don't call high level routines.
>

Well, often routines don't fit into such a clean classification of low
and high level. Event-driven programming is an example.

>
> >> You can label them a "lie" if you like, but they appear to reflect the
> >> truth of very many situations to me.
> >
> > Well, every abstraction is probably a lie to some extent.
>
> The term "lie" is pejorative and, I think, misleading. A lie is told with
> the *intent* to deceive. Abstractions are told with the intent to get at
> a truth.

No, sometimes they are to simplify the human view of things. DNA
represented as combos of 4 letters is not more truthful than
representing it as molecules, just easier.

> If you really feel tree-shaped abstractions are incorrect, a
> better term might be, well, "incorrect". (-:

But "useful lie" implies that it was chosen for human convenience, not
because it better reflects reality.

>
>
> >> Clearly most of the human race would seem to agree, since you admit
> >> that "most people would naturally produce a tree."
> >
> > Because most people are naive and go with instant gratification and/or
> > the comfortable. Sets are admittedly not immediately comfortable, but
> > they allow one to shift up a level of complexity and change management.
> > Enlightenment comes with a price. That's life.
>
> Hmmm, the "everyone else is dumb but me" line is so often the mark of a
> kook that I recommend you be careful about it. And, for my money, you're
> just plain wrong.

Well, that is just my frank assessment. I call it as I see it. I think
if developers used set-friendly techniques over a longer period of time
and more set-oriented tools were created to aid them, they would
eventually see the power and flexibility of sets. Many developers think
in terms of text files. I was fortunate to be exposed to nimble table
tools early in my career, and these changed my viewpoint of software
design.

Plus, the justification for trees has generally been weak. It mostly
seems to be human comfort, not about reducing duplication or being
change-friendly. In other words, the case for trees has been weak.

If I am wrong about them, it is because I have not been given objective
evidence of their value. If you can show trees being objectively better
for common stuff, please do.
Your duplication of nodes to keep the tree view is objectively poor
info factoring.

>
> Sets are raw, unstructured data.

No, as described above. They are "unstructured" to you because you are
uncomfortable with them.

> You've agreed that making sense of them
> requires tools that add, or highlight, structure.

Also replied above. Yes, they require more powerful tools to use
effectively, but most economic and technical progress does. Trees are
for the Amish. I'm gonna plug into the power and zoom.

I remember an ex-Disney cartoonist saying to me about 7 years ago:
"Screw these newfangled animation computers. I can carry pencils and
paints in a box and use them whenever I want without big expensive
heavy computers and without electricity. I don't have to reboot my
pencil or put anti-virus software on it."

>
> That's exactly what hierarchies do. One could just as easily claim that
> enlightenment involves embracing all forms of tools that allow you to
> mine datasets for their value. One might also argue that the naivete is
> in failing to recognize the value of higher data structures.

Please define "higher order structure".

>
> --
> |_ CJSonnack <Chris(a)Sonnack.com> _____________| How's my programming? |

-T-
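
The duplicate-node argument in the post above can be made concrete with a rough sketch. This is not the C tool Chris mentions, nor topmind's database; it assumes a tiny invented call graph (the routine names `main`, `parse`, `report`, `log` and `fmt` are hypothetical) with no recursion, and compares the number of distinct routines with the number of nodes you get when every call is expanded into its own subtree.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: a call graph stored as "caller -> callees".
// The routine names are invented for illustration only.
class CallGraphDemo {

    // Builds a small acyclic call graph in which two helpers are shared.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> calls = new LinkedHashMap<>();
        calls.put("main", List.of("parse", "report"));
        calls.put("parse", List.of("log", "fmt"));
        calls.put("report", List.of("log", "fmt"));
        calls.put("log", List.of("fmt"));
        calls.put("fmt", List.of());
        return calls;
    }

    // Counts nodes in the tree obtained by expanding every call into its
    // own subtree, so a routine called from N places appears N times.
    // Assumes the graph has no recursion (direct or indirect).
    static int expandedTreeSize(Map<String, List<String>> calls, String root) {
        int size = 1; // the root itself
        for (String callee : calls.getOrDefault(root, List.of())) {
            size += expandedTreeSize(calls, callee);
        }
        return size;
    }

    public static void main(String[] args) {
        Map<String, List<String>> calls = sampleGraph();
        System.out.println("distinct routines:   " + calls.size());
        System.out.println("expanded tree nodes: " + expandedTreeSize(calls, "main"));
    }
}
```

For this made-up graph, five routines expand to nine tree nodes. Whether that growth matters in practice is exactly what the two posters disagree about; the alternative Chris describes (branches that refer back to existing nodes) keeps the node count at five but is no longer a strict tree.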
From: Michael Feathers on 9 Jul 2005 15:06

>>>I suspect in part it is because trees can be well-represented visually
>>>such that the primary pattern is easy to discern. I say "primary"
>>>because trees often visually over-emphasis one or few factors at the
>>>expense of others.
>>
>>You make a lot of statements you don't support with examples. Can you
>>provide an example of this happening?
>
>
> Here is an example borrowed from c2.com. We taxonomize drinks by making
> each orthogonal factor a level:
>
> DRINKS
>
>
> Coke
> --Diet
> ----Caffeinated
> ----No Caffeine
> --Regular
> ----Caffeinated
> ----No Caffeine
> Lemon-Lime soda
> --Diet
> ----Caffeinated
> ----No Caffeine
> --Regular
> ----Caffeinated
> ----No Caffeine
> Iced Tea
> --Diet
> ----Caffeinated
> ----No Caffeine
> --Regular
> ----Caffeinated
> ----No Caffeine
>
> Any one of the three factors (flavor, dietness, caffeiness) could be
> put at the top level, but we cannot put them all there. Thus, we must
> make one the "king" arbitrarily and demoting one arbitrarily to the
> leaves.

Why not? You can have three hierarchies: flavor, caffeination,
sweetening. You make a drink by composing them:

    Drink drink = new Drink();
    drink.addFlavor(new Cola());
    drink.addCaffeination(new TenPercentCaffeination());
    drink.addSweetening(new Diet());

Adding a fourth factor only impacts the Drink class, and you can create
new types of, say, Sweetening and make arbitrary drinks with them:

    drink.addFlavor(new GrapeFruit()); // yeech!
    drink.addSweetening(new Splenda());

The problem is that you only see one tree. You can have as many as you
want and that is what makes OO design resilient to change.

No, if all you see is one tree, no wonder you think OO is for the birds.

Michael Feathers
author, Working Effectively with Legacy Code (Prentice Hall 2005)
www.objectmentor.com
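
The snippet in the post above names classes without defining them. A minimal sketch of what could sit behind it, assuming one small interface per factor, might look like this. The class names and the `add...` methods come from the post; the `label()` and `describe()` methods and the `DrinkDemo` class are assumptions added only to make the example runnable.

```java
import java.util.ArrayList;
import java.util.List;

// Three small, independent hierarchies, one per factor. The interface
// bodies are assumptions; the post only names the classes.
interface Flavor { String label(); }
interface Caffeination { String label(); }
interface Sweetening { String label(); }

class Cola implements Flavor { public String label() { return "cola"; } }
class GrapeFruit implements Flavor { public String label() { return "grapefruit"; } }
class TenPercentCaffeination implements Caffeination {
    public String label() { return "10% caffeine"; }
}
class Diet implements Sweetening { public String label() { return "diet"; } }
class Splenda implements Sweetening { public String label() { return "Splenda"; } }

// A Drink is composed from the factors instead of inheriting from them.
class Drink {
    private final List<String> parts = new ArrayList<>();

    void addFlavor(Flavor flavor) { parts.add(flavor.label()); }
    void addCaffeination(Caffeination caffeination) { parts.add(caffeination.label()); }
    void addSweetening(Sweetening sweetening) { parts.add(sweetening.label()); }

    // Hypothetical helper: lists the parts in the order they were added.
    String describe() { return String.join(", ", parts); }
}

class DrinkDemo {
    // Mirrors the first snippet in the post.
    static Drink dietCola() {
        Drink drink = new Drink();
        drink.addFlavor(new Cola());
        drink.addCaffeination(new TenPercentCaffeination());
        drink.addSweetening(new Diet());
        return drink;
    }

    public static void main(String[] args) {
        System.out.println(dietCola().describe()); // cola, 10% caffeine, diet
    }
}
```

Under these assumptions a new sweetener is one new class, and no existing class changes; a fourth factor would add one interface and one method on `Drink`, which matches the claim that only the `Drink` class is affected.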
From: Robert C. Martin on 9 Jul 2005 18:58

On 9 Jul 2005 11:43:39 -0700, "topmind" <topmind(a)technologist.com>
wrote:

>Here is an example borrowed from c2.com. We taxonomize drinks by making
>each orthogonal factor a level:
>
>DRINKS
>
>
> Coke
> --Diet
> ----Caffeinated
> ----No Caffeine
> --Regular
> ----Caffeinated
> ----No Caffeine
> Lemon-Lime soda
> --Diet
> ----Caffeinated
> ----No Caffeine
> --Regular
> ----Caffeinated
> ----No Caffeine
> Iced Tea
> --Diet
> ----Caffeinated
> ----No Caffeine
> --Regular
> ----Caffeinated
> ----No Caffeine
>
>Any one of the three factors (flavor, dietness, caffeiness) could be
>put at the top level, but we cannot put them all there. Thus, we must
>make one the "king" arbitrarily and demoting one arbitrarily to the
>leaves.
>
>Also note what happens if you add a 4th factor to this tree.

Using inheritance to represent degrees of freedom is not always a good
idea. In this particular case we are probably better off using
composition:

      |Stimulant|<---------|Coke|-------->|Sweetener|
           A                                   A
           |                                   |
     +-----+---------+                +--------+---------+
     |               |                |                  |
|Caffeine|    |NullStimulant|      |Sugar|         |Nutrisweet|

This is the GOF "Bridge" pattern (or strongly related to it).

Trees aren't always the right design solution. OO is not primarily
about making trees. You *can* make trees with OO, but you don't have
to; and you often shouldn't.

-----
Robert C. Martin (Uncle Bob)  | email: unclebob(a)objectmentor.com
Object Mentor Inc.            | blog:  www.butunclebob.com
The Agile Transition Experts  | web:   www.objectmentor.com
800-338-6716

"The aim of science is not to open the door to infinite wisdom,
but to set a limit to infinite error."
    -- Bertolt Brecht, Life of Galileo
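
The diagram in the post above could be rendered in code roughly as follows. The type names are taken from the diagram; the methods (`milligrams`, `calories`, `label`) and the numbers are invented for illustration, and the post itself only says the design is Bridge "or strongly related to it", so treat this as a sketch of the composition rather than a canonical Bridge.

```java
// Type names follow the diagram; method names and figures are assumptions.
interface Stimulant { int milligrams(); }
interface Sweetener { int calories(); }

class Caffeine implements Stimulant { public int milligrams() { return 34; } }
// Stands in for "no stimulant" so Coke never has to check for null.
class NullStimulant implements Stimulant { public int milligrams() { return 0; } }
class Sugar implements Sweetener { public int calories() { return 140; } }
class Nutrisweet implements Sweetener { public int calories() { return 0; } }

// Coke points at one Stimulant and one Sweetener instead of having a
// subclass per combination.
class Coke {
    private final Stimulant stimulant;
    private final Sweetener sweetener;

    Coke(Stimulant stimulant, Sweetener sweetener) {
        this.stimulant = stimulant;
        this.sweetener = sweetener;
    }

    String label() {
        String diet = sweetener.calories() == 0 ? "Diet" : "Regular";
        String kick = stimulant.milligrams() == 0 ? "caffeine-free" : "caffeinated";
        return diet + " Coke, " + kick;
    }
}

class CokeDemo {
    public static void main(String[] args) {
        System.out.println(new Coke(new Caffeine(), new Sugar()).label());
        System.out.println(new Coke(new NullStimulant(), new Nutrisweet()).label());
    }
}
```

The four leaf classes cover all four Diet/Regular by Caffeinated/No Caffeine combinations from the quoted tree, and neither factor has to be ranked above the other.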
From: topmind on 10 Jul 2005 00:53
> The problem is that you only see one tree. You can have as many as you
> want and that is what makes OO design resilient to change.
>
> No, if all you see is one tree, no wonder you think OO is for the birds.
>

The context is trees. I did *not* say OO *had* to use simple
polymorphic trees. Plus, a network of trees turns into a navigational
mess. Dr. Codd already won the war against navigational structures.
Stop beating a dead horse. You lost in 1972 when Charles Bachman was
thankfully creamed. Now take your Confederate flag and go home.

Further, you hardwire your messy taxonomies into code. Databases are a
more flexible tool to manage and study noun classifications. And, they
can share them with multiple languages.

-T-
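
To illustrate the "classifications as data" position taken above, here is a small sketch that keeps the drinks as flat rows and filters them by whichever factor is of interest. It is only an in-memory stand-in: a Java stream plays the role a database query would play, and the `DrinkRow` columns, the sample rows and the `flavorsWhere` helper are all hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// One flat row per drink; no factor is "king". Column names and rows are
// hypothetical stand-ins for a real database table.
final class DrinkRow {
    final String flavor;
    final boolean diet;
    final boolean caffeinated;

    DrinkRow(String flavor, boolean diet, boolean caffeinated) {
        this.flavor = flavor;
        this.diet = diet;
        this.caffeinated = caffeinated;
    }
}

class DrinkTableDemo {

    static List<DrinkRow> sampleRows() {
        return List.of(
            new DrinkRow("Coke", true, true),
            new DrinkRow("Coke", false, true),
            new DrinkRow("Lemon-Lime soda", true, false),
            new DrinkRow("Iced Tea", false, true),
            new DrinkRow("Iced Tea", true, false));
    }

    // Rough equivalent of:
    //   SELECT DISTINCT flavor FROM drinks WHERE <condition>
    static List<String> flavorsWhere(List<DrinkRow> rows, Predicate<DrinkRow> condition) {
        return rows.stream()
                .filter(condition)
                .map(row -> row.flavor)
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<DrinkRow> rows = sampleRows();
        // Lead with the diet and caffeine factors...
        System.out.println(flavorsWhere(rows, r -> r.diet && !r.caffeinated));
        // ...or with caffeine alone, without reshaping the data.
        System.out.println(flavorsWhere(rows, r -> r.caffeinated));
    }
}
```

With these sample rows the two queries print `[Lemon-Lime soda, Iced Tea]` and `[Coke, Iced Tea]`. A fourth factor would be one more field on the row, which is the flexibility being claimed; the sketch takes no side on whether that outweighs the composition approach in the earlier replies.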