Polymorphism sucks [Was: Paradigms which way to go?] [OOP]

From: Thomas Gagne on 15 Jun 2005 13:02

Of course, it just occurred to me that some type systems can't predict
what the intersection of two shapes might be. When the intersection
between two shapes is another defined shape it would be a good thing to
return that shape rather than an "intersection". But that would depend
on the design of the hierarchy and interfaces so the object returned
from an intersection can be whatever it needs to be.

+----+
|    |
|  +-+---+
|  | |   |
+--+-+   |
   |     |
   +-----+

The intersection above is another rectangle so why return an
intersection when I can return a rectangle?
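
A minimal sketch of that idea, assuming axis-aligned rectangles only. The Rectangle class and its field names are illustrative and not from any particular library; a real hierarchy would also need an answer for shapes whose overlap is not itself a named shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rectangle:
    # Hypothetical axis-aligned rectangle, described by its edges.
    left: float
    bottom: float
    right: float
    top: float

    def intersect(self, other: "Rectangle") -> Optional["Rectangle"]:
        """Return the overlap as a Rectangle, or None if there is no overlap."""
        left = max(self.left, other.left)
        bottom = max(self.bottom, other.bottom)
        right = min(self.right, other.right)
        top = min(self.top, other.top)
        if left >= right or bottom >= top:
            return None  # the rectangles only touch or are disjoint
        return Rectangle(left, bottom, right, top)


a = Rectangle(0, 2, 5, 6)
b = Rectangle(3, 0, 9, 4)
overlap = a.intersect(b)  # a Rectangle, not a generic "intersection" object
```

The caller gets back an ordinary Rectangle and can treat it like any other.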

From: Robert C. Martin on 15 Jun 2005 13:09

On Tue, 14 Jun 2005 19:01:53 +0100, Miguel Oliveira e Silva
<mos(a)det.ua.pt> wrote:

>Although both approaches (OO and functional/procedural) are in a
>sense dual (as you correctly mentioned), the duality is not balanced
>at all. An ADT need not know all of its methods' possible
>implementations (it may not even know one!). On the other hand,
>the "switch"-like method approach needs to know most (if not all)
>of the possible data types.

Consider a function named "rotate()" that has a switch on shapes.
What does this function do for circles? Nothing.

There are times when adding a new data structure to a switch
architecture has no impact. There are times when adding a data
structure to an OO architecture has no impact. I think you'd be hard
pressed to prove that the duality is not balanced.
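
To make the rotate() case concrete, here is a rough sketch under my own assumptions: rotation is about the shape's own centre, and Square and Circle are made-up stand-ins for whatever shapes the system has. The switch falls through for anything without an orientation, so adding Circle touches nothing.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Square:
    side: float
    angle: float = 0.0  # orientation in degrees


@dataclass(frozen=True)
class Circle:
    radius: float


def rotate(shape, degrees):
    """Switch-style rotate about the shape's own centre."""
    if isinstance(shape, Square):
        return replace(shape, angle=(shape.angle + degrees) % 360)
    # Shapes with no orientation (a Circle, for instance) come back unchanged,
    # so introducing such a shape needs no new case here.
    return shape


turned_square = rotate(Square(2.0), 90)
turned_circle = rotate(Circle(1.0), 90)
```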

-----
Robert C. Martin (Uncle Bob) | email: unclebob(a)objectmentor.com
Object Mentor Inc. | blog: www.butunclebob.com
The Agile Transition Experts | web: www.objectmentor.com
800-338-6716

"The aim of science is not to open the door to infinite wisdom,
but to set a limit to infinite error."
-- Bertolt Brecht, Life of Galileo

From: Robert C. Martin on 15 Jun 2005 14:16

On 14 Jun 2005 13:21:50 -0700, "topmind" <topmind(a)technologist.com>
wrote:

>I agree there is a trade-off, but my observation is that change favors
>case statements most of the time.

That's where we disagree. I think change favors neither approach, and
that any approach that favors one technique over the other is
necessarily imbalanced.

From: topmind on 15 Jun 2005 14:21

> >> >> And bound to the RDBMS.
> >> >
> >> >You OO'ers always make that sound like a bad thing. Youses are
> >> >DB-phobics. I don't want to be bound to OO, can we wrap that away too?
> >> >(I agree that the DB tools are sometimes lacking in implementation and
> >> >portability).
> >> >
> >> >> If we had done that, our tests would be too
> >> >> slow, and we would have to ship the system to our users with some kind
> >> >> of RDBMS attached.
> >> >
> >> >My RDB-bound wiki was not "slow".
> >>
> >> You misunderstand. I run over 1,000 unit tests, and over 100
> >> acceptance tests on the system. These tests combined take about 90
> >> seconds to run (on a bad day). The reason for this speed is that I
> >> use the in-memory version of the page object.
> >
> >I am not sure what your point is.
>
> My point was that I was not suggesting that your RDB-bound wiki was
> slow. I *was* suggesting that my tests would run slower if they used
> a database.

Outside of performance testing, since when does testing have to be
lightning fast?

>
> >Do you have tests that show
> >a RDBMS implementation would not scale as well?
>
> No, it didn't seem necessary. I think I could finish running all the
> unit tests before the RDB had finished building a connection. (slight
> exaggeration).

OO fans are always complaining about how slow RDBMS's allegedly are.
Even in the 386 days they didn't seem slow to me if you know how to use
them right (at least not slower than the competitors). In fact they
make some things faster because they tend to divorce usage from
representation such that new usages for the same information are not
bound by designs that only favor past features. Navigational structures
(on which OO is generally based) have always had this flaw. This is
because relational schemas are driven mostly by factoring issues
(once-and-only-once) and not on usage-of-the-moment.

>
> >> >
> >> >Anyhow, one can do something like this:
> >> >
> >> >function getWikiArticle(articleID) {
> >> >  if (sys::driver==RDBMS) {.....}
> >> >  elseif (sys::driver==Files) {.....}
> >> >  elseif (sys::driver==RAM) {.....}
> >> >  else {error();}
> >> >} // end-function
> >>
> >> True, one could do that. However, that makes a generic function
> >> (getWikiArticle) depend upon three different implementations. That
> >> kind of coupling is unfortunate. I'd rather have the getWikiArticle
> >> function not know about RAM, RDBMS, and FILES, and that's what
> >> polymorphism gives me.
> >
> >If we had hundreds of different "drivers" I could see the advantage of
> >polymorphism. However, that is not likely in this case. Companies don't
> >want to pay for multiple implementations of the same thing in most
> >cases and RDBMS are a safe bet for most apps in my domain. It would be
> >a poor investment to target the 1% who may want to use flat files
> >because they hate Dr. Codd.
>
> Nobody is talking about hating RDBs. In any case the cost of the
> polymorphism was negligible,

But not free. Example:

Without wrappers:

sql = "select * from products where color='pink' and price < 30 and weightLbs > 10";
rs = executeQuery(sql, stdConn);
while (getNext(rs)) {processRecord(rs);}

With wrappers:

rs = GetAllPinkThingsCostingLessThanThirtyAndWeighMoreThanTen();
while (getNext(rs)) {processRecord(rs);}
....
function GetAllPinkThingsCostingLessThanThirtyAndWeighMoreThanTen() {
  sql = "select * from products where color='pink' and price < 30 and weightLbs > 10";
  rs = executeQuery(sql, stdConn);
  return rs;
}

If you use such a query more than once, it may make sense to wrap it in
methods or functions. However, in my experience only about 30% of all
queries are sharable if parameterized. Most are specific to a given
task.

Thus, putting polymorphic wrappers around such is not free. It only has
an immediate benefit for duplicate usage.
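
For the minority of queries that are sharable once parameterized, the wrapper can look roughly like this. It is a sketch using Python's standard sqlite3 module with an in-memory database; the function name find_products, the name column, and the sample rows are invented for illustration.

```python
import sqlite3


def find_products(conn, color, max_price, min_weight_lbs):
    """One parameterized query serving every color/price/weight combination."""
    sql = ("select name from products "
           "where color = ? and price < ? and weightLbs > ? "
           "order by name")
    return [row[0] for row in conn.execute(sql, (color, max_price, min_weight_lbs))]


# Throwaway in-memory table so the sketch can be run as-is.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table products (name text, color text, price real, weightLbs real)")
conn.executemany("insert into products values (?, ?, ?, ?)", [
    ("flamingo", "pink", 25.0, 12.0),
    ("eraser", "pink", 1.0, 0.1),
    ("anvil", "black", 80.0, 100.0),
])

pink_things = find_products(conn, "pink", 30, 10)
```

The one-off query stays inline; only a query that is actually reused earns a named function.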

> and it quickly enabled our user to add
> the RDB plug-in when he needed it. Without that polymorphism, his
> task would have been quite a bit more difficult.

The trick is to start out with a RDBMS so that you don't have to keep
upgrading your attribute management system until you end up with a
database-like-thing anyhow via the long and winding code-centric way.

I often catch OO anecdotes showing off how great OO is at reinventing
the database. This is moot if you start out using one instead of
reinventing one.

I only know of one case where I was told we had to go back to a
file-based implementation from an existing RDBMS. Thus, paying the
indirection tax for every app does not add up.

>
> >And, I am not fully clear on what you mean by "depend upon".
>
> Given two binary modules A and B. If A mentions B, or a unique name
> that lives within B, then A depends on B. This means A cannot be
> deployed without B. It also means that if B changes, there is a
> strong likelihood that A will need to be recompiled and redeployed.
>
> Our user who produced the RDB plug-in for FitNesse was able to create
> a new binary module that FitNesse did not depend upon. He then
> modified a config file that told FitNesse the name of that binary
> module. FitNesse used that binary module instead of its internal
> modules for page persistence.

Polymorphism tends to favor "sub-type" additions while procedural will
favor functional additions. You already agreed with this before IIRC.
We are only hearing about sub-type grained scenarios because success is
everyone's child while failure is an orphan. It is possible to favor
both with a heck of a lot of indirection (in any paradigm), but
indirection is *not* free, as already described.
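
For readers trying to picture the plug-in arrangement quoted above, a rough sketch follows. None of it is FitNesse's actual code: PageStore, InMemoryPageStore, the STORES registry and the config dictionary are all hypothetical names, and a dictionary stands in for the config file.

```python
from abc import ABC, abstractmethod


class PageStore(ABC):
    """Hypothetical persistence interface the generic code depends on."""

    @abstractmethod
    def load(self, article_id: str) -> str: ...

    @abstractmethod
    def save(self, article_id: str, text: str) -> None: ...


class InMemoryPageStore(PageStore):
    def __init__(self):
        self._pages = {}

    def load(self, article_id):
        return self._pages[article_id]

    def save(self, article_id, text):
        self._pages[article_id] = text


# Registry of available drivers; a plug-in would add its own entry here.
STORES = {"RAM": InMemoryPageStore}


def get_wiki_article(store: PageStore, article_id: str) -> str:
    # Knows nothing about RAM, files or an RDBMS, only the interface.
    return store.load(article_id)


config = {"driver": "RAM"}  # stand-in for a setting read from a config file
store = STORES[config["driver"]]()
store.save("FrontPage", "Welcome")
article = get_wiki_article(store, "FrontPage")
```

The indirection being argued about is visible here: one interface and one registry lookup, paid whether or not a second driver ever appears.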

>
> Your point above about polymorphism being useful if there were
> hundreds of different drivers comes into play here. We don't know how
> many drivers there are for FitNesse. There *may* be hundreds of them.
> The use of polymorphism has enabled our users to create them, without
> forcing them to modify the source code of FitNesse to do so.

Having hundreds is probably a one-in-ten-thousand chance. And,
indirection is not free. Why bloat up existing code by 15% for a
one-in-a-thousand chance? It is expensive meteorite insurance.

And in the few cases I have encountered that do have a device-driver-like
pattern, it is rarely a clean tree. For example, it might turn out that
subtype C and D share one implementation, but sub-type F and G share
another, but R uses neither. We cannot use inheritance there because
there are two shared items instead of one. We can split the tree into
more levels, but then yet more features won't fit the existing level.
What is shared is at a smaller or different granularity than the
sharing pattern for other items. The tree for feature X may not
resemble the tree for feature Y.

It has come clear to me over the years that set theory better fits the
actual pattern of differences between things. The more complex
something grows, the less tree-shaped it will be. But OO does not give us
anything helpful with sets. It can fake them, but does not add to the
programmer's ability to manage the sets.
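
A small sketch of what I mean by sets, using the sub-type letters from the example above. The groupings for feature Y and the implementation names are invented; the point is only that each feature carries its own grouping and no single tree has to fit both.

```python
# Which sub-types share which implementation, recorded per feature.
# The letters come from the example above; feature Y's groups are made up.
FEATURE_X_GROUPS = {
    "shared_1": {"C", "D"},
    "shared_2": {"F", "G"},
}
FEATURE_Y_GROUPS = {
    "shared_1": {"C", "F", "R"},
    "shared_2": {"D", "G"},
}


def implementation_for(groups, subtype, default="own"):
    """Return the name of the implementation a sub-type uses for one feature."""
    for impl_name, members in groups.items():
        if subtype in members:
            return impl_name
    return default  # belongs to no shared group, so it has its own code


x_for_r = implementation_for(FEATURE_X_GROUPS, "R")
y_for_r = implementation_for(FEATURE_Y_GROUPS, "R")
# Ordinary set operations answer questions a class tree cannot, for example
# which sub-types share the first implementation for both features:
share_both = FEATURE_X_GROUPS["shared_1"] & FEATURE_Y_GROUPS["shared_1"]
```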

You guys just want to see trees, and will shoehorn things into trees or
near-trees regardless of the mess you create. Then again, maybe your
eyes know how to navigate class spaghetti. Just don't subject everyone
else to class spaghetti.

>
> Consider, as another example, an ATM machine. The basic unit can do
> standard things like deposit, withdrawal, and transfer. However, the
> manufacturer would like to sell new features to their users, such as
> pay gas bill, pay electric bill, pay credit card, etc. If the
> software developers use polymorphism to dispatch to the features, then
> the manufacturer can sell new features simply by selling new DLLs or
> jar files, without having to alter the source code of the existing
> system, nor change the binary code of the ATM machines already in use.

I have done similar stuff with control tables. For a simplified
example, consider a menu of ATM options. If a new option is added to
the menu, the control table can contain the function and/or module name
of the new operation being added. One only has to ship the new table
and add new modules. The menu driver stays the same because it just
executes the menu table items.
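
Roughly like this, as a sketch. The table layout, the operation names and the in-process OPERATIONS dictionary are assumptions for illustration; in practice the table would live in a database or file and the operations in separately shipped modules.

```python
def deposit():
    return "deposit done"


def withdraw():
    return "withdrawal done"


# Operations known to the system, looked up by name.
OPERATIONS = {"deposit": deposit, "withdraw": withdraw}

# The control table: (menu key, label shown to the user, operation name).
MENU_TABLE = [
    ("1", "Deposit", "deposit"),
    ("2", "Withdraw", "withdraw"),
]


def run_menu_choice(menu_table, operations, choice):
    """Generic menu driver: it only executes whatever the table names."""
    for key, _label, op_name in menu_table:
        if key == choice:
            return operations[op_name]()
    raise KeyError(choice)


# Shipping a new feature: one new operation and one new table row.
def pay_gas_bill():
    return "gas bill paid"


OPERATIONS["pay_gas_bill"] = pay_gas_bill
MENU_TABLE.append(("3", "Pay gas bill", "pay_gas_bill"))

result = run_menu_choice(MENU_TABLE, OPERATIONS, "3")
```

run_menu_choice itself is never edited when the new row arrives.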

Of course a lot of this is language-specific. The file-centric
interpreters and compilers are sometimes limited because they are bound
to hierarchical file systems and the files are at the module level
instead of the subroutine level. You can argue that *existing* tools
favor OO. In other words, OO perhaps works better with C-like legacy
languages and file systems, but that is not the fault of the paradigms
themselves but of limited implementations. But I have worked with
compilers that have the option of function-level additions, so they do
(or did) exist. Thus, it is not an untested concept. Dynamic languages
usually have the ability to execute code from files.

Take a look at the middle example ("relational weenie") at the c2 wiki:

http://www.c2.com/cgi/wiki?DoubleDispatchExample

Looking at the world through C-colored glasses, I can see your point
though. I like to believe that my superior intellect allows me to see
beyond such (so what if I am delusional rather than superior, let me
have my fake world please :-)

>
> >> >But I would really like to see poly in business-modeling issues, not
> >> >storage and EXE distribution issues.
> >>
> >> Then look at the way FitNesse handles everything else. Or look at the
> >> payroll example in the PPP book. (The library likely has a copy. If
> >> not, a local Barnes and Noble will have one.)
> >
> >What are the hierarchies used? Please don't tell me it uses "account
> >types". I addressed that in a sister message.
>
> The word "Hierarchy" is a bit misleading. Although there is a lot of
> polymorphism used in the example, there are very few class hierarchies
> that have more than two levels. Typically the top level is an
> interface, and the next level down completely implements that
> interface. There are some exceptions to this rule, especially in some
> of the transaction types.
>
> Basically the structure looks like this:
>
> |Payroll|----->|Employee|
>              *     |
>                    +--------->|PayClass|
>                    |
>                    +--------->|PaySchedule|
>                    |
>                    +--------->|PayDelivery|
>
> Each of the classes on the right is an interface, for which there are
> several derivatives (shown below).
>
> The algorithm in the Payroll class is (partially):
>
> void Payroll::pay() {
>   foreach Employee e {
>     if (e.isPayDay(today)) {
>       Money pay = e.calculatePay();
>       e.deliverPay(pay);
>     }
>   }
> }
>
> The Employee class looks something like this:
>
> class Employee {
>   private PayClass payClass;
>   private PaySchedule paySchedule;
>   private PayDelivery payDelivery;
>
>   public bool isPayDay(Date date) {
>     return paySchedule.isPayDay(date);
>   }
>
>   public Money calculatePay() {
>     return payClass.calculatePay();
>   }
>
>   public void deliverPay(Money pay) {
>     payDelivery.deliverPay(pay);
>   }
> }
>
> There are three derivatives of PayClass:
>
>                        |PayClass|
>                             A
>                             |
>          +------------------+---------------------+
>          |                  |                     |
>   |HourlyPayClass|  |SalariedPayClass|  |CommissionedPayClass|
>          |                                        |
>          |*                                       |*
>          V                                        V
>     |TimeCard|                             |SalesReceipt|
>
> The 'calculatePay' method of PayClass is polymorphic. It is
> implemented in HourlyPayClass to sum up the contained time cards and
> calculate overtime.
>
> It is implemented in SalariedPayClass to return the monthly salary.
>
> It is implemented in CommissionedPayClass to sum up the sales
> receipts, apply a commission rate, and add a base salary.
>
> There are three derivatives of PaySchedule:
> WeeklySchedule, MonthlySchedule, BiWeeklySchedule.
>
> The 'isPayDay' method of PaySchedule is polymorphic. It is
> implemented in WeeklySchedule to return true if the argument is a
> Friday. Monthly schedule returns true if the argument is the last
> business day of the month. BiWeekly schedule returns true only for
> every other Friday.
>
> There are three derivatives of PayDelivery:
> MailDelivery, DirectDepositDelivery, and PaymasterDelivery
>
> The 'deliverPay' method of MailDelivery causes the employee's paycheck
> to be mailed to him. You can guess what the others do.
>
> This is just one very small part of the payroll application as
> described in the www.objectmentor.com/PPP book. One nice thing about
> this design is that the code inside the payroll class can be placed in
> a binary module (a dll or jar file) and deployed independently from
> the rest of the system. If we have a customer who has nothing but
> hourly employees, we can simply ship the payroll module and the module
> that contains HourlyPayClass. Moreover, users can add new PayClasses,
> Schedules, and Delivery mechanisms without having to alter the source
> code of the payroll application.

This sounds like a Strategy pattern, and there are table-oriented and
procedural strategy patterns also. But in my experience such
independence is less common for "business objects". When the business
rules change such that entirely new pay categories appear, it generally
also affects so much other stuff that whole new software needs to be
shipped anyhow, not just individual modules. Business rules are
inherently a big messy graph with a lot of existing or surprise
connections. "Separation of concerns" is often a pipe dream because the
creators of business rules don't bother to keep things separate. The
price of tea in China *can* affect Bob's little app in Nebraska.
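
For reference, the quoted structure boils down to something like the following sketch. It is written in Python rather than the quoted pseudocode, shows only one derivative per interface, leaves out overtime, and uses a RecordingDelivery class that I made up so the result can be inspected.

```python
from abc import ABC, abstractmethod
from datetime import date


class PayClass(ABC):
    @abstractmethod
    def calculate_pay(self) -> float: ...


class PaySchedule(ABC):
    @abstractmethod
    def is_pay_day(self, day: date) -> bool: ...


class PayDelivery(ABC):
    @abstractmethod
    def deliver_pay(self, pay: float) -> None: ...


class HourlyPayClass(PayClass):
    def __init__(self, hourly_rate, time_card_hours):
        self.hourly_rate = hourly_rate
        self.time_card_hours = time_card_hours

    def calculate_pay(self):
        # Overtime is left out to keep the sketch short.
        return self.hourly_rate * sum(self.time_card_hours)


class WeeklySchedule(PaySchedule):
    def is_pay_day(self, day):
        return day.weekday() == 4  # Monday is 0, so 4 is Friday


class RecordingDelivery(PayDelivery):
    """Made-up stand-in for mail or direct deposit; it just records payments."""

    def __init__(self):
        self.delivered = []

    def deliver_pay(self, pay):
        self.delivered.append(pay)


class Employee:
    def __init__(self, pay_class, pay_schedule, pay_delivery):
        self.pay_class = pay_class
        self.pay_schedule = pay_schedule
        self.pay_delivery = pay_delivery

    def is_pay_day(self, day):
        return self.pay_schedule.is_pay_day(day)

    def calculate_pay(self):
        return self.pay_class.calculate_pay()

    def deliver_pay(self, pay):
        self.pay_delivery.deliver_pay(pay)


def run_payroll(employees, today):
    # Knows only the Employee interface, none of the derivatives.
    for e in employees:
        if e.is_pay_day(today):
            e.deliver_pay(e.calculate_pay())


delivery = RecordingDelivery()
staff = [Employee(HourlyPayClass(10.0, [8, 8, 8, 8, 8]), WeeklySchedule(), delivery)]
run_payroll(staff, date(2005, 6, 15))  # a Wednesday: nothing is delivered
run_payroll(staff, date(2005, 6, 17))  # a Friday: one payment is delivered
```

Note that every method here has a fixed signature. The sketch says nothing about what happens when a new pay category needs an extra argument or a new step in run_payroll.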

Another way of saying more or less the same thing is that the
interfaces will need changing almost as often as implementation. OO
over-obsesses with implementation changes and under-obsesses with
interface changes. Thus, most examples showing the alleged benefits of
OO tend to focus on issues where only the implementation changes, and
this is misleading.

If your observation of business rules paints a cleaner picture, then it
differs from mine and we will just have to agree to disagree since we
cannot do a mental core dump of our experience for each other and the
readers of this (if any are left).

>
> >> It's good for many things, not for everything. Its use is
> >> situational, not specific to particular domains. There are situations
> >> within every domain in which polymorphism is both useful and not.
> >> That includes drivers, persistence, telecommunications, business
> >> rules, guis, etc, etc. Wherever dependencies need to be managed
> >> (which is just about everywhere) there is a potential for polymorphism
> >> to facilitate that management.
> >
> >Well, perhaps if the situations where it is useful and not are
> >clarified, we might find our areas of agreement. Is it only helpful if
> >one can find a good, safe hierarchical classification or multiple
> >implementations of the same thing (device drivers)?
>
> I described this in a different email. In summary, polymorphism is
> useful when you want to be able to add new data structures (like
> PayClass derivatives, PaySchedule derivatives, or PayDelivery
> derivatives) without affecting existing functions (like Payroll::pay).
>
> Switch statements are useful when you want to add new functions
> without affecting existing data structures.
>

So you are agreeing there is a trade-off and that one favors one kind
of change over the other. The issue then is perhaps which kinds of
changes actually happen more often in the real world, which is
something only observation can answer.

In summary there seem to be two primary issues where we differ:

1. You seem to think that sub-type oriented additions are more common
than operational additions, and I question that (except for certain
domains).

2. When the feature difference complexity grows beyond a simple
hierarchy (or list of sub-types), I prefer a RDBMS or structure-centric
way to manage such variations as sets of features, while you feel that
multiple inheritance, composition, and multiple interfaces are the best
way to manage them. I want most of the "noun model" in tables and you
like it in code.


-T-

From: Robert C. Martin on 15 Jun 2005 14:23

On 14 Jun 2005 21:26:30 -0700, "topmind" <topmind(a)technologist.com>
wrote:

>> Classification hierarchies don't need to be used to model everything in
>> OO. Your argument in this case is very weak.
>
>Yes, but OOP is ugly when you diverge from trees, as described in a
>nearby reply. It loses its "innocence" and becomes messier than the
>alternatives when you go beyond trees in OOP.

OO *can* become messy, just as procedural can become messy. But OO
does not necessarily become messy. Mess is something that developers
are there to avoid.

>OO was sold to the IT world using simple, easy-to-navigate hierarchies
>and their close cousin, sub-typing.

Maybe. Maybe not. OO may have been sold for one reason or another,
but that's hardly relevant now. What's relevant now is what OO can be
used for, and how it helps or hinders. Good developers use OO as and
when it is needed, without fear, and without religious conviction.

>But the reality is far uglier. The
>IT world has been victimized by bait-and-switch advertising.

No. Some organizations were victimized by their own religious
beliefs. Others took a more pragmatic view, and have not been
disappointed.

