From: Paul Wallich on
RG wrote:
> In article <i3nngm$dk9$1(a)news.eternal-september.org>,
> "Scott L. Burson" <Scott(a)ergy.com> wrote:
>
>> RG wrote:
>>> At the risk of repeating myself, the value of being able
>>> to make changes at run time is implicitly acknowledged in nearly every
>>> other aspect of CL's design. So I still don't understand why adding
>>> that ability to this aspect of the language is resisted so vehemently.
>> "Resisted so vehemently"?
>
> "Resisted as vehemently as it is"?
>
>> Pascal Costanza, 2010-07-30:
>> > I find it hard to imagine that this matters beyond simple examples.
>> > But that may just be a limitation of my imagination...
>>
>> Pascal Costanza, 2010-07-30:
>> > Don't let me stop you. Everybody should do what they think is right.
>>
>> Pascal Costanza, 2010-07-31:
>> > You could use a collection library, while I could continue to use the
>> > low-level substrates...
>>
>> Pascal Costanza, 2010-07-31:
>>> I'm not stopping anybody from creating a good collection framework.
>>> If the result convinces me, I'm happy to use it. What I have seen
>>> so far doesn't convince me, but who am I to predict there can't
>> > be anything better?
>
> Pascal is not the only one expressing an opinion about this. And this
> C.L.L. thread is not the only data I have on this issue.
>
>> I think you're confusing a disagreement about the importance of the
>> undertaking with outright resistance. Also, glancing through the
>> thread, I see arguments made by various people on both sides. Björn
>> Lindberg is the only other person I notice consistently taking the view
>> that this kind of abstraction is not important (maybe I missed some; I
>> didn't reread the whole thread). So it appears that even the position
>> that collection abstractions are (relatively) unimportant is not held by
>> a majority of the thread participants... maybe it's about 50-50.
>
> That's enough to puzzle me. I don't understand the quality metric that
> allows one to like CL on the one hand, with all of its focus on dynamism
> and allowing one to make seamless changes at run-time, and yet not like
> abstract collections.
>
>> It seems that you are frustrated because you can't convince everyone.
>
> I'm not frustrated. I'm puzzled.
>
>> I would further point out that even if you did convince everyone, it
>> would not necessarily accomplish anything. The real work to be done
>> here is the design of the abstraction. The only people who need to be
>> convinced are those who are actually going to do that work.
>
> But people have designed such interfaces. I know of at least two. And
> yet the objections are not directed at the shortcomings of these designs
> but at the idea of abstract collections in general. That's what I don't
> get.

I think, from talking to people over the years and from reading/hearing
lots of conversations like this one, that it's exactly the things some
people like about CL that lead them to dislike the idea of abstract
collections. (And that lead at least some of the same people to resist
the idea of changes to CL in general.)

You've got a language with a huge range of built-in data-arrangement
options, combined with really remarkable tools for rolling your own
specific tools. So when someone comes in and says (in effect), "Stop
using those things, just use the system that I've built for you",
they're striking at one of the core competences of being a CL programmer.

Add to that the fact that every single abstract-collection API in the
universe was/is/will be designed wrong, and you have an ironclad case
for opposing them. (How can I say that so confidently? Because every
design involves decisions about how to implement things and what to
implement, and for almost every decision there's someone who can make a
sensible argument it should have been done differently to address the
specific stuff they want to do. QED.)

Of course, almost every facility in CL was designed wrong as well by
this criterion, but it's the devil we know.

paul
From: George Neuner on
On Sun, 08 Aug 2010 19:44:24 -0700, RG <rNOSPAMon(a)flownet.com> wrote:

>In article <6vju5692s8jli6ukibsurcnn0plofnd02j(a)4ax.com>,
> George Neuner <gneuner2(a)comcast.net> wrote:
>
>> On Sun, 08 Aug 2010 15:23:57 -0700, RG <rNOSPAMon(a)flownet.com> wrote:
>>
>> >The vast majority of the software engineering world would disagree with
>> >you that "building layers of domain specific languages ... is the main
>> >activity of developing good software." The vast majority of software is
>> >built in languages that don't have macros, and where as a result
>> >building domain specific languages is very hard, and as a result is very
>> >rarely done.
>>
>> The vast majority of the software engineering world doesn't understand
>> what programming is about. Any time you construct a set of functions
>> designed to let you address a problem in its own terms, you are
>> creating a domain specific language.
>>
>> This is done all the time in every flavor of programming language.
>> Macros are not required.
>
>I hope I don't have to defend here on C.L.L. the proposition that macros
>are useful, and that a language with macros lets you do useful things
>that a language without macros doesn't let you do, at least not as
>easily.

Absolutely not.

My first point was simply that those software "engineers" who would
disagree that building DSLs is their main activity do so because they
are lacking in formal CS education and really don't understand what it
is that they are doing.

My second point is that the main development activity of building DSLs
takes place in all programming languages, regardless of macros.

George
From: Alessio Stalla on
On 9 Ago, 05:03, "Scott L. Burson" <Sc...(a)ergy.com> wrote:
> Alessio Stalla wrote:
>
> > I believe FSet would be used much more if there were a de facto
> > standard API for sequences that worked reasonably well with both CL
> > sequences and FSet.
>
> Although I support the effort, I'm a little skeptical that API
> incompatibility with CL sequences is the major barrier to the adoption
> of FSet. FSet already goes well out of its way to support as much of
> the existing CL sequence API as possible. Sure, it would be nice if
> users didn't have to shadowing-import my versions of the symbols in that
> API, but I don't see that as so awfully onerous. Perhaps I'm mistaken.

Perhaps it's me who's mistaken. I never actually used FSet, but I
think such a library would be a prime candidate for inclusion in a
"standard library" for CL. And in that case, the easier the
integration with CL is, the better. But as things stand now, I agree
that if one finds the need to use FSet he won't be stopped by the non-
sequence API. What I'm dubious about is how often someone thinks "I
really need FSet to solve this problem!". I suspect that doesn't
happen often and that it would happen more often if FSet was part of
some "batteries included" library for CL and if it integrated well
with sequences.

> > One of my original points was that the sequence
> > API proposed by Christophe Rhodes for SBCL, while great in many
> > aspects, doesn't take into account functional data structures (nor
> > does the CL sequence API, of course).
>
> That is one of the two major philosophical divergences between the CL
> API (and Christophe's) and FSet. And it's a severe one. It's not ever
> going to be possible to just plug an imperative collection in in place
> of a functional one; the code that uses them necessarily has to be
> written differently. The same can be true for the converse, if the
> collection is intentionally being aliased; my opinion, as a matter of
> software design, is that such aliasing is rare anyway and should be
> vanishingly rare, but it does occur.

I agree to a certain point: seamlessly swapping an imperative
collection with a functional one (or vice-versa, perhaps to a lesser
extent) is not going to work. However, having an API which allows to
use sequences functionally, even with terrible performance for some
imperative collections (e.g. vectors) imho would be an improvement.
With all the hype for concurrency that's around, having natively
supported functional collections and parallel map and reduce
operations on them could attract some users ;)

> The other philosophical divergence between the CL approach and the FSet
> approach is that the FSet collections are semantics-heavy: a collection
> knows what kind of thing it is, what operations are applicable to it,
> and what equivalence relation it should use for its elements. By
> contrast, a CL list can be used in many different ways -- as a set, a
> sequence, a bag, an alist, a plist, etc.; the semantics are not supplied
> by the data structure, but rather by the operations that are used on it
> and the arguments (such as :TEST) that are supplied to those operations.
>
> I suppose this second divergence is less difficult to overcome; one
> simply says that if you're calling UNION on two FSet sets, it's an error
> to supply a :TEST argument. Indeed, FSet signals an error in that case.

Why must it be an error? I'd prefer to have it work with an
arbitrary :test, albeit more slowly, presumably - or isn't it
possible?

> But it's something a unified API spec will have to deal with.
> Presumably, other collection implementations, be they imperative or
> functional, will also mostly be semantics-heavy in this sense, so this
> isn't just for benefit of FSet.

I don't quite get the point for semantics-heavy collections in
general, as a design principle. I understand that some collections
have to be semantics-heavy - for example hash-based data structures
inherently depend on a given hashing function and a given equality
relation - but in general, what's the benefit of encoding in the
collection how it's going to be used and prematurely forbidding other
uses?

> I did once start out to study Christophe's proposal, with the intention
> of preparing another that would support FSet, but I'm stretched very
> thin these days and didn't get far. I certainly applaud you or anyone
> else trying to do that.
>
> (While you're at it, better integration of CL hash tables wouldn't be a
> bad thing. Granted, hash tables can't meaningfully be concatenated or
> even unioned(*), but at least ELT and (SETF ELT) could work on them.
> (*) Come to think of it, maybe there is a possible meaning for the union
> of two hash tables, if they're assumed to represent sets, i.e. only the
> keys are significant.)

For hashes, and dictionaries in general, one would need another API.
ELT & friends are specified to work on objects of type SEQUENCE. In
any case, I wouldn't use ELT for accessing dictionaries.

> As for what is the largest barrier to adoption of FSet, I don't really
> know but I would be surprised if it's an ease-of-use problem. It is a
> bit unfamiliar, I know, but I put a lot of effort into the design to
> make it as easy to use as I could, and wrote a tutorial besides.
>
> I think that the largest barrier is simply that people don't see the
> need for it; they're happy with what they have. There are lots of
> different kinds of programs and of programming in this world, and I
> freely admit FSet is not appropriate for all of them. I think it would
> be of some value for most kinds of programs, but perhaps it really
> becomes massively useful only for the kind of work I do: static analysis
> and automated reasoning.

That may be true. However, a significant part of the hype around
Clojure is precisely about its functional collections.

> And the second largest barrier, I suspect, is simply that people expect
> it to perform poorly. A lot of people on this list talk about
> minimizing garbage generation in their code -- an art that I think
> belongs in the same category as writing in assembler: experts
> occasionally need to do it, and it's useful to know how to do it, but
> most people most of the time don't need to worry about it. And yet it
> gets discussed here with some regularity. I'm not sure people really
> realize how good modern generational GCs are.

I don't think so; at least, I don't think like that. I understand that
functional collections are necessarily bound to perform worse than
imperative ones, but I don't think that's going to matter in general.

> That said, it has occurred to me that it wouldn't be so awfully
> difficult, at least in some cases, to integrate imperative collections
> with FSet's functional collections in such a way as to give most of the
> benefits of both. For instance, if you have an imperative set
> implemented as a red-black tree, I could easily write a high-performance
> converter from that form into an FSet wb-set. Only slightly harder
> would be a general high-performance set constructor interface that would
> make it easy for you to write the converter. (It would simply need to
> know the size of the set at the beginning, and then be supplied elements
> in sorted order without duplicates.)
>
> Given such a converter, you can construct the set imperatively, then
> convert it to FSet form for further use. This will benefit the case,
> which I believe to be very common, in which a set is not modified very
> much after its initial construction.
>
> There is also the possibility that FSet itself can provide a set of such
> imperative collection types, which are designed to use a sufficiently
> similar internal structure that the conversion is even faster -- and in
> certain circumstances can be elided entirely. I am looking into this.

afaik, the Clojure folks have implemented something like that - the
possibility of building a collection imperatively in performance-
critical code, and successively use it functionally.

Alessio

From: RG on
In article <9mptyn4ghlo.fsf(a)runa.se>, bjorn(a)runa.se (Björn Lindberg)
wrote:

> RG <rNOSPAMon(a)flownet.com> writes:
>
> > I don't think anyone ever claimed that collection abstractions were
> > going to solve every software engineering problem or bring about world
> > peace. But it still seems to me that the benefits are, at least
> > potentially, substantial even if they do not accrue "as often as one
> > might think" (however often that might be) while the cost is essentially
> > zero.
>
> But the cost is far from zero. In rising order of importance, there are
> at least three costs associated with the generic collection abstraction:
> the cost of developing the abstraction, the cost of increased code
> complexity and the cost of the abstraction itself.
>
> The first cost can be avoided if someone else already developed the
> collection abstraction and is offering it for free; otherwise it is a
> very tangible investment.
>
> The second one has to do with the fact that anything that adds to the
> complexity of your code base or adds dependencies to it incurs a cost
> related to debugging and maintenance.
>
> Finally, any generic collection abstraction will by necessity have to
> conform to some least common denominator between the collections
> involved, thus reducing expressivity and preventing the use of specifics
> of any one collection. For example, the typical access patterns are
> different for vectors and lists, and have different performance
> characteristics. A tree can be traversed in several different orders. An
> alist or a plist can cheaply and non-destructively be extended or have
> entries shadowed by rebinding, something which is not possible to do
> with a hash table. All these are examples of properties which can not be
> represented in a generic manner, and would break the abstraction if they
> were.
>
> Generic collection abstractions, like most programming devices, do incur
> costs which have to be outweighed by benefits. Finding the particular
> trade off is always a matter local to the problem at hand, not a general
> one.

Thanks for that explanation. I still don't agree with you, but at least
now I can see your side of it.

> What I initially took issue with was the idea that one should use
> generic collections everywhere preventively, so that a later decision to
> change a data type would not result in massive refactoring. I think this
> mistaken as general advice, since most data types won't change, yet the
> costs mentioned above will arise. Instead, I outlined a different
> approach which solves the refactoring problem in the cases where it
> appears, yet incurs much smaller costs in general.

But your approach has costs too: the cost of designing, implementing,
and maintaining the domain-specific abstractions, to say nothing of the
risk of serious problems down the road if you have to do a large
refactor on a live system. (Do you use Reddit? Have you noticed how
much down-time they've been having lately? Do you think that they may
be wishing now that they had a design that scaled more easily than
theirs does?)

In any particular instance those costs are almost certainly smaller than
the cost of designing and maintaining a generic abstraction, but the
cost of building and maintaining domain-specific abstractions is an
ongoing cost whereas the cost of developing and maintaining a generic
abstraction is a one-time cost (and at this point in history, arguably a
sunk cost) which then at least potentially pays dividends across a large
number of projects.

(I note in passing that it was in the dim and distant past not uncommon
to hear the same form of argument used to oppose high level languages in
general: developing and maintaining a HLL is expensive. The design of
the HLL must cater to some sort of least common denominator. Etc.)

rg
From: =?iso-8859-1?Q?Bj=F6rn?= Lindberg on
RG <rNOSPAMon(a)flownet.com> writes:

> In article <9mptyn4ghlo.fsf(a)runa.se>, bjorn(a)runa.se (Bj�rn Lindberg)
> wrote:

>> What I initially took issue with was the idea that one should use
>> generic collections everywhere preventively, so that a later decision to
>> change a data type would not result in massive refactoring. I think this
>> mistaken as general advice, since most data types won't change, yet the
>> costs mentioned above will arise. Instead, I outlined a different
>> approach which solves the refactoring problem in the cases where it
>> appears, yet incurs much smaller costs in general.
>
> But your approach has costs too: the cost of designing, implementing,
> and maintaining the domain-specific abstractions,

No matter how generic collection abstractions I had access to, I still
would encapsulate important data structures in domain-specific
abstractions as a matter of software engineering. This cost, if it can
be counted as one, would be incurred in either case.

> to say nothing of the risk of serious problems down the road if you
> have to do a large refactor on a live system.

I have repeatedly explained how to address this just as well as with a
generic framework. Avoiding large refactorings in this sense is about
encapsulating relevant aspects so as not to proliferate a particular
design choice throughout the code. Deciding on what is relevant may be
difficult, but can be done at least as well in the application itself as
in a generic library that knows nothing about your program.

> In any particular instance those costs are almost certainly smaller
> than the cost of designing and maintaining a generic abstraction, but
> the cost of building and maintaining domain-specific abstractions is
> an ongoing cost whereas the cost of developing and maintaining a
> generic abstraction is a one-time cost (and at this point in history,
> arguably a sunk cost) which then at least potentially pays dividends
> across a large number of projects.

In the abstract(!) this is a good idea. The potential problem
specifically with collections is the enforced least common denominator
interface that will make a lot of important aspects of them unavailable
through the generic interface. Especially those aspects that cause you
to prefer one collection over another.

As far as code reuse go, it is certainly interesting to be able to reuse
implementations of algorithms and data structures. Reusing code just to
get to have to conform to some particular interface, which may or may
not be well suited to the application at hand, is less appealing. A
generic collection library that actually implemented collections and
operations upon them could be interesting for that reason. A library
whose only purpose were to serve as an abstraction layer for the built
in collections much less so. Such a layer is comparatively cheap to
construct in the application itself, with the benefit that it can be
adapted to the task at hand.

> (I note in passing that it was in the dim and distant past not uncommon
> to hear the same form of argument used to oppose high level languages in
> general: developing and maintaining a HLL is expensive. The design of
> the HLL must cater to some sort of least common denominator. Etc.)

I disagree that these are the same. I would rather note that one common
fallacy in computer science is the one of "false generalization", of
considering things to be the same which are not.


Bj�rn Lindberg