From: BGB / cr88192 on

"Malcolm McLean" <regniztar(a)btinternet.com> wrote in message
news:COOdnUq8IKa8Y7DWnZ2dnUVZ8qqdnZ2d(a)bt.com...
>
> "Chris McDonald" <chris(a)csse.uwa.edu.au> wrote in message news:
>> jt(a)toerring.de (Jens Thoms Toerring) writes:
>>
>> Not really wishing to steal the thread, but I've often wondered why
>> strdup() has not appeared in the C standard. Could a reason be because
>> it *does* return allocated memory?
>>
> That's right. It would make the string library dependent upon the dynamic
> memory allocation library. Just for one trivial little function, it wasn't
> considered worth it.
>

And, oddly, I have a lot of special-purpose 'strdup' functions which tend
instead to intern the strings...

I think it is mostly that 'strdup', as such, is too convenient for what
it does: it makes it really easy to just grab strings and forget, creating
memory leaks with every call.
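For reference, the whole function is only a few lines on top of malloc() -- which is exactly the dependency Malcolm mentioned, and also why it is so easy to call and then forget the free(). A minimal sketch (named my_strdup here to avoid clashing with the POSIX function):

```c
#include <stdlib.h>
#include <string.h>

/* minimal strdup sketch: the single malloc() call is the
   "dependency on the dynamic memory allocation library" */
char *my_strdup(const char *s)
{
    size_t n = strlen(s) + 1;       /* include the terminating NUL */
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;                       /* caller must free() -- easy to forget */
}
```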


I tend instead to intern the strings (for various APIs, which happen to
include strdup-like functions), so that I can just as well use such a
function as a trivial "make sure this string will not disappear" operation.

All is not ideal though, since many of these APIs don't otherwise check or
GC these strings, and so the memory is never reclaimed (hence these
functions end up "not to be used" for one-off strings or buffers, ...).

Granted, absent large buffers, the memory creep is typically small enough
to be ignorable (a string table somewhere which gradually gets larger with
old, dead strings never to be seen again).
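As a rough illustration of this interning idea (the names and the linear search are mine, not BGB's actual code; a real implementation would hash), the key property is that duplicate strings are stored once and the returned pointer stays valid for the life of the process:

```c
#include <stdlib.h>
#include <string.h>

/* toy string-interning table: strings are stored once and
   never freed, so the returned pointer is stable forever --
   and the table only ever grows (the "memory creep" above) */
#define INTERN_MAX 1024
static char *intern_tab[INTERN_MAX];
static int   intern_cnt;

const char *intern(const char *s)
{
    int i;
    for (i = 0; i < intern_cnt; i++)
        if (strcmp(intern_tab[i], s) == 0)
            return intern_tab[i];           /* already interned */
    if (intern_cnt >= INTERN_MAX)
        return NULL;                        /* table full */
    intern_tab[intern_cnt] = malloc(strlen(s) + 1);
    if (intern_tab[intern_cnt] == NULL)
        return NULL;
    strcpy(intern_tab[intern_cnt], s);
    return intern_tab[intern_cnt++];
}
```

A side benefit for symbol names: two interned strings are equal exactly when their pointers are equal, so symbol comparison becomes a pointer compare.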

The usage cases for things like this are usually in my dynamic linker or
compiler code, mostly for dealing with symbol names (variable names,
function names, ...).

Other, more front-end API calls similarly "merge" strings, but instead use
the GC and a weak hash table to dispose of them.


I guess this is mostly part of my own "unique" mindset, as I tend to see
most strings as atomic units on which operations are performed. Thinking of
strings as character buffers is a little odd to me, as I tend to treat them
somewhat differently (a character buffer is for doing work, but a 'string'
is an immutable atomic unit).

or such...



From: Malcolm McLean on
"Stefan Ram" <ram(a)zedat.fu-berlin.de> wrote in message news:
> "BGB / cr88192" <cr88192(a)hotmail.com> writes:
>>it makes it really easy to just grab strings and forget,
>>creating memory leaks with every call.
>
> For a program with a small run time and memory usage, this
> might be appropriate. For a program with a large or
> indeterminate run time, it might show that either the wrong
> programmer was chosen or the wrong programming language.
>
To me, any memory leak is a big red flag. Not so much for technical reasons,
but because it suggests that the programmer has lost control of his logic.


From: Nobody on
On Sun, 20 Dec 2009 21:40:13 +0000, Malcolm McLean wrote:

>> For a program with a small run time and memory usage, this
>> might be appropriate. For a program with a large or
>> indeterminate run time, it might show that either the wrong
>> programmer was chosen or the wrong programming language.
>>
> To me, any memory leak is a big red flag. Not so much for technical reasons,
> but because it suggests that the programmer has lost control of his logic.

For typical Unix "commands", it's often not worth tracking memory
allocations. Everything will be freed when the process terminates.

Managing memory is easy when the pointer returned from a function is
*always* allocated by that particular call. But if you consider the case
where the object may already exist as a "shared" value (e.g. interned
strings), you can either:

1. Duplicate the object and require the caller to free() it. This is
wasteful if callers often retain the value for the duration of the process.

2. Return an indication of whether the caller should free it. This
information then has to be passed around with the pointer, complicating
the code.

3. Implement some form of garbage collection.

4. Don't worry about it. Sometimes this will result in more memory being
used than is strictly necessary.
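Option 2 above can be sketched by pairing the pointer with an ownership flag; the names here (strref, get_value) are hypothetical, chosen just to show the shape of it and why the extra bookkeeping complicates the code:

```c
#include <stdlib.h>
#include <string.h>

/* pointer plus ownership flag: this pair, not a bare pointer,
   is what now has to be passed around */
struct strref {
    const char *s;
    int owned;              /* nonzero: caller must free((char *)s) */
};

static const char *shared_empty = "";

/* hypothetical lookup: returns a shared value when possible,
   otherwise a fresh allocation the caller owns */
struct strref get_value(const char *src)
{
    struct strref r;
    if (src[0] == '\0') {
        r.s = shared_empty;             /* shared, do not free */
        r.owned = 0;
    } else {
        char *p = malloc(strlen(src) + 1);
        if (p != NULL)
            strcpy(p, src);
        r.s = p;
        r.owned = (p != NULL);
    }
    return r;
}

void strref_release(struct strref r)
{
    if (r.owned)
        free((char *)r.s);
}
```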

Sometimes #4 is the rational choice, particularly if the allocations are
likely to constitute a fixed (and relatively small) overhead per process,
or can otherwise never amount to more than a small proportion of the
process's total memory consumption.

The situation is different for a long-lived process (i.e. interactive
application, daemon, etc), where even a small leak can eventually grow to
dominate memory consumption (although you also have to worry about heap
fragmentation in that situation).

From: jacob navia on
Nobody wrote:
>
> 3. Implement some form of garbage collection.
>

The lcc-win compiler system provides a garbage collector in its standard distribution.

I have been arguing for this solution for years, but not many people here listen. It is the best
solution: you can have your cake AND eat it!
From: jacob navia on
Stefan Ram wrote:
> Nobody <nobody(a)nowhere.com> writes:
>> The situation is different for a long-lived process (i.e. interactive
>> application, daemon, etc), where even a small leak can eventually grow to
>> dominate memory consumption (although you also have to worry about heap
>> fragmentation in that situation).
>
> Or for code in a library, when it is unknown which kind of
> process will use it. Since the subject of this thread is
> "API Design", this might be relevant here.
>
> Some of you will already know the following quotations,
> because I have already posted them into Usenet before.
>
> "There were two versions of it, one in Lisp and one in
> C++. The display subsystem of the Lisp version was faster.
> There were various reasons, but an important one was GC:
> the C++ code copied a lot of buffers because they got
> passed around in fairly complex ways, so it could be quite
> difficult to know when one could be deallocated. To avoid
> that problem, the C++ programmers just copied. The Lisp
> was GCed, so the Lisp programmers never had to worry about
> it; they just passed the buffers around, which reduced
> both memory use and CPU cycles spent copying."
>

The lcc-win compiler system provides a garbage collector in its standard
distribution.

> <XNOkd.7720$zx1.5584(a)newssvr13.news.prodigy.com>
>
> "A lot of us thought in the 1990s that the big battle would
> be between procedural and object oriented programming, and
> we thought that object oriented programming would provide
> a big boost in programmer productivity. I thought that,
> too. Some people still think that. It turns out we were
> wrong. Object oriented programming is handy dandy, but
> it's not really the productivity booster that was
> promised. The real significant productivity advance we've
> had in programming has been from languages which manage
> memory for you automatically."
>

Exactly.

> http://www.joelonsoftware.com/articles/APIWar.html
>
> "[A]llocation in modern JVMs is far faster than the best
> performing malloc implementations. The common code path
> for new Object() in HotSpot 1.4.2 and later is
> approximately 10 machine instructions (data provided by
> Sun; see Resources), whereas the best performing malloc
> implementations in C require on average between 60 and 100
> instructions per call (Detlefs, et. al.; see Resources).
> And allocation performance is not a trivial component of
> overall performance -- benchmarks show that many
> real-world C and C++ programs, such as Perl and
> Ghostscript, spend 20 to 30 percent of their total
> execution time in malloc and free -- far more than the
> allocation and garbage collection overhead of a healthy
> Java application (Zorn; see Resources)."
>
> http://www-128.ibm.com/developerworks/java/library/j-jtp09275.html?ca=dgr-jw22JavaUrbanLegends
>
> (OK, then garbage collection will take time in addition to
> the allocation calls, but when a program only runs for a
> short time, garbage collection might never be needed.)
>

Exactly.