From: Ian Collins on
jacob navia wrote:
> Nobody a écrit :
>>
>> 3. Implement some form of garbage collection.
>>
>
> The lcc-win compiler system provides a garbage collector in its standard
> distribution.
>
> I have been arguing for this solution for years, but not many people here
> listen. It is the best solution: keep the money for the cake AND eat it!

But it isn't practical on the majority of platforms C is used on these days.

--
Ian Collins
From: BGB / cr88192 on

"Stefan Ram" <ram(a)zedat.fu-berlin.de> wrote in message
news:strdup-20091220183003(a)ram.dialup.fu-berlin.de...
> "BGB / cr88192" <cr88192(a)hotmail.com> writes:
>>it makes it really easy to just grab strings and forget,
>>creating memory leaks with every call.
>
> For a program with a small run time and memory usage, this
> might be appropriate. For a program with a large or
> indeterminate run time, it might show that either the wrong
> programmer was chosen or the wrong programming language.
>

well, the big issue is mostly one of convenience:
it is a lot more convenient to simply forget about strings than to worry
about freeing them;
similarly, since strings tend to be fairly small, very often the code can
leak pretty badly and still keep running just fine (since the app will
almost invariably be exited and restarted before the leak becomes too much
of a problem).

but, this is the issue:
the convenience may prompt bad style...


hence, interning strings is at least a little better, because it allows a
similar convenience while generally bounding memory use (only as much memory
will be used as there are unique strings in the working set, which in
practice is typically much smaller than the available memory).

consider, for example, if a string were interned for every word in a mass of
English text documents. once a finite limit is reached (say, maybe 2500 or
3000 unique words), then this inflation will drop to almost nothing
(usually periodic random strings, ...).

whereas naive use of strdup will be unbounded (dependent on the total amount
of text processed, rather than the upper bound on the number of unique words
present).
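the idea can be sketched in a few lines of C (hypothetical names; a
fixed-size linear table rather than a real hash table, just to show the
bounding effect):

```c
#include <stdlib.h>
#include <string.h>

#define INTERN_MAX 4096          /* hard bound on unique strings kept */

static char *intern_tab[INTERN_MAX];
static int   intern_cnt = 0;

/* return a canonical copy of s; equal strings always yield the
   SAME pointer, so memory use is bounded by the number of unique
   strings seen, not by the number of calls (unlike raw strdup) */
char *intern_string(const char *s)
{
    int i;
    char *p;
    for (i = 0; i < intern_cnt; i++)
        if (strcmp(intern_tab[i], s) == 0)
            return intern_tab[i];        /* already interned */
    if (intern_cnt >= INTERN_MAX)
        return NULL;                     /* a real version would grow or hash */
    p = malloc(strlen(s) + 1);
    if (!p)
        return NULL;
    strcpy(p, s);
    return intern_tab[intern_cnt++] = p;
}
```

calling code can then "grab and forget" just like with strdup, except the
total allocation stops growing once the vocabulary stops growing.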

this subtle difference makes a notable difference for "medium length"
running times, especially for higher-activity apps (where a plain memory
leak could kill the app in a matter of minutes, but a more gradual leak may
allow it to last for hours or days or more before crashing...).

granted, it is all far from perfect, but I guess a lot depends on how much
one needs to expect from the code...

for example, what is fine for a command-line tool may not be good enough for
an interactive app, and what is good enough for an interactive app may still
not be good enough if reliability matters. but, then, OTOH, not all apps
need reliability, and for many things it is good enough if the thing only
runs at most a few hours or days at a time...

or, for a command line tool, it may only matter so long as it can process
whatever data it is given.


or (satire), one can make use of newer 64-bit systems as a means for being
even more lazy about dealing with memory leaks...



From: BGB / cr88192 on

"Stefan Ram" <ram(a)zedat.fu-berlin.de> wrote in message
news:memory-management-20091221000204(a)ram.dialup.fu-berlin.de...
> Nobody <nobody(a)nowhere.com> writes:
>>The situation is different for a long-lived process (i.e. interactive
>>application, daemon, etc), where even a small leak can eventually grow to
>>dominate memory consumption (although you also have to worry about heap
>>fragmentation in that situation).
>
> Or for code in a library, when it is unknown, which kind of
> process will use it. Since the subject of this thread is
> “API Design”, this might be relevant here.
>

granted.


> Some of you will already know the following quotations,
> because I have already posted them into Usenet before.
>
> “There were two versions of it, one in Lisp and one in
> C++. The display subsystem of the Lisp version was faster.
> There were various reasons, but an important one was GC:
> the C++ code copied a lot of buffers because they got
> passed around in fairly complex ways, so it could be quite
> difficult to know when one could be deallocated. To avoid
> that problem, the C++ programmers just copied. The Lisp
> was GCed, so the Lisp programmers never had to worry about
> it; they just passed the buffers around, which reduced
> both memory use and CPU cycles spent copying.”
>

<snip>

>
> (OK, then garbage collection will take time in addition to
> the allocation calls, but when a program only runs for a
> short time, garbage collection might never be needed.)
>

generally agreed.

hence, for many pieces of code, I personally use garbage collection...

the edge case, however, is in C: since the GC is not a "standard" component,
and one may wish to maintain modularity in many cases, not all code may be
able to make use of the GC.

the fallback then is to make use of alternative strategies, such as
allocating data in a region of memory which is periodically either destroyed
or reset, ...
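that fallback can be sketched as a tiny region (arena) allocator, assuming a
fixed-size static pool and hypothetical names; a real one would grow in
chunks and handle alignment per type:

```c
#include <stddef.h>

#define ARENA_SIZE (1 << 16)

static unsigned char arena_buf[ARENA_SIZE];
static size_t        arena_pos = 0;

/* bump-pointer allocation out of the region; no per-object free */
void *arena_alloc(size_t n)
{
    void *p;
    n = (n + 7) & ~(size_t)7;          /* crude 8-byte alignment */
    if (arena_pos + n > ARENA_SIZE)
        return NULL;                   /* region exhausted */
    p = arena_buf + arena_pos;
    arena_pos += n;
    return p;
}

/* "periodically destroyed or reset": one call reclaims everything */
void arena_reset(void)
{
    arena_pos = 0;
}
```

the point is that lifetime management moves from per-object free() calls to
one well-chosen reset point (end of frame, end of request, ...).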


so, it is all so many tradeoffs I guess...


From: BGB / cr88192 on

"Stefan Ram" <ram(a)zedat.fu-berlin.de> wrote in message
news:strdup-20091219235136(a)ram.dialup.fu-berlin.de...
> Chris McDonald <chris(a)csse.uwa.edu.au> writes:
>>Not really wishing to steal the thread, but I've often wondered why
>>strdup() has not appeared in the C standard. Could a reason be because
>>it *does* return allocated memory?
>
> ISO/IEC 9899:1999 (E) has functions that return allocated
> memory, viz., calloc, malloc, and realloc. But it does
> not /mix/ them with functions that do something else.
> It tries to supply low level building blocks and leave
> the mixing to higher levels.
>
> A function that works on a client buffer can work with
> all kinds of storage (allocated, automatic, and static).
> So it's more “policy-free”.
>
> strdup merely combines malloc and strcpy, so when needed
> it can be easily written. It also adds a little bit of
> a policy.
>
> What I wrote for myself and what also comes in handy
> sometimes, is an sprintf-like function that allocates
> a sufficient buffer and then prints to that buffer.
> This is a generalization of strdup.
>

yep...

I can note that both my codebase, and apparently Quake2, ... have functions
along these lines.
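such an allocating sprintf can be sketched in C99 with a two-pass use of
vsnprintf (measure, then allocate and print); the name here is hypothetical:

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* asprintf-style helper: prints into a freshly malloc'd buffer
   of exactly the right size; the caller frees the result.
   Returns NULL on formatting or allocation failure. */
char *str_printf(const char *fmt, ...)
{
    va_list ap;
    int n;
    char *buf;

    va_start(ap, fmt);
    n = vsnprintf(NULL, 0, fmt, ap);        /* first pass: measure */
    va_end(ap);
    if (n < 0)
        return NULL;

    buf = malloc((size_t)n + 1);
    if (!buf)
        return NULL;

    va_start(ap, fmt);
    vsnprintf(buf, (size_t)n + 1, fmt, ap); /* second pass: print */
    va_end(ap);
    return buf;
}
```

this is the "generalization of strdup" from the quote: str_printf("%s", s)
behaves like strdup(s), but any format works.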

granted, my functions (of this sort) typically allocate the memory in a
"rotating allocator" (IOW, a region where the allocator just wraps around
and eventually overwrites whatever was there previously), but what exactly
one needs best depends on their use case...

rotators can also be used as a very specialized GC strategy in some cases
(for example, stale objects are eventually freed unless either freshened or
moved out of the rotator).
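a minimal rotator sketch, under the same caveat as above (hypothetical names,
fixed-size static buffer, no alignment handling): allocations are never freed
explicitly; old data is simply overwritten once the cursor wraps around, so
callers must use results "soon" or copy them out:

```c
#include <stddef.h>
#include <string.h>

#define ROT_SIZE 4096

static char   rot_buf[ROT_SIZE];
static size_t rot_pos = 0;

/* wrap-around allocation: stale data is eventually clobbered */
void *rot_alloc(size_t n)
{
    void *p;
    if (n > ROT_SIZE)
        return NULL;                 /* can never fit */
    if (rot_pos + n > ROT_SIZE)
        rot_pos = 0;                 /* wrap: oldest data gets overwritten */
    p = rot_buf + rot_pos;
    rot_pos += n;
    return p;
}

/* strdup into the rotator: convenient "grab and forget" strings,
   valid only until the cursor comes back around */
char *rot_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = rot_alloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}
```

memory use is strictly bounded (ROT_SIZE), at the cost of a lifetime rule
the caller has to respect.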

....



From: jacob navia on
Ian Collins a écrit :
>
> FACT: The majority of new C development is for embedded applications

All the new development done for the Linux kernel and the thousands
of C systems on Linux/Mac/Windows does not count, apparently.

Your "Facts" aren't facts but assumptions of Mr Collins.

And, by the way, if you are developing for an embedded processor, it is
probably better to use a GC than to waste developers' time and risk bugs
with manual memory management (malloc/free). Today's embedded processors
are huge machines by yesterday's standards. Only if you have real-time
requirements could GC be a problem. Analog Devices processors and DSPs,
for example, are all 32 bits now.

Only on 16-bit processors with a few kilobytes of RAM could you be right.

And so what?

You *can* use a GC in many OTHER applications.