From: Stefan Bellon on 5 Apr 2007 13:55

Markus E Leypold wrote:

> And I thought it would exactly be the extra padding the OP was
> interested in?

Yes, basically I'm interested in the padding which is inserted and
wasted, and which could be used without resulting in a larger
allocated memory chunk.

So, instead of doing:

   type Cell1;

   type List is access all Cell1;

   type Cell1 is record
      Next : List;
      Item : Item_Type;
   end record;

we would like to do:

   type Item_Array is array (1 .. N) of Item_Type;

   type Cell2 is record
      Next  : List;
      Used  : Natural;
      Items : Item_Array;
   end record;

where N is chosen to be as large as possible without pushing the whole
cell into the next larger allocation size.

Of course, there is a problem when Cell1'Size (or Cell1'Object_Size?)
is _exactly_ an allocation size: Cell2 has the additional Used
element, which takes space and thus forces the next larger allocation
unit. But then N is at least 2, which makes the whole list no worse
than a list of Cell1.

The whole List/Cell stuff is inside a generic, and Item_Type is the
type the generic is instantiated with.

So the question is how to accurately find out the allocation size for
Cell1 and Cell2 respectively, in order to calculate the best N. In
theory it should be possible to create a list of Cell2 which is no
worse in memory consumption than a list of Cell1, but at present we
see quite a memory penalty, and we figured out that this is because
GNAT (the compiler we use) allocates larger memory chunks than we
calculated using the provided attributes ('Size, 'Object_Size,
'Component_Size, ...). The two main problems seem to be that a) we
don't know the memory allocation strategy, and b) GNAT's record/array
layout seems to differ from what we assume.

Let's look at some data which is puzzling us:

   type Cell is record
      Item : Ada.Strings.Unbounded.Unbounded_String;
   end record;

   type Items is
      array (1 .. 1) of Ada.Strings.Unbounded.Unbounded_String;

Querying the mentioned attributes gives:

   Cell'Size:             352
   Cell'Object_Size:      352
   Items'Component_Size:  192

So, when used in an array, an unbounded string needs only 24 bytes,
but when used in a record, it needs 44 bytes.

Now, adding a second unbounded string to the record and building an
array of that:

   type CellX is record
      Item1, Item2 : Ada.Strings.Unbounded.Unbounded_String;
   end record;

   type ItemsX is array (1 .. 1) of CellX;

gives the following:

   CellX'Size:            544
   CellX'Object_Size:     544
   ItemsX'Component_Size: 544

And 544 bits are 68 bytes = 24 bytes + 44 bytes. I am really not sure
how to explain this ... any help is very welcome!

--
Stefan Bellon
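For illustration, here is one way the generic could derive N from the
size attributes discussed above. This is a sketch, not code from the
thread: the package name, the 256-byte target, and the overhead
estimate are all assumptions, 'Object_Size is a GNAT-specific
attribute, and (as the rest of the thread shows) the figure the
attributes report may still differ from what the allocator actually
hands out.

   with System;

   generic
      type Item_Type is private;
   package Blocked_Lists is

      --  Assumed tuning target: keep each cell within 256 bytes.
      Chunk_Bits : constant Natural := 256 * System.Storage_Unit;

      --  Rough per-cell overhead: one access value plus one Natural.
      --  Only an estimate; the compiler may still insert padding.
      Overhead_Bits : constant Natural :=
        System.Address'Size + Natural'Size;

      --  Largest N that, by this estimate, still fits the target.
      N : constant Positive :=
        Positive'Max
          (1, (Chunk_Bits - Overhead_Bits) / Item_Type'Object_Size);

      type Cell;
      type List is access Cell;

      type Item_Array is array (1 .. N) of Item_Type;

      type Cell is record
         Next  : List    := null;
         Used  : Natural := 0;
         Items : Item_Array;
      end record;

   end Blocked_Lists;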
From: Stefan Bellon on 5 Apr 2007 13:58

Martin Krischik wrote:

> First we need to know which platform you are using (compiler/OS).

GNAT on GNU/Linux, Windows and Solaris, where the memory allocation
should work well on all three of them (but using separates or a
package Naming in a project file to supply different algorithms for
the allocation strategy is no problem).

> Then: how a storage pool gets its memory is up to the pool designer.
> In your case you could allocate a large chunk of memory straight
> from the OS and then sub-allocate it using an algorithm optimised
> for your needs.

When using storage pools, can a storage pool be dynamically resized,
or is it, once created, fixed in size? The latter would rule out
storage pools for us, since we do not know in advance how much data we
need to store; only the Item_Type is known when instantiating the data
structure, so its size is known (provided we can trust the 'Size,
'Object_Size, and 'Component_Size attributes).

--
Stefan Bellon
From: Stefan Bellon on 5 Apr 2007 14:02

Robert A Duff wrote:

> There are many ways to do it. In your case, you have a large number
> of small objects. In that case, it would make sense to allocate
> large chunks of memory (possibly using "new"), and then have
> Allocate chop them up into small pieces on request.

This sounds good, provided that

a) we can resize an already existing storage pool to increase the
   memory inside it, and

b) we can really get at the size GNAT uses for each of the items (see
   my other posting with the unbounded string example).

--
Stefan Bellon
From: Randy Brukardt on 5 Apr 2007 21:31

"Stefan Bellon" <sbellon(a)sbellon.de> wrote in message
news:4ecee3fe5esbellon(a)sbellon.de...

> Robert A Duff wrote:
>
> > There are many ways to do it. In your case, you have a large
> > number of small objects. In that case, it would make sense to
> > allocate large chunks of memory (possibly using "new"), and then
> > have Allocate chop them up into small pieces on request.
>
> This sounds good, provided that
>
> a) we can resize an already existing storage pool to increase the
>    memory inside it, and

It's whatever you want it to be! A storage pool is just a container
that is automatically called when you allocate or deallocate memory.
There are of course the predefined ones, but you can provide your own,
about which you define *all* of the details.

In your case, I'd probably implement the pool as a linked list of
arrays that hold enough elements for some large size (at least 4K for
Windows; can't say for Linux). I'd then manage them with a free chain
and a pass-through for unusual sizes (see below). You could even set
the item size as a discriminant of the pool type, possibly using the
attribute designed for this purpose ('Max_Size_In_Storage_Elements).

> b) we can really get at the size GNAT uses for each of the items
>    (see my other posting with the unbounded string example).

I don't know that for certain (I don't use GNAT that much), but I'd be
fairly surprised if you couldn't. To some extent it is irrelevant,
since the size to allocate is passed into Allocate, and it is easy
enough to satisfy unanticipated sizes from the regular pool (i.e.
using "new"). Indeed, I have a pool whose job was to find the source
of dangling pointers, and all it did was save the pointers and clobber
memory on deallocation; the actual allocation was a pure pass-through
to the standard pool (using "new" and Unchecked_Deallocation).

I suggest reading up on storage pools in whatever you can find. As Bob
said, this is low-level programming; Ada won't be much help if you
screw up -- if you give non-existent memory to Ada, she will screw up
just like those other languages some of us love to hate.

It's likely that most of the pools that have been written are specific
to some OS or compiler, which may explain why there aren't many out
there. (Mine are all specific to Janus/Ada; not much use to you.) But
they can be fairly generic (allocating large chunks from the regular
heap) and reasonably well-checked: you don't have to write them solely
with pointer arithmetic.

Randy.
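For concreteness, here is a minimal sketch of the kind of pool
described above. All names (Chunk_Pools, Chunk_Pool) and the 4K chunk
size are invented for the example; the pool grows by prepending fresh
chunks to a linked list, never reuses freed memory (a real pool would
add the free chain Randy mentions), and simply rejects oversized
requests instead of passing them through to the standard heap.

   with System;                   use System;
   with System.Storage_Pools;
   with System.Storage_Elements;  use System.Storage_Elements;

   package Chunk_Pools is

      Chunk_Size : constant := 4096;  --  bytes per chunk (assumed)

      type Chunk;
      type Chunk_Ptr is access Chunk;

      type Chunk is record
         Next : Chunk_Ptr;
         Used : Storage_Count := 0;   --  high-water mark in Data
         Data : Storage_Array (1 .. Chunk_Size);
      end record;

      type Chunk_Pool is
        new System.Storage_Pools.Root_Storage_Pool with record
         Head : Chunk_Ptr;            --  most recently added chunk
      end record;

      procedure Allocate
        (Pool                     : in out Chunk_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      procedure Deallocate
        (Pool                     : in out Chunk_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      function Storage_Size (Pool : Chunk_Pool) return Storage_Count;

   end Chunk_Pools;

   package body Chunk_Pools is

      procedure Allocate
        (Pool                     : in out Chunk_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count)
      is
         Start : Storage_Count := 1;
      begin
         if Size_In_Storage_Elements > Chunk_Size then
            raise Storage_Error;  --  a real pool would pass this
                                  --  through to the standard heap
         end if;

         --  Find an aligned start index in the current chunk; this
         --  assumes each Data array is itself aligned at least as
         --  strictly as any Alignment we will be asked for.
         if Pool.Head /= null then
            Start := Pool.Head.Used + 1;
            if (Start - 1) mod Alignment /= 0 then
               Start := Start + Alignment - (Start - 1) mod Alignment;
            end if;
         end if;

         --  Grow the pool by opening a fresh chunk when needed.
         if Pool.Head = null
           or else Start + Size_In_Storage_Elements - 1 > Chunk_Size
         then
            declare
               C : constant Chunk_Ptr := new Chunk;
            begin
               C.Next    := Pool.Head;
               Pool.Head := C;
               Start     := 1;
            end;
         end if;

         Storage_Address := Pool.Head.Data (Start)'Address;
         Pool.Head.Used  := Start + Size_In_Storage_Elements - 1;
      end Allocate;

      procedure Deallocate
        (Pool                     : in out Chunk_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         null;  --  a real pool would thread the freed slot onto a
                --  free chain for reuse
      end Deallocate;

      function Storage_Size (Pool : Chunk_Pool) return Storage_Count
      is
         Total : Storage_Count := 0;
         C     : Chunk_Ptr := Pool.Head;
      begin
         while C /= null loop
            Total := Total + Chunk_Size;
            C := C.Next;
         end loop;
         return Total;
      end Storage_Size;

   end Chunk_Pools;

A client attaches such a pool to an access type with a representation
clause (for T'Storage_Pool use ...), as sketched after the next reply.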
From: Randy Brukardt on 5 Apr 2007 21:40
"Stefan Bellon" <sbellon(a)sbellon.de> wrote in message news:4ecee347d7sbellon(a)sbellon.de... > Markus E Leypold wrote: > > > And I thought it would exactly be the extra padding the OP was > > interested in? > > Yes, basically I'm interested in the padding which is inserted and > wasted and which could be used without resulting in a larger allocated > memory chunk. > > So, instead of doing: > > type Cell1; > > type List is access all Cell1; > > type Cell1 is record > Next : List; > Item : Item_Type; > end record; > > We would like to do: > > type Cell2 is record > Next : List; > Used : Natural; > Items : array (1 .. N) of Item_Type; > end record; > > Where N is chosen in a way that N is maximized without resulting in the > whole Cell to require the next larger allocation size. Right, but that obscures your algorithm and makes it harder to maintain. It's better to write a storage pool that does the blocking, and does it behind the scenes. (Of course, if the overhead of "Next" is contributing to the problem, you might not have any choice.) But, honestly, I don't see why you're so concerned about getting the blocking factor exactly right. Just make it big enough so you're always allocating at least 256 bytes at a time, and relax. Unless you have a huge number of these structures, the possible waste of a couple hundred bytes per list item isn't going to make much difference. And if they're expandable anyway, you're probably going to be filling them up. Remember, the actual OS memory allocator on Windows allocates in 64K chunks -- I doubt you want to use *that* size for blocking! So, in summary, if the insertion/deletion issues are really significant, then use a custom storage pool, and don't mess with the top-level algorithm. And if they're not, just allocate decent sized blocks and don't worry too much about the waste - it's inevitable unless you write your own storage pool (tuning as you are trying to do will only work on a single compiler version/target operating system version; any change to either is likely to change something - everyone is forever fiddling with their heap implementations to improve something or other). Randy. |