From: Jonathan de Boyne Pollard on
>
>
> On my machine (also with 2 GB of RAM) I didn't see any pronounced slow
> down with increased memory use (at least up to the point where enough
> RAM was still available and swapping to the hard disk didn't start), I
> got about 23 seconds with 'size' set to 100000000 and 23.9 s for
> 400000000. So using more than 3/4 of the available RAM (1.6 GB)
> doesn't seem to be a problem. And /proc/meminfo showed that nearly 1.8
> GB were free (even though an Apache web server, a PostgreSQL server,
> emacs and a number of other processes were also running). That may
> indicate that the effect you're seeing has some reason not directly
> related to memory usage, especially not to the "OS eating up 1.5 GB".
>
The point doesn't seem to have sunk in, yet. M. Olcott is still going
on about memory speeds, too. So I'll repeat it: Despite the fact that
the output of the program claims to be measuring a memory access
throughput, the actual operation of the program itself is doing no such
thing.

One might think, at first blush, that the program is constructing a
large array of pages and then referencing them all randomly, to simulate
a random memory access pattern by some other program. But that's not
what the program is doing. What it is in fact doing is creating a set
of 1 or more random-length potentially cross-linked singly-linked lists
within an array of unsigned integers, and traversing the one of them
that starts at element 0 of the array. As pointed out, many of the step
effects here are due to the way that the pseudo-random sequence happens
to distribute over the array size, which (note) is a non-prime number in
many of the tests that people are running. Furthermore, it's well
within the bounds of possibility for the PRNG to cause a short
singly-linked list that loops tightly within a single page in one case,
and a long singly-linked list that spans multiple pages in another,
depending on seed values and array sizes. There's certainly no
guarantee that all of the array (and thus all of the memory pages it
occupies) is actually referenced at all by the list traversal, so the
calculation of the value that is printed out, based upon the assumption
that the whole array is referenced, is completely bogus.
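
For what it's worth, the claim is easy to check. The sketch below is
not M. Olcott's code; the line that fills the array is an assumption
about how the program initialises it (only the traversal loop is
quoted later in the thread). It simply counts how many distinct
elements the walk starting at element 0 touches before it revisits
one:

  #include <cstdlib>
  #include <iostream>
  #include <vector>

  int main()
  {
      const std::size_t size = 100000000;   // same order of magnitude as the tests above
      std::vector<unsigned> Data(size);

      // Assumed initialisation: each element holds a pseudo-random index
      // into the same array, so the array doubles as a singly-linked "list".
      // (If RAND_MAX is smaller than size, only a prefix of the array can
      // ever be reached at all.)
      for (std::size_t i = 0; i < size; ++i)
          Data[i] = std::rand() % size;

      // Walk the "list" that starts at element 0 and count the distinct
      // elements seen before the walk lands on one it has already visited.
      std::vector<bool> visited(size, false);
      std::size_t num = 0, distinct = 0;
      while (!visited[num]) {
          visited[num] = true;
          ++distinct;
          num = Data[num];
      }

      std::cout << distinct << " of " << size
                << " elements are ever reached from element 0\n";
  }

If that number comes out well below 'size', then a throughput figure
computed as though the whole array had been touched is meaningless.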

M. Olcott's program is buggy. It's not actually calculating the metric
that it is purporting to print out. Speculating as to why your machine's
"memory speed" is different to M. Olcott's, based upon the output of
this program, is pointless. You might as well be speculating over NABOB
compression factors.

From: Peter Olcott on

"Jonathan de Boyne Pollard"
<J.deBoynePollard-newsgroups(a)NTLWorld.COM> wrote in message
news:IU.D20100327.T212118.P2407.Q0(a)J.de.Boyne.Pollard.localhost...
> >
>>
>> On my machine (also with 2 GB of RAM) I didn't see any
>> pronounced slow down with increased memory use (at least
>> up to the point where enough RAM was still available and
>> swapping to the hard disk didn't start), I got about 23
>> seconds with 'size' set to 100000000 and 23.9 s for
>> 400000000. So using more than 3/4 of the available RAM
>> (1.6 GB) doesn't seem to be a problem. And /proc/meminfo
>> showed that nearly 1.8 GB were free (even though an
>> Apache web server, a PostgreSQL server, emacs and a
>> number of other processes were also running). That may
>> indicate that the effect you're seeing has some reason
>> not directly related to memory usage, especially not to
>> the "OS eating up 1.5 GB".
>>
> The point doesn't seem to have sunk in, yet. M. Olcott is
> still going on about memory speeds, too. So I'll repeat
> it: Despite the fact that the output of the program
> claims to be measuring a memory access throughput, the
> actual operation of the program itself is doing no such
> thing.
>
> One might think, at first blush, that the program is
> constructing a large array of pages and then referencing
> them all randomly, to simulate a random memory access
> pattern by some other program. But that's not what the
> program is doing. What it is in fact doing is creating a
> set of 1 or more random-length potentially cross-linked
> singly-linked lists within an array of unsigned integers,
> and traversing the one of them

(1) I implemented a std::vector once, so I know that it is
supposed to be allocated as a single contiguous block of
memory.
(2) The system monitor indicates that this is what it is
doing, because memory allocation increases exactly as
specified.
Where do you get the idea that it is creating a set of
linked lists?
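
For context, the two observations are not actually in conflict: a
std::vector is indeed one contiguous block of memory, and a program
can still treat that block as a singly-linked list by storing in each
element the index of the next element to visit. A minimal
illustration of both points (made-up values, not the original code):

  #include <cassert>
  #include <vector>

  int main()
  {
      std::vector<unsigned> Data = { 3, 0, 1, 2 };   // one contiguous block

      // Contiguity: element 3 lives exactly three slots past element 0.
      assert(&Data[3] == Data.data() + 3);

      // ...and yet the *values* form a chain of indices, 0 -> 3 -> 2 -> 1 -> 0,
      // i.e. the same array is also a singly-linked "list" with no payload.
      unsigned num = 0;
      for (int step = 0; step < 4; ++step)
          num = Data[num];
      assert(num == 0);                              // back where the chain started
  }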

> that starts at element 0 of the array. As pointed out,
> many of the step effects here are due to the way that the
> pseudo-random sequence happens to distribute over the
> array size, which (note) is a non-prime number in many of
> the tests that people are running. Furthermore, it's well
> within the bounds of possibility for the PRNG to cause a
> short singly-linked list that loops tightly within a
> single page in one case, and a long singly-linked list
> that spans multiple pages in another, depending on seed
> values and array sizes. There's certainly no guarantee
> that all of the array (and thus all of the memory pages it
> occupies) is actually referenced at all by the list
> traversal, so the calculation of the value that is printed
> out, based upon the assumption that the whole array is
> referenced, is completely bogus.
>
> M. Olcott's program is buggy. It's not actually
> calculating the metric that it is purporting to print out.
> Speculating as to why your machine's "memory speed" is
> different to M. Olcott's, based upon the output of this
> program, is pointless. You might as well be speculating
> over NABOB compression factors.
>


From: Peter Olcott on

"Jonathan de Boyne Pollard"
<J.deBoynePollard-newsgroups(a)NTLWorld.COM> wrote in message
news:IU.D20100327.T212118.P2407.Q0(a)J.de.Boyne.Pollard.localhost...
> >
>>
>> On my machine (also with 2 GB of RAM) I didn't see any
>> pronounced slow down with increased memory use (at least
>> up to the point where enough RAM was still available and
>> swapping to the hard disk didn't start), I got about 23
>> seconds with 'size' set to 100000000 and 23.9 s for
>> 400000000. So using more than 3/4 of the available RAM
>> (1.6 GB) doesn't seem to be a problem. And /proc/meminfo
>> showed that nearly 1.8 GB were free (even though an
>> Apache web server, a PostgreSQL server, emacs and a
>> number of other processes were also running). That may
>> indicate that the effect you're seeing has some reason
>> not directly related to memory usage, especially not to
>> the "OS eating up 1.5 GB".
>>
> The point doesn't seem to have sunk in, yet. M. Olcott is
> still going on about memory speeds, too. So I'll repeat
> it: Despite the fact that the output of the program
> claims to be measuring a memory access throughput, the
> actual operation of the program itself is doing no such
> thing.
>
> One might think, at first blush, that the program is
> constructing a large array of pages and then referencing
> them all randomly, to simulate a random memory access
> pattern by some other program. But that's not what the
> program is doing. What it is in fact doing is creating a
> set of 1 or more random-length potentially cross-linked
> singly-linked lists within an array of unsigned integers,
> and traversing the one of them that starts at element 0 of
> the array. As pointed out, many of the step effects here
> are due to the way that the pseudo-random sequence happens
> to distribute over the array size, which (note) is a
> non-prime number in many of the tests that people are
> running. Furthermore, it's well within the bounds of
> possibility for the PRNG to cause a short singly-linked
> list that loops tightly within a single page in one case,
> and a long singly-linked list that spans multiple pages in
> another, depending on seed values and array sizes.
> There's certainly no guarantee that all of the array (and
> thus all of the memory pages it occupies) is actually
> referenced at all by the list traversal, so the
> calculation of the value that is printed out, based upon
> the assumption that the whole array is referenced, is
> completely bogus.

It is not purporting to do that. It is purporting to
generate a sequence of memory accesses that, in at least
some instances (depending on the sequence of generated
random numbers), at least approximately demonstrates the
worst-case behavior for cache hit ratios. So in other
words you can ignore all of those little tight loops that
you refer to; they are not representative.

This is only a little quick-and-dirty program whose sole
purpose is to enable ballpark estimates of the performance
of a deterministic finite automaton generator that has
similar memory access patterns.

The only remaining issue is to find the reason why these
worst-case cache-hit-ratio instances are substantially
slower than what the actual worst-case cache hit ratio is
supposed to be.
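
If the goal really is a worst-case cache-miss pattern, rather than
whatever pattern the PRNG happens to produce, one common way to get
it (a sketch only, not the original program) is to initialise the
array as a single random cycle, e.g. with Sattolo's variant of the
Fisher-Yates shuffle. The traversal is then guaranteed to visit
every element exactly once per pass, so no run can collapse into a
tight cache-resident loop:

  #include <chrono>
  #include <iostream>
  #include <numeric>
  #include <random>
  #include <vector>

  int main()
  {
      const std::size_t size = 100000000;        // roughly 400 MB of unsigned ints
      std::vector<unsigned> Data(size);
      std::iota(Data.begin(), Data.end(), 0u);   // Data[i] = i

      // Sattolo's algorithm: yields a permutation that is one single cycle,
      // so chasing indices from any start visits every element once per lap.
      std::mt19937 gen(42);
      for (std::size_t i = size - 1; i > 0; --i) {
          std::uniform_int_distribution<std::size_t> pick(0, i - 1);
          std::swap(Data[i], Data[pick(gen)]);
      }

      unsigned num = 0;
      auto t0 = std::chrono::steady_clock::now();
      for (std::size_t N = 0; N < size; ++N)     // one full lap over the cycle
          num = Data[num];
      auto t1 = std::chrono::steady_clock::now();

      std::chrono::duration<double> secs = t1 - t0;
      std::cout << size << " dependent loads in " << secs.count() << " s"
                << " (num=" << num << ")\n";     // print num so the loop isn't optimised away
  }

With that initialisation the number of elements touched is known
exactly, so a bytes-per-second figure derived from it actually means
something.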

>
> M. Olcott's program is buggy. It's not actually
> calculating the metric that it is purporting to print out.
> Speculating as to why your machine's "memory speed" is
> different to M. Olcott's, based upon the output of this
> program, is pointless. You might as well be speculating
> over NABOB compression factors.
>


From: Jens Thoms Toerring on
In comp.os.linux.development.apps Peter Olcott <NoSpam(a)ocr4screen.com> wrote:

> "Jonathan de Boyne Pollard"
> <J.deBoynePollard-newsgroups(a)NTLWorld.COM> wrote in message
> > One might think, at first blush, that the program is
> > constructing a large array of pages and then referencing
> > them all randomly, to simulate a random memory access
> > pattern by some other program. But that's not what the
> > program is doing. What it is in fact doing is creating a
> > set of 1 or more random-length potentially cross-linked
> > singly-linked lists within an array of unsigned integers,
> > and traversing the one of them

> Where do you get the idea that it is creating a set of
> linked lists from?

With

  for ( uint32 N = 0; N < Max; N++ )
      num = Data[ num ];

you use the array like a singly-linked list (with no data
payload), i.e. you use the value stored in one vector element
as a "pointer" to another one (well, actually offsets instead
of pointers, but that doesn't make too much of a difference).
As others have pointed out, the way you set up that "list",
you may accidentally create a short loop in it instead of
going through a large part of its elements, which, of course,
would make a lot of difference speed-wise. And the final
behaviour could depend strongly on the exact combination of
the random seed's initial value and the value of 'size'. Thus
some people voiced their concerns about the validity of your
stated results (or the conclusions drawn from them ;-)

Regards, Jens
--
\ Jens Thoms Toerring ___ jt(a)toerring.de
\__________________________ http://toerring.de
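
To see how strongly the outcome can depend on that combination, a
small timing sketch (again not the original program; the line that
fills the array is an assumption about its initialisation) runs the
same quoted loop for the same number of steps with a handful of
different seeds. Runs whose walk falls into a tight, cache-resident
loop finish far faster than runs that keep missing the cache, even
though every run executes exactly the same number of steps:

  #include <chrono>
  #include <cstdlib>
  #include <iostream>
  #include <vector>

  int main()
  {
      const std::size_t size = 100000000;   // one of the sizes used in the thread
      const std::size_t Max  = size;        // identical step count for every run

      for (unsigned seed = 1; seed <= 5; ++seed) {
          std::srand(seed);
          std::vector<unsigned> Data(size);
          for (std::size_t i = 0; i < size; ++i)
              Data[i] = std::rand() % size;  // assumed initialisation

          unsigned num = 0;
          auto t0 = std::chrono::steady_clock::now();
          for (std::size_t N = 0; N < Max; ++N)
              num = Data[num];               // the loop quoted above
          auto t1 = std::chrono::steady_clock::now();

          std::chrono::duration<double> secs = t1 - t0;
          std::cout << "seed " << seed << ": " << secs.count()
                    << " s for " << Max << " steps (num=" << num << ")\n";
      }
  }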