From: "Bill Brehm" don't want on 15 Mar 2010 11:06 I am working on a program to calculate primes. It's just for fun, not for work or school, if that matters here. I'm trying to optimize it and I'll have some more questions about that later. Today I changed the CArray used to hold the primes to a CList based on a compile time selection, figuring I would save the time of reallocating and copying the CArray every time I went over the maximum allocated size. I'm aware of the pros and cons of CArray and CList in general. A disadvantage is that the CList of primes takes (about) four time as much memory as the CArray holding the same number of primes. The primes are 32 bit ints so I don't know why it's four times instead of three times (previous pointer, prime and next pointer). What else is in there? I tried something that I wasn't sure would work but it did. I serialized a CArray to disk and then later I serialized it from disk to a CList. It worked. It's not a total surprise, because it wouldn't make sense to store the next and previous CList pointers into a serialize archive. My question here is, does anyone know if these were made intentionally compatible and if I can count on that working in all cases? Thanks...
From: Joseph M. Newcomer on 15 Mar 2010 19:41 See below... On Mon, 15 Mar 2010 23:06:24 +0800, "Bill Brehm" <don't want spam> wrote: >I am working on a program to calculate primes. It's just for fun, not for >work or school, if that matters here. I'm trying to optimize it and I'll >have some more questions about that later. > >Today I changed the CArray used to hold the primes to a CList based on a >compile time selection, figuring I would save the time of reallocating and >copying the CArray every time I went over the maximum allocated size. I'm >aware of the pros and cons of CArray and CList in general. **** Your choice of CArray is probably a Bad Idea. First, you should SetSize and set an explicit reallocation quantum (the second parameter of SetSize) to minimize the amount of reallocation that is done. Or use std::vector which does this implicitly, and had far, far better performance than CArray. **** >A disadvantage is >that the CList of primes takes (about) four time as much memory as the >CArray holding the same number of primes. The primes are 32 bit ints so I >don't know why it's four times instead of three times (previous pointer, >prime and next pointer). What else is in there? **** you have made a fundaemental failure in thinking here. You think, for example, that the size of any elmeent matters, which is the wrong way to think about the problem. What matter is the memory footprint, and the induced paging, which is potentially far worse in a CLIst. Note that most prime algorithms are variants of the Seive of Eastosthenes, which means they don't scal to large integer (such as are using for encryption) and you can find enumerations of all the primes in the range of 0..2**32-1 on the Web and in reference books. LIss are nearly always less efficient than arrays due to locality-of-reference issues that affect cache behavior and paging. ***** > >I tried something that I wasn't sure would work but it did. I serialized a >CArray to disk and then later I serialized it from disk to a CList. It >worked. It's not a total surprise, because it wouldn't make sense to store >the next and previous CList pointers into a serialize archive. My question >here is, does anyone know if these were made intentionally compatible and if >I can count on that working in all cases? **** I wouldn't touch MFC serialization as far as I could throw a mainframe. I would not use it for anything I cared about. joe **** > >Thanks... > Joseph M. Newcomer [MVP] email: newcomer(a)flounder.com Web: http://www.flounder.com MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Goran on 16 Mar 2010 03:56 On Mar 15, 4:06 pm, "Bill Brehm" <don't want spam> wrote: > I tried something that I wasn't sure would work but it did. I serialized a > CArray to disk and then later I serialized it from disk to a CList. It > worked. It's not a total surprise, because it wouldn't make sense to store > the next and previous CList pointers into a serialize archive. My question > here is, does anyone know if these were made intentionally compatible and if > I can count on that working in all cases? Yes, but I doubt that was intentional. The reason is really this: both CList<> and CArray<> simply serialize the number of elements, then each element, from first to last. If you think about it, what else should they do? So this will work for lists and arrays, and I would guess that you can also altercate between e.g. CArray<BYTE> and CByteArray and such, __EXCEPT__ if you use "shift" operators and serialize pointers to concrete class (that is, if you use ar << pToCByteArray, you will have trouble using ar >> pToTemplateByteArray later on). Goran.
From: "Bill Brehm" don't want on 16 Mar 2010 09:38 "Your choice of CArray is probably a Bad Idea." It IS turning out to be a bad idea. This was one of the things I said I would get around to asking. If I give a growby of 10000, the program runs at a reasonable speed. When I set it much higher, like 100,000,000, thinking that i'll have fewer reallocations, it runs really slowly. I would have expected the opposite until 1e8 primes were calculated, then a large delay during reallocation. But it's slow from the start. I did some PerformanceQueries and found that adding something to the CArray (when it grew by 10000) was ten times slow than doing a division as part of the prime check. I don't know if it's a paging issue or some inefficiency in CArray itself. I guess it's much worse with a large growby but i don't know why and i didn't measure to check. Would the release version be much faster than the debug version (less checking for bounds and other stuff)? Is there a way (with task manager maybe) to tell if the CArray is staying in memory or being swapped out to disk? Is there a way to force the CArray to stay in memory and not swap out? Any guess why it might be so slow? "you have made a fundaemental failure in thinking here" Maybe so, but it turns out that the CList is much faster than the CArray, in my program at least. Too bad it takes four times the memory and I run out before I can reach 2^32. Also, it takes much longer to clear out the CList than the CArray, which is logical since it's made up of many tiny parts instead of a few large parts. I found another issue with CList which I think I would consider a bug. When I Serialize back from disk into a CArray, anything in the array is wiped out and overwritten. When I do that into the CList, the disk data is appended to the end of what is in the list. I didn't see that described in the help but I also didn't find any mention of this "bug" in a quick Google search. "I wouldn't touch MFC serialization as far as I could throw a mainframe." Ha ha, how far can you throw a mainframe? I am learning to agree with you on this. I don't think I ever used it before. I started to use it this time because it is much faster and uses less space than saving and loading in text format. But if it's not trustworthy then it's not worth it. "Note that most prime algorithms are variants of the Seive of Eastosthenes" Yes, I'm using that. My interest in playing with primes was triggered by trying to learn about encryption and modular arithmetic used in RSA. But my first program ever was a prime number generator running on some HP Basic computer back in the early 70's, so it's interesting to me to bring that up to date. My program also can run the old way by dividing by primes until the sqrt of the number under test. I think my original Basic program divided by all odd numbers because there wasn't enough memory to store the array of primes on that old computer. Actually the sieve technique should scale to large integers (if you mean the 64 bit type and not arbitrarily large) once one has an array of primes to 2^32 and if the sieving is done in segments. "Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message news:b0htp5l4t2bi28po0b6gfqbfkctt27up6a(a)4ax.com... > See below... > On Mon, 15 Mar 2010 23:06:24 +0800, "Bill Brehm" <don't want spam> wrote: > >>I am working on a program to calculate primes. It's just for fun, not for >>work or school, if that matters here. I'm trying to optimize it and I'll >>have some more questions about that later. >> >>Today I changed the CArray used to hold the primes to a CList based on a >>compile time selection, figuring I would save the time of reallocating and >>copying the CArray every time I went over the maximum allocated size. I'm >>aware of the pros and cons of CArray and CList in general. > **** > Your choice of CArray is probably a Bad Idea. First, you should SetSize > and set an > explicit reallocation quantum (the second parameter of SetSize) to > minimize the amount of > reallocation that is done. Or use std::vector which does this implicitly, > and had far, > far better performance than CArray. > **** >>A disadvantage is >>that the CList of primes takes (about) four time as much memory as the >>CArray holding the same number of primes. The primes are 32 bit ints so I >>don't know why it's four times instead of three times (previous pointer, >>prime and next pointer). What else is in there? > **** > you have made a fundaemental failure in thinking here. You think, for > example, that the > size of any elmeent matters, which is the wrong way to think about the > problem. What > matter is the memory footprint, and the induced paging, which is > potentially far worse in > a CLIst. Note that most prime algorithms are variants of the Seive of > Eastosthenes, which > means they don't scal to large integer (such as are using for encryption) > and you can find > enumerations of all the primes in the range of 0..2**32-1 on the Web and > in reference > books. LIss are nearly always less efficient than arrays due to > locality-of-reference > issues that affect cache behavior and paging. > ***** >> >>I tried something that I wasn't sure would work but it did. I serialized a >>CArray to disk and then later I serialized it from disk to a CList. It >>worked. It's not a total surprise, because it wouldn't make sense to store >>the next and previous CList pointers into a serialize archive. My question >>here is, does anyone know if these were made intentionally compatible and >>if >>I can count on that working in all cases? > **** > I wouldn't touch MFC serialization as far as I could throw a mainframe. I > would not use > it for anything I cared about. > joe > **** >> >>Thanks... >> > Joseph M. Newcomer [MVP] > email: newcomer(a)flounder.com > Web: http://www.flounder.com > MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Giovanni Dicanio on 16 Mar 2010 12:26 "Bill Brehm" <don't want spam> ha scritto nel messaggio news:e9or$3QxKHA.1796(a)TK2MSFTNGP02.phx.gbl... > If I give a growby of 10000, the program runs at a reasonable speed. When > I set it much higher, like 100,000,000, thinking that i'll have fewer > reallocations, it runs really slowly. I would have expected the opposite > until 1e8 primes were calculated, then a large delay during reallocation. One of the problems of CArray is its reallocation policy. IIRC, CArray uses an arithmetic growth policy, which is not smart. Instead, std::vector uses a geometric growth policy (using a 1.5x factor for increasing capacity, IIRC). In any case, I think that if you set CArray initial size big enough, you won't have speed problems. Did you try something like SetSize(100000000) ? Then you can use a separated integer index to identify the first free slot in the array, to add new found primes. Giovanni
|
Next
|
Last
Pages: 1 2 3 4 5 6 7 Prev: Skinning dialog box using Feature pack Next: System tray menu questions. |