From: Joseph M. Newcomer on 25 Mar 2010 00:07

You are correct about the size_t. There was some trash talk about std::vector
not being able to handle more than INT_MAX elements, but as I read std::vector,
it uses the STL type size_type, which is 64 bits in the 64-bit compiler. Even
the most superficial study of the documentation reveals in under 30 seconds that

    typedef size_t size_type;

is the definition of size_type (look at allocator::size_type), so you are right
on target about the failure to use the correct data type for the size; it
should be size_type (which is size_t, which is 64 bits long in the 64-bit
world!)
        joe

On Wed, 24 Mar 2010 12:08:26 +0100, Oliver Regenfelder
<oliver.regenfelder(a)gmx.at> wrote:

>Hello,
>
>Just some 64-bit, C++ comments.
>
>Peter Olcott wrote:
>> #include <stdio.h>
>
>#include <cstdio>
>
>> #include <stdlib.h>
>
>#include <cstdlib>
>
>> #include <vector>
>> #include <time.h>
>
>#include <ctime>
>
>It is supposed to be C++ after all.
>
>> #define uint32 unsigned int
>
>typedef unsigned int uint32;
>
>> const uint32 repeat = 100;
>> const uint32 size = 524288000 / 4;
>> std::vector<uint32> Data;
>>
>> void Process() {
>>   clock_t finish;
>>   clock_t start = clock();
>>   double duration;
>>   uint32 num;
>>   for (uint32 r = 0; r < repeat; r++)
>>     for (uint32 i = 0; i < size; i++)
>>       num = Data[num];
>
>This one is subtly bad. The proper type to index arrays is size_t.
>Especially as you will be running this on 64-bit machines, and on
>Windows 64 an int is still 32 bits as far as I remember. Therefore make
>it a habit of indexing arrays with size_t, because that array bigger
>than 4 GB will come and haunt you sooner or later.
>
>Also, you are thinking about doing it on Windows, *nix, Mac. In this
>case you can't rely on an int being able to index arrays of any
>possible size.
>
>> int main() {
>>   printf("Size in bytes--->%d\n", size * 4);
>
>I would do:
>  printf("Size in bytes--->%d\n", static_cast<size_t>(size) * 4);
>
>Otherwise you might be upset when size is above 1G.
>
>> Data.reserve(size);
>> for (int N = 0; N < size; N++)
>>   Data.push_back(rand() % size);
>
>Again, use size_t for indexing.
>
>Best regards,
>
>Oliver

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
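To make Oliver's and Joe's point concrete, here is a minimal sketch of the
benchmark setup written with vector<uint32>::size_type (i.e. size_t) for the
size and the index. The names mirror Peter's test program, but the snippet is
illustrative only, not code from the thread:

#include <cstdio>
#include <cstdlib>
#include <vector>

typedef unsigned int uint32;

int main() {
    // size_type is a typedef for size_t: 32 bits on Win32, 64 bits on x64.
    const std::vector<uint32>::size_type size = 524288000 / 4;

    std::vector<uint32> Data;
    Data.reserve(size);
    for (std::vector<uint32>::size_type i = 0; i < size; ++i)
        Data.push_back(static_cast<uint32>(rand() % size));

    // %d silently truncates a 64-bit value; use %Iu with the Microsoft CRT
    // (or %zu with a C99-conforming printf) when printing a size_t.
    printf("Size in bytes--->%Iu\n", size * sizeof(uint32));
    return 0;
}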
From: Peter Olcott on 25 Mar 2010 00:09

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
news:ohnlq5924q5tisi5vhkl46k8innq5vt7u0(a)4ax.com...
> See below...
> On Tue, 23 Mar 2010 09:16:34 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>>news:u8xPnamyKHA.2552(a)TK2MSFTNGP04.phx.gbl...
>>> Hector Santos wrote:
>>>
>>>> PS: I noticed the rand() % size is too short, rand() is
>>>> limited to RAND_MAX which is 32K. Change that to:
>>>>
>>>>    (rand()*rand()) % size
>>>>
>>>> to get a random range from 0 to size-1. I think that's
>>>> right; maybe Joe can give us a good random generator
>>>> here, but the above does seem to provide practical,
>>>> decent randomness for this task.
>>>
>>> Peter, using the above RNG seems to be a better test since
>>> it hits a wider spectrum. With the earlier one, it was
>>> only hitting ranges up to 32K.
>>>
>>> I also noticed that when the 32K RNG was used, a dynamic array
>>> was 1 to 6 faster than using std::vector. But when using
>>> the above RNG, they were both about the same. That is
>>> interesting.
>>>
>>> --
>>> HLS
>>
>>I made this adaptation and it slowed down by about 500%, a
>>much smaller cache hit ratio. It still scaled up to four
>>cores with 1.5 GB each, and four concurrent processes only
>>took about 50% more than a single process.
>>
> ****
> If you use multiple threads in a single process, or multiple processes
> with a shared memory segment implemented by a Memory Mapped File, you
> will reduce the footprint, since either all threads (in one process)
> share the same memory footprint, or the multiple processes share
> largely the same memory footprint.

Yup. So if it's fast enough with multiple processes it will surely be
fast enough as multiple threads, wouldn't it?

> ****
>>I will probably engineer my new technology to be able to
>>handle multiple threads, if all that I have to do is
>>implement the heuristics that I mentioned. Since my first
>>server will only have a single core, on this server it will
>>only have a single thread.
>>
>>I still think that the FIFO queue is a good idea. Now I will
>>have multiple requests and on multi-core machines multiple
>>servers.
>>
>>What is your best suggestion for how I can implement the
>>FIFO queue?
>>(1) I want it to be very fast
>>(2) I want it to be portable across Unix / Linux / Windows,
>>and maybe even Mac OS X
>>(3) I want it to be as robust and fault tolerant as possible.
>>
>>It may simply provide 32-bit hexadecimal integer names of
>>input PNG files. These names roll over to the next
>>incremental number. Instead they may be HTTP connection
>>numbers. I have to learn about HTTP before I will know what
>>I am doing here. Since all customers (including free trial
>>customers) will have accounts with valid email addresses, I
>>can always email the results if the connection is lost.
>>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm
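A rough sketch of the shared-footprint idea Joe describes: each process opens
the same named, pagefile-backed file mapping, so the big lookup table exists
once in physical memory no matter how many processes map it. Windows-only,
error handling trimmed, and the mapping name is made up for the example:

#include <windows.h>
#include <cstdio>

typedef unsigned int uint32;

int main() {
    const DWORD bytes = 524288000;   // the ~500 MB table from the benchmark

    // Every process that opens this name maps the same physical pages.
    HANDLE hMap = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL,
                                     PAGE_READWRITE, 0, bytes,
                                     "Local\\OcrSharedTable");
    if (hMap == NULL) return 1;

    uint32 *Data = (uint32*)MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS,
                                          0, 0, bytes);
    if (Data == NULL) return 1;

    Data[0] = 42;                    // visible to every process with a view
    printf("Data[0] = %u\n", Data[0]);

    UnmapViewOfFile(Data);
    CloseHandle(hMap);
    return 0;
}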
From: Peter Olcott on 25 Mar 2010 00:10

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
news:dlnlq5pok9nbsc35uaedbot0m18btno5ti(a)4ax.com...
>A multithreaded FIFO queue would make more sense; less chance of
>priority inversion effects. An I/O Completion Port makes a wonderful
>multithreaded FIFO queue with practically no effort!
>        joe

How?

>
> On Tue, 23 Mar 2010 14:28:56 -0400, Hector Santos
> <sant9442(a)nospam.gmail.com> wrote:
>
>>Peter Olcott wrote:
>>
>>> I still think that the FIFO queue is a good idea. Now I will
>>> have multiple requests and on multi-core machines multiple
>>> servers.
>>
>>IMO, it is just that it's an odd approach to load balancing. You are
>>integrating software components, like a web server with a
>>multi-thread-ready listening server, and you are hampering it with a
>>single-thread-only FIFO queuing. It introduces other design
>>considerations. Namely, you will need to consider a store and forward
>>concept for your requests and delayed responses. But if your request
>>processing is very fast, maybe you don't need to worry about it.
>>
>>In practice the "FIFO" would be at the socket level or listening level,
>>with concepts dealing with load balancing by restricting and balancing
>>your connections with worker pools, or simply letting it wait, knowing
>>that processing won't be too long. Some servers have guidelines for
>>waiting limits. For the WEB, I don't recall coming across any specific
>>guideline other than a practical one per implementation. The point is
>>you don't want the customers waiting too long - but what is "too long."
>>
>>> What is your best suggestion for how I can implement the FIFO queue?
>>> (1) I want it to be very fast
>>> (2) I want it to be portable across Unix / Linux / Windows,
>>> and maybe even Mac OS X
>>> (3) I want it to be as robust and fault tolerant as possible.
>>
>>Any good collection class will do as long as you wrap it with
>>synchronization. Example:
>>
>>typedef struct _tagTSlaveData {
>>   ... data per request.........
>>} TSlaveData;
>>
>>class CBucket : public std::list<TSlaveData>
>>{
>>public:
>>   CBucket() { InitializeCriticalSection(&cs); }
>>   ~CBucket() { DeleteCriticalSection(&cs); }
>>
>>   void Add( const TSlaveData &o )
>>   {
>>      EnterCriticalSection(&cs);
>>      insert(end(), o );
>>      LeaveCriticalSection(&cs);
>>   }
>>
>>   BOOL Fetch(TSlaveData &o)
>>   {
>>      EnterCriticalSection(&cs);
>>      BOOL res = !empty();
>>      if (res) {
>>         o = front();
>>         pop_front();
>>      }
>>      LeaveCriticalSection(&cs);
>>      return res;
>>   }
>>private:
>>   CRITICAL_SECTION cs;
>>} Bucket;
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm
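Regarding the "How?": the usual pattern (sketched here from the documented
Win32 API, error handling omitted) is to create a completion port that is not
attached to any file handle and treat PostQueuedCompletionStatus /
GetQueuedCompletionStatus as the enqueue and dequeue operations. The kernel
hands out posted packets in FIFO order and parks the waiting worker threads,
so the port itself is the queue. The Job structure and job count below are
made up for illustration:

#include <windows.h>
#include <cstdio>

struct Job { int id; /* ...request data... */ };

HANDLE g_port;                       // the completion port doubles as the FIFO

DWORD WINAPI Worker(LPVOID) {
    DWORD bytes; ULONG_PTR key; LPOVERLAPPED ov;
    // Blocks until a packet is posted; packets come out in FIFO order.
    while (GetQueuedCompletionStatus(g_port, &bytes, &key, &ov, INFINITE)) {
        Job *job = (Job*)ov;         // the OVERLAPPED slot just carries our pointer
        if (job == NULL) break;      // NULL packet = shutdown request
        printf("thread %lu processing job %d\n", GetCurrentThreadId(), job->id);
        delete job;
    }
    return 0;
}

int main() {
    g_port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
    HANDLE hThread = CreateThread(NULL, 0, Worker, NULL, 0, NULL);

    for (int i = 0; i < 10; i++) {   // producer side: enqueue ten jobs
        Job *job = new Job;
        job->id = i;
        PostQueuedCompletionStatus(g_port, 0, 0, (LPOVERLAPPED)job);
    }

    PostQueuedCompletionStatus(g_port, 0, 0, NULL);   // wake the worker to quit
    WaitForSingleObject(hThread, INFINITE);
    CloseHandle(hThread);
    CloseHandle(g_port);
    return 0;
}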
From: Peter Olcott on 25 Mar 2010 00:14

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
news:snnlq5lfv1a59n003k5gkk8a6plb41ruj1(a)4ax.com...
> See below...
>
> On Tue, 23 Mar 2010 14:11:28 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>Ah so this is the code that you were suggesting?
>>I won't be able to implement multi-threading until volume
>>grows out of what a single core processor can accomplish.
>>I was simply going to use MySQL for the inter-process
>>communication, building and maintaining my FIFO queue.
> ****
> Well, I can think of worse ways. For example, writing the data to a
> floppy disk. Or punching it to paper tape and asking the user to
> re-insert the paper tape. MySQL for interprocess communication? Get
> serious!

Can you think of any other portable way that this can be done? I would
estimate that MySQL would actually keep the FIFO queue resident in RAM
cache.

> ****
>>
>>One other thing that you may be unaware of: std::vector generally
>>beats std::list even for list-based algorithms, including such things
>>as inserting in the middle of the list. The reason for this may be
>>that the expensive memory allocation cost is amortized over more
>>elements with a std::vector, more than enough to pay for the cost of
>>reshuffling a few items. This would probably not work for very long
>>lists. Also, maybe there is some sort of std::list::reserve() that
>>would mitigate this cost.
> ****
> Actually, the vector/list tradeoff has been known since the late
> 1970s; look at the LISP machine papers from that era. Interesting that
> the old ideas keep getting rediscovered over and over again (doesn't
> anybody pay attention to history?)
>        joe
> ****
>>
>>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>>news:O5O%23XiryKHA.5936(a)TK2MSFTNGP04.phx.gbl...
>>> Example usage of the class below; I added an Add()
>>> override to make it easier to add elements for the
>>> specific TSlaveData fields:
>>>
>>> #include <windows.h>
>>> #include <conio.h>
>>> #include <list>
>>> #include <string>
>>> #include <iostream>
>>>
>>> using namespace std;
>>>
>>> const DWORD MAX_JOBS = 10;
>>>
>>> typedef struct _tagTSlaveData {
>>>    DWORD jid;           // job number
>>>    char szUser[256];
>>>    char szPwd[256];
>>>    char szHost[256];
>>> } TSlaveData;
>>>
>>> class CBucket : public std::list<TSlaveData>
>>> {
>>> public:
>>>    CBucket() { InitializeCriticalSection(&cs); }
>>>    ~CBucket() { DeleteCriticalSection(&cs); }
>>>
>>>    void Add( const TSlaveData &o )
>>>    {
>>>       EnterCriticalSection(&cs);
>>>       insert(end(), o );
>>>       LeaveCriticalSection(&cs);
>>>    }
>>>
>>>    void Add(const DWORD jid,
>>>             const char *user,
>>>             const char *pwd,
>>>             const char *host)
>>>    {
>>>       TSlaveData sd = {0};
>>>       sd.jid = jid;
>>>       strncpy(sd.szUser,user,sizeof(sd.szUser));
>>>       strncpy(sd.szPwd,pwd,sizeof(sd.szPwd));
>>>       strncpy(sd.szHost,host,sizeof(sd.szHost));
>>>       Add(sd);
>>>    }
>>>
>>>    BOOL Fetch(TSlaveData &o)
>>>    {
>>>       EnterCriticalSection(&cs);
>>>       BOOL res = !empty();
>>>       if (res) {
>>>          o = front();
>>>          pop_front();
>>>       }
>>>       LeaveCriticalSection(&cs);
>>>       return res;
>>>    }
>>> private:
>>>    CRITICAL_SECTION cs;
>>> } Bucket;
>>>
>>>
>>> void FillBucket()
>>> {
>>>    for (int i = 0; i < MAX_JOBS; i++)
>>>    {
>>>       Bucket.Add(i,"user","password", "host");
>>>    }
>>> }
>>>
>>> //----------------------------------------------------------------
>>> // Main Thread
>>> //----------------------------------------------------------------
>>>
>>> int main(int argc, char *argv[])
>>> {
>>>    FillBucket();
>>>    printf("Bucket Size: %d\n",Bucket.size());
>>>    TSlaveData o = {0};
>>>    while (Bucket.Fetch(o)) {
>>>       printf("%3d | %s\n",o.jid, o.szUser);
>>>    }
>>>    return 0;
>>> }
>>>
>>> Your mongoose, OCR thingie: mongoose will Bucket.Add() and
>>> each spawned OCR thread will do a Bucket.Fetch().
>>>
>>> Do it right, and it ROCKS!
>>>
>>> --
>>> HLS
>>>
>>> Hector Santos wrote:
>>>
>>>> Peter Olcott wrote:
>>>>
>>>>> I still think that the FIFO queue is a good idea. Now I
>>>>> will have multiple requests and on multi-core machines
>>>>> multiple servers.
>>>>
>>>> IMO, it is just that it's an odd approach to load balancing.
>>>> You are integrating software components, like a web
>>>> server with a multi-thread-ready listening server, and
>>>> you are hampering it with a single-thread-only FIFO
>>>> queuing. It introduces other design considerations.
>>>> Namely, you will need to consider a store and forward
>>>> concept for your requests and delayed responses. But if
>>>> your request processing is very fast, maybe you don't
>>>> need to worry about it.
>>>>
>>>> In practice the "FIFO" would be at the socket level or
>>>> listening level, with concepts dealing with load balancing
>>>> by restricting and balancing your connections with worker
>>>> pools, or simply letting it wait, knowing that processing
>>>> won't be too long. Some servers have guidelines for
>>>> waiting limits. For the WEB, I don't recall coming across
>>>> any specific guideline other than a practical one per
>>>> implementation. The point is you don't want the customers
>>>> waiting too long - but what is "too long."
>>>>
>>>>> What is your best suggestion for how I can implement the
>>>>> FIFO queue?
>>>>> (1) I want it to be very fast
>>>>> (2) I want it to be portable across Unix / Linux /
>>>>> Windows, and maybe even Mac OS X
>>>>> (3) I want it to be as robust and fault tolerant as
>>>>> possible.
>>>>
>>>> Any good collection class will do as long as you wrap it
>>>> with synchronization. Example:
>>>>
>>>> typedef struct _tagTSlaveData {
>>>>    ... data per request.........
>>>> } TSlaveData;
>>>>
>>>> class CBucket : public std::list<TSlaveData>
>>>> {
>>>> public:
>>>>    CBucket() { InitializeCriticalSection(&cs); }
>>>>    ~CBucket() { DeleteCriticalSection(&cs); }
>>>>
>>>>    void Add( const TSlaveData &o )
>>>>    {
>>>>       EnterCriticalSection(&cs);
>>>>       insert(end(), o );
>>>>       LeaveCriticalSection(&cs);
>>>>    }
>>>>
>>>>    BOOL Fetch(TSlaveData &o)
>>>>    {
>>>>       EnterCriticalSection(&cs);
>>>>       BOOL res = !empty();
>>>>       if (res) {
>>>>          o = front();
>>>>          pop_front();
>>>>       }
>>>>       LeaveCriticalSection(&cs);
>>>>       return res;
>>>>    }
>>>> private:
>>>>    CRITICAL_SECTION cs;
>>>> } Bucket;
>>>
>>> --
>>> HLS
>>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm
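For anyone pasting Hector's example above into a compiler, two small nits in
the quoted code are worth fixing (the queue logic itself is unchanged): size()
returns a size_t, so %d is the wrong conversion on a 64-bit build, and strncpy
does not null-terminate when the source fills the buffer. Something along
these lines:

// size() is size_t; cast it (or use %Iu / %zu) rather than passing it to %d.
printf("Bucket Size: %u\n", (unsigned)Bucket.size());

// strncpy leaves the string unterminated if the source is 256+ chars;
// copy one byte less and terminate explicitly.
strncpy(sd.szUser, user, sizeof(sd.szUser) - 1);
sd.szUser[sizeof(sd.szUser) - 1] = '\0';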
From: Peter Olcott on 25 Mar 2010 00:15
"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message news:snnlq5lfv1a59n003k5gkk8a6plb41ruj1(a)4ax.com... > See below... > > On Tue, 23 Mar 2010 14:11:28 -0500, "Peter Olcott" > <NoSpam(a)OCR4Screen.com> wrote: > >>Ah so this is the code that you were suggesting? >>I won't be able to implement multi-threading until volume >>grows out of what a single core processor can accomplish. >>I was simply going to use MySQL for the inter-process >>communication, building and maintaining my FIFO queue. > **** > Well, I can think of worse ways. For example, writing the > data to a floppy disk. Or > punching it to paper tape and asking the user to re-insert > the paper tape. MySQL for > interprocess communication? Get serious! This was suggested on the comp.programming,threads group Cross Platform FIFO Queue > **** >> >>One thing else that you may be unaware of std::vector >>generally beats std::list even for list based algorithms, >>including such things as inserting in the middle of the >>list. The reason for this may be that the expensive memory >>allocation cost is allocated over more elements with a >>std::vector, more than enough to pay for the cost of >>reshuffling a few items. This would probably not work for >>very long lists. Also, maybe there is some sort of >>std::list::reserve(), that would mitigate this cost. > **** > Actually, the vector/list tradeoff has been known since > the late 1970s; look at the LISP > machine papers from that era. Interesting that the old > ideas keep getting rediscovered > over and over again (doesn't anybody pay attention to > history?) > joe > **** >> >>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in >>message >>news:O5O%23XiryKHA.5936(a)TK2MSFTNGP04.phx.gbl... >>> Example usage of the class below, I added an Add() >>> override to make it easier to add elements for the >>> specific TSlaveData fields: >>> >>> #include <windows.h> >>> #include <conio.h> >>> #include <list> >>> #include <string> >>> #include <iostream> >>> >>> using namespace std; >>> >>> const DWORD MAX_JOBS = 10; >>> >>> typedef struct _tagTSlaveData { >>> DWORD jid; // job number >>> char szUser[256]; >>> char szPwd[256]; >>> char szHost[256]; >>> } TSlaveData; >>> >>> class CBucket : public std::list<TSlaveData> >>> { >>> public: >>> CBucket() { InitializeCriticalSection(&cs); } >>> ~CBucket() { DeleteCriticalSection(&cs); } >>> >>> void Add( const TSlaveData &o ) >>> { >>> EnterCriticalSection(&cs); >>> insert(end(), o ); >>> LeaveCriticalSection(&cs); >>> } >>> >>> void Add(const DWORD jid, >>> const char *user, >>> const char *pwd, >>> const char *host) >>> { >>> TSlaveData sd = {0}; >>> sd.jid = jid; >>> strncpy(sd.szUser,user,sizeof(sd.szUser)); >>> strncpy(sd.szPwd,pwd,sizeof(sd.szPwd)); >>> strncpy(sd.szHost,host,sizeof(sd.szHost)); >>> Add(sd); >>> } >>> >>> BOOL Fetch(TSlaveData &o) >>> { >>> EnterCriticalSection(&cs); >>> BOOL res = !empty(); >>> if (res) { >>> o = front(); >>> pop_front(); >>> } >>> LeaveCriticalSection(&cs); >>> return res; >>> } >>> private: >>> CRITICAL_SECTION cs; >>> } Bucket; >>> >>> >>> void FillBucket() >>> { >>> for (int i = 0; i < MAX_JOBS; i++) >>> { >>> Bucket.Add(i,"user","password", "host"); >>> } >>> } >>> >>> //---------------------------------------------------------------- >>> // Main Thread >>> //---------------------------------------------------------------- >>> >>> int main(char argc, char *argv[]) >>> { >>> >>> FillBucket(); >>> printf("Bucket Size: %d\n",Bucket.size()); >>> TSlaveData o = {0}; >>> while (Bucket.Fetch(o)) { 
>>> printf("%3d | %s\n",o.jid, o.szUser); >>> } >>> return 0; >>> } >>> >>> Your mongoose, OCR thingie, mongoose will Bucket.Add() >>> and >>> each spawned OCR thread will do a Bucket.Fetch(). >>> >>> Do it right, it and ROCKS! >>> >>> -- >>> HLS >>> >>> Hector Santos wrote: >>> >>>> Peter Olcott wrote: >>>> >>>>> I still think that the FIFO queue is a good idea. Now >>>>> I >>>>> will have multiple requests and on multi-core machines >>>>> multiple servers. >>>> >>>> >>>> IMO, it just that its an odd approach to load >>>> balancing. >>>> You are integrating software components, like a web >>>> server with an multi-thread ready listening server and >>>> you are hampering it with a single thread only FIFO >>>> queuing. It introduces other design considerations. >>>> Namely, you will need to consider a store and forward >>>> concept for your request and delayed responses. But if >>>> your request processing is very fast, maybe you don't >>>> need to worry about it. >>>> >>>> In practice the "FIFO" would be at the socket level or >>>> listening level with concepts dealing with load >>>> balancing >>>> by restricting and balancing your connection with >>>> worker >>>> pools or simply letting it to wait knowing that >>>> processing won't be too long. Some servers have >>>> guidelines for waiting limits. For the WEB, I am not >>>> recall coming across any specific guideline other than >>>> a >>>> practical one per implementation. The point is you >>>> don't >>>> want the customers waiting too long - but what is "too >>>> long." >>>> >>>>> What is your best suggestion for how I can implement >>>>> the >>>>> FIFO queue? >>>>> (1) I want it to be very fast >>>>> (2) I want it to be portable across Unix / Linux / >>>>> Windows, and maybe even Mac OS X >>>>> (3) I want it to be as robust and fault tolerant as >>>>> possible. >>>> >>>> >>>> Any good collection class will do as long as you wrap >>>> it >>>> with synchronization. Example: >>>> >>>> >>>> typedef struct _tagTSlaveData { >>>> ... data per request......... >>>> } TSlaveData; >>>> >>>> class CBucket : public std::list<TSlaveData> >>>> { >>>> public: >>>> CBucket() { InitializeCriticalSection(&cs); } >>>> ~CBucket() { DeleteCriticalSection(&cs); } >>>> >>>> void Add( const TSlaveData &o ) >>>> { >>>> EnterCriticalSection(&cs); >>>> insert(end(), o ); >>>> LeaveCriticalSection(&cs); >>>> } >>>> >>>> BOOL Fetch(TSlaveData &o) >>>> { >>>> EnterCriticalSection(&cs); >>>> BOOL res = !empty(); >>>> if (res) { >>>> o = front(); >>>> pop_front(); >>>> } >>>> LeaveCriticalSection(&cs); >>>> return res; >>>> } >>>> private: >>>> CRITICAL_SECTION cs; >>>> } Bucket; >>>> >>>> >>>> >>>> >>> >>> >>> >>> -- >>> HLS >> > Joseph M. Newcomer [MVP] > email: newcomer(a)flounder.com > Web: http://www.flounder.com > MVP Tips: http://www.flounder.com/mvp_tips.htm |