From: Peter Olcott on 24 Mar 2010 11:09

"Oliver Regenfelder" <oliver.regenfelder(a)gmx.at> wrote in message
news:c9efd$4ba9f330$547743c7$24260(a)news.inode.at...
> Hello Peter,
>
> Peter Olcott wrote:
>> If I was going to do it for real it would be one large
>> single std::vector of std::vectors shared across multiple
>> threads. I really don't want to deal with Virtual Memory
>> issues at all. I want to do everything that I reasonably
>> can to make them all moot. This may eventually require a
>> real-time OS.
>
> So you are using a std::vector of std::vectors inside your
> application?
>
> Something like:
>
> std::vector<std::vector<int> >
>
> Best Regards,
>
> Oliver

Yes, this is part of the new design.
From: Joseph M. Newcomer on 24 Mar 2010 23:55

See below...

On Tue, 23 Mar 2010 09:16:34 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>news:u8xPnamyKHA.2552(a)TK2MSFTNGP04.phx.gbl...
>> Hector Santos wrote:
>>
>>> PS: I noticed the rand() % size is too short; rand() is
>>> limited to RAND_MAX, which is 32K. Change that to:
>>>
>>> (rand()*rand()) % size
>>>
>>> to get a random range from 0 to size-1. I think that's
>>> right; maybe Joe can give us a good random generator
>>> here, but the above does seem to provide practical,
>>> decent randomness for this task.
>>
>> Peter, using the above RNG seems to be a better test since
>> it hits a wider spectrum. With the earlier one, it was
>> only hitting ranges up to 32K.
>>
>> I also noticed that when the 32K RNG was used, a dynamic
>> array was 1 to 6 faster than using std::vector. But when
>> using the above RNG, they were both about the same. That
>> is interesting.
>>
>> --
>> HLS
>
>I made this adaptation and it slowed down by about 500%, a
>much smaller cache hit ratio. It still scaled up to four
>cores with 1.5 GB each, and four concurrent processes only
>took about 50% more than a single process.
>
****
If you use multiple threads in a single process, or multiple processes
with a shared memory segment implemented by a Memory Mapped File, you
will reduce the footprint, since either all threads (in one process)
share the same memory footprint, or the multiple processes share
largely the same memory footprint.
****
>I will probably engineer my new technology to be able to
>handle multiple threads, if all that I have to do is
>implement the heuristics that I mentioned. Since my first
>server will only have a single core, on this server it will
>only have a single thread.
>
>I still think that the FIFO queue is a good idea. Now I will
>have multiple requests and on multi-core machines multiple
>servers.
>
>What is your best suggestion for how I can implement the
>FIFO queue?
>(1) I want it to be very fast.
>(2) I want it to be portable across Unix / Linux / Windows,
>and maybe even Mac OS X.
>(3) I want it to be as robust and fault tolerant as
>possible.
>
>It may simply provide 32-bit hexadecimal integer names of
>input PNG files. These names roll over to the next
>incremental number. Instead, they may be HTTP connection
>numbers. I have to learn about HTTP before I will know what
>I am doing here. Since all customers (including free trial
>customers) will have accounts with valid email addresses, I
>can always email the results if the connection is lost.
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Joseph M. Newcomer on 24 Mar 2010 23:56

A multithreaded FIFO queue would make more sense; less chance of
priority inversion effects. An I/O Completion Port makes a wonderful
multithreaded FIFO queue with practically no effort!
                                joe

On Tue, 23 Mar 2010 14:28:56 -0400, Hector Santos <sant9442(a)nospam.gmail.com> wrote:

>Peter Olcott wrote:
>
>> I still think that the FIFO queue is a good idea. Now I will
>> have multiple requests and on multi-core machines multiple
>> servers.
>
>IMO, it's just that it's an odd approach to load balancing. You are
>integrating software components, like a web server with a
>multi-thread-ready listening server, and you are hampering it with
>single-thread-only FIFO queuing. It introduces other design
>considerations. Namely, you will need to consider a store-and-forward
>concept for your requests and delayed responses. But if your request
>processing is very fast, maybe you don't need to worry about it.
>
>In practice the "FIFO" would be at the socket or listening level,
>with load balancing done by restricting and balancing your
>connections with worker pools, or simply letting a connection wait,
>knowing that processing won't take too long. Some servers have
>guidelines for waiting limits. For the Web, I do not recall coming
>across any specific guideline other than a practical one per
>implementation. The point is you don't want the customers waiting
>too long - but what is "too long"?
>
>> What is your best suggestion for how I can implement the
>> FIFO queue?
>> (1) I want it to be very fast.
>> (2) I want it to be portable across Unix / Linux / Windows,
>> and maybe even Mac OS X.
>> (3) I want it to be as robust and fault tolerant as
>> possible.
>
>Any good collection class will do as long as you wrap it with
>synchronization. Example:
>
>typedef struct _tagTSlaveData {
>    ... data per request.........
>} TSlaveData;
>
>class CBucket : public std::list<TSlaveData>
>{
>public:
>    CBucket()  { InitializeCriticalSection(&cs); }
>    ~CBucket() { DeleteCriticalSection(&cs); }
>
>    void Add( const TSlaveData &o )
>    {
>       EnterCriticalSection(&cs);
>       insert(end(), o );
>       LeaveCriticalSection(&cs);
>    }
>
>    BOOL Fetch(TSlaveData &o)
>    {
>       EnterCriticalSection(&cs);
>       BOOL res = !empty();
>       if (res) {
>          o = front();
>          pop_front();
>       }
>       LeaveCriticalSection(&cs);
>       return res;
>    }
>private:
>    CRITICAL_SECTION cs;
>} Bucket;

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Joseph M. Newcomer on 24 Mar 2010 23:59

See below...

On Tue, 23 Mar 2010 14:11:28 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>Ah, so this is the code that you were suggesting?
>I won't be able to implement multi-threading until volume
>grows beyond what a single core processor can accomplish.
>I was simply going to use MySQL for the inter-process
>communication, building and maintaining my FIFO queue.
****
Well, I can think of worse ways. For example, writing the data to a
floppy disk. Or punching it to paper tape and asking the user to
re-insert the paper tape. MySQL for interprocess communication? Get
serious!
****
>
>One thing else that you may be unaware of: std::vector
>generally beats std::list even for list-based algorithms,
>including such things as inserting in the middle of the
>list. The reason for this may be that the expensive memory
>allocation cost is amortized over more elements with a
>std::vector, more than enough to pay for the cost of
>reshuffling a few items. This would probably not work for
>very long lists. Also, maybe there is some sort of
>std::list::reserve() that would mitigate this cost.
****
Actually, the vector/list tradeoff has been known since the late
1970s; look at the LISP machine papers from that era. Interesting that
the old ideas keep getting rediscovered over and over again (doesn't
anybody pay attention to history?)
                                joe
****
>
>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>news:O5O%23XiryKHA.5936(a)TK2MSFTNGP04.phx.gbl...
>> Example usage of the class below; I added an Add()
>> override to make it easier to add elements for the
>> specific TSlaveData fields:
>>
>> #include <windows.h>
>> #include <conio.h>
>> #include <list>
>> #include <string>
>> #include <iostream>
>>
>> using namespace std;
>>
>> const DWORD MAX_JOBS = 10;
>>
>> typedef struct _tagTSlaveData {
>>     DWORD jid;          // job number
>>     char  szUser[256];
>>     char  szPwd[256];
>>     char  szHost[256];
>> } TSlaveData;
>>
>> class CBucket : public std::list<TSlaveData>
>> {
>> public:
>>     CBucket()  { InitializeCriticalSection(&cs); }
>>     ~CBucket() { DeleteCriticalSection(&cs); }
>>
>>     void Add( const TSlaveData &o )
>>     {
>>        EnterCriticalSection(&cs);
>>        insert(end(), o );
>>        LeaveCriticalSection(&cs);
>>     }
>>
>>     void Add(const DWORD jid,
>>              const char *user,
>>              const char *pwd,
>>              const char *host)
>>     {
>>        TSlaveData sd = {0};
>>        sd.jid = jid;
>>        strncpy(sd.szUser,user,sizeof(sd.szUser));
>>        strncpy(sd.szPwd,pwd,sizeof(sd.szPwd));
>>        strncpy(sd.szHost,host,sizeof(sd.szHost));
>>        Add(sd);
>>     }
>>
>>     BOOL Fetch(TSlaveData &o)
>>     {
>>        EnterCriticalSection(&cs);
>>        BOOL res = !empty();
>>        if (res) {
>>           o = front();
>>           pop_front();
>>        }
>>        LeaveCriticalSection(&cs);
>>        return res;
>>     }
>> private:
>>     CRITICAL_SECTION cs;
>> } Bucket;
>>
>>
>> void FillBucket()
>> {
>>     for (int i = 0; i < MAX_JOBS; i++)
>>     {
>>        Bucket.Add(i,"user","password", "host");
>>     }
>> }
>>
>> //----------------------------------------------------------------
>> // Main Thread
>> //----------------------------------------------------------------
>>
>> int main(int argc, char *argv[])
>> {
>>     FillBucket();
>>     printf("Bucket Size: %d\n",Bucket.size());
>>     TSlaveData o = {0};
>>     while (Bucket.Fetch(o)) {
>>        printf("%3d | %s\n",o.jid, o.szUser);
>>     }
>>     return 0;
>> }
>>
>> In your mongoose/OCR thingie, mongoose will Bucket.Add() and
>> each spawned OCR thread will do a Bucket.Fetch().
>>
>> Do it right, and it ROCKS!
>>
>> --
>> HLS
>>
>> Hector Santos wrote:
>>
>>> [earlier post quoted in full; snipped - see Hector's reply above]
>

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Joseph M. Newcomer on 25 Mar 2010 00:00
An IOCP with a multithreaded application will do much better than what
you are proposing.
                                joe

On Tue, 23 Mar 2010 13:54:30 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>news:epsZzaryKHA.5360(a)TK2MSFTNGP06.phx.gbl...
>> Peter Olcott wrote:
>>
>>> I still think that the FIFO queue is a good idea. Now I
>>> will have multiple requests and on multi-core machines
>>> multiple servers.
>>
>> IMO, it's just that it's an odd approach to load balancing.
>> You are integrating software components, like a web server
>> with a multi-thread-ready listening server, and you are
>> hampering it with single-thread-only FIFO queuing. It
>> introduces other design considerations. Namely, you will
>> need to consider a store-and-forward concept for your
>> requests and delayed responses. But if your request
>> processing is very fast, maybe you don't need to worry
>> about it.
>
>I will probably be implementing the FIFO queue using MySQL.
>I want customers to have strict first-in, first-out
>priority; the only thing that will change this is that
>multiple servers are now feasible. If it costs me an extra
>1/2 % of amortized response rate, then this is OK.
>
>Because multiple servers are now available, I will drop the
>idea of more than one queue priority. I know that database
>access is slow, but this will be amortized over a much
>longer processing time. This solution should be simple and
>portable. Also, the database probably has a lot of caching
>going on, so it probably won't even cost me the typical disk
>access time of 5 ms.
>
>> In practice the "FIFO" would be at the socket or
>> listening level, with load balancing done by restricting
>> and balancing your connections with worker pools, or
>> simply letting a connection wait, knowing that processing
>> won't take too long. Some servers have guidelines for
>> waiting limits.
>> For the Web, I do not recall coming
>> across any specific guideline other than a practical one
>> per implementation. The point is you don't want the
>> customers waiting too long - but what is "too long"?
>
>This sounds like a more efficient approach, but it may lack
>portability, and it may take longer to get operational. I
>will already have some sort of SQL learning curve to
>implement my authentication database. This sounds like an
>extra learning curve over and above everything else. There
>are only so many 1000-page books that I can read in a finite
>amount of time.
>
>> [Hector's CBucket example quoted in full; snipped - see above]
>>
>> --
>> HLS
>

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm