From: Veloz on 2 Mar 2010 11:29

Hi all
I'm looking for a queue that I can use with multiprocessing, which has
a peek method.

I've seen some discussion about queue.peek but don't see anything in
the docs about it.

Does python have a queue class with peek semantics?
Michael
From: Raymond Hettinger on 2 Mar 2010 13:18

On Mar 2, 8:29 am, Veloz <michaelve...(a)gmail.com> wrote:
> Hi all
> I'm looking for a queue that I can use with multiprocessing, which has
> a peek method.
>
> I've seen some discussion about queue.peek but don't see anything in
> the docs about it.
>
> Does python have a queue class with peek semantics?

Am curious about your use case?  Why peek at something
that could be gone by the time you want to use it?

    val = q.peek()
    if something_i_want(val):
        v2 = q.get()    # this could be different than val

Wouldn't it be better to just get() the value and return it to the
queue if you don't need it?

    val = q.get()
    if not something_i_want(val):
        q.put(val)

Raymond
From: Veloz on 2 Mar 2010 14:02

On Mar 2, 1:18 pm, Raymond Hettinger <pyt...(a)rcn.com> wrote:
> On Mar 2, 8:29 am, Veloz <michaelve...(a)gmail.com> wrote:
>
> > Hi all
> > I'm looking for a queue that I can use with multiprocessing, which has
> > a peek method.
>
> > I've seen some discussion about queue.peek but don't see anything in
> > the docs about it.
>
> > Does python have a queue class with peek semantics?
>
> Am curious about your use case?  Why peek at something
> that could be gone by the time you want to use it?
>
>     val = q.peek()
>     if something_i_want(val):
>         v2 = q.get()    # this could be different than val
>
> Wouldn't it be better to just get() the value and return it to the
> queue if you don't need it?
>
>     val = q.get()
>     if not something_i_want(val):
>         q.put(val)
>
> Raymond

Yeah, I hear you. Perhaps a queue is not the best solution. My highest
level use case is this: the user visits a web page (my app is a Pylons
app) and requests that a "report" be created. The report takes too long
to create and display on the spot, so the user expects to visit some
URL "later" and see whether the specific report has completed, and if
so, have it returned to them.

At a lower level, I'm thinking of using some worker processes to create
these reports in the background; there'd be a request queue (into which
requests for reports would go, each with an ID) and a completion queue,
into which the workers would write an entry when a report was created,
along with an ID matching the original request.

The "peek" part comes in when the user comes back later to see whether
their report is done. That is, in my page controller logic, I'd like to
look through the completion queue and see whether the specific report
has been finished (I could tell by matching the ID of the original
request to the ID in the completion queue). If there was an item in the
queue matching the ID, it would be removed.

It's since occurred to me that perhaps a queue is not the best way to
handle the completions. (We're ignoring the file system as a solution
for the time being, and focusing on in-memory structures.) I'm
wondering now if a simple array of completed items wouldn't be better.
Of course, all access to the array would have to be thread/process-
safe. As you pointed out, for example, multi-part operations such as
"is such-and-such an ID in the list? If so, remove it and return it"
would have to be treated atomically to avoid concurrency issues.

Any thoughts on this design approach are welcomed :-)
Michael
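A minimal sketch of the two-queue arrangement described above, assuming
multiprocessing.Queue for both queues; the worker function, the
build_report placeholder, the (job_id, params) tuple layout and the
None sentinel are illustrative assumptions, not anything specified in
the thread:

    import multiprocessing

    def build_report(params):
        # stand-in for the slow report-generation work
        return "report for %r" % (params,)

    def worker(request_q, completion_q):
        while True:
            job_id, params = request_q.get()
            if job_id is None:            # sentinel: shut this worker down
                break
            completion_q.put((job_id, build_report(params)))

    if __name__ == '__main__':
        request_q = multiprocessing.Queue()
        completion_q = multiprocessing.Queue()
        workers = [multiprocessing.Process(target=worker,
                                           args=(request_q, completion_q))
                   for _ in range(2)]
        for w in workers:
            w.start()

        request_q.put((42, {'year': 2009}))   # enqueue a report request
        job_id, report = completion_q.get()   # blocks until some report is ready
        print("%s: %s" % (job_id, report))

        for _ in workers:
            request_q.put((None, None))       # stop the workers
        for w in workers:
            w.join()

Note that completion_q.get() here simply pulls whichever report
finished first; checking the completion queue for one specific ID is
exactly the lookup problem the rest of the thread turns to.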
From: MRAB on 2 Mar 2010 14:44

Veloz wrote:
> On Mar 2, 1:18 pm, Raymond Hettinger <pyt...(a)rcn.com> wrote:
>> On Mar 2, 8:29 am, Veloz <michaelve...(a)gmail.com> wrote:
>>
>>> Hi all
>>> I'm looking for a queue that I can use with multiprocessing, which has
>>> a peek method.
>>> I've seen some discussion about queue.peek but don't see anything in
>>> the docs about it.
>>> Does python have a queue class with peek semantics?
>>
>> Am curious about your use case?  Why peek at something
>> that could be gone by the time you want to use it?
>>
>>     val = q.peek()
>>     if something_i_want(val):
>>         v2 = q.get()    # this could be different than val
>>
>> Wouldn't it be better to just get() the value and return it to the
>> queue if you don't need it?
>>
>>     val = q.get()
>>     if not something_i_want(val):
>>         q.put(val)
>>
>> Raymond
>
> Yeah, I hear you. Perhaps a queue is not the best solution. My highest
> level use case is this: the user visits a web page (my app is a Pylons
> app) and requests that a "report" be created. The report takes too long
> to create and display on the spot, so the user expects to visit some
> URL "later" and see whether the specific report has completed, and if
> so, have it returned to them.
>
> At a lower level, I'm thinking of using some worker processes to create
> these reports in the background; there'd be a request queue (into which
> requests for reports would go, each with an ID) and a completion queue,
> into which the workers would write an entry when a report was created,
> along with an ID matching the original request.
>
> The "peek" part comes in when the user comes back later to see whether
> their report is done. That is, in my page controller logic, I'd like to
> look through the completion queue and see whether the specific report
> has been finished (I could tell by matching the ID of the original
> request to the ID in the completion queue). If there was an item in the
> queue matching the ID, it would be removed.
>
> It's since occurred to me that perhaps a queue is not the best way to
> handle the completions. (We're ignoring the file system as a solution
> for the time being, and focusing on in-memory structures.) I'm
> wondering now if a simple array of completed items wouldn't be better.
> Of course, all access to the array would have to be thread/process-
> safe. As you pointed out, for example, multi-part operations such as
> "is such-and-such an ID in the list? If so, remove it and return it"
> would have to be treated atomically to avoid concurrency issues.
>
> Any thoughts on this design approach are welcomed :-)
>
A set of completed reports, or a dict with the ID as the key? The
advantage of a dict is that the value could contain several bits of
information, such as when it was completed, the status (OK or failed),
etc. You might want to wrap it in a class with locks (mutexes) to
ensure it's threadsafe.
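One possible shape for the lock-wrapped dict suggested here; the class
and method names are invented for illustration, and the lock only
covers threads within a single process (separate worker processes would
need something like a multiprocessing.Manager dict or a shared lock
instead):

    import threading

    class CompletedReports(object):
        """Completed reports keyed by request ID, guarded by a lock."""

        def __init__(self):
            self._lock = threading.Lock()
            self._reports = {}

        def add(self, job_id, report):
            with self._lock:
                self._reports[job_id] = report

        def pop_if_done(self, job_id):
            # The compound "is it there? if so, remove and return it"
            # happens under one lock acquisition, so it is atomic with
            # respect to other threads using this object.
            with self._lock:
                return self._reports.pop(job_id, None)

The page controller could then call pop_if_done(request_id) and either
return the finished report or tell the user to check back later.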
From: Martin P. Hellwig on 2 Mar 2010 14:58
On 03/02/10 19:44, MRAB wrote:
<cut>
> information, such as when it was completed, the status (OK or failed),
> etc. You might want to wrap it in a class with locks (mutexes) to ensure
> it's threadsafe.

What actually happens if multiple threads write to a shared dictionary
at the same time (not using the same key)? I would think that if the
hashing part of the dictionary has some sort of serialization (please
forgive me if I misuse a term), it should 'just work'(tm)?

--
mph
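To make the scenario in that question concrete, here is a small
illustrative script in which several threads each write only their own
keys into one shared dict; it demonstrates the situation being asked
about and is not offered as proof of what the language guarantees:

    import threading

    shared = {}

    def writer(thread_id):
        for i in range(1000):
            shared[(thread_id, i)] = i    # each thread uses distinct keys

    threads = [threading.Thread(target=writer, args=(n,)) for n in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(len(shared))    # 4000 if every write landed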