From: ldh on 9 Jul 2010 23:07 Hello- Can anyone please clarify for me, does tr1::unordered_map guarantee that iterators remain valid always (except iterators to items that have been removed?). This was definitely the case with the old SGI implementation. I am trying to transition from gcc's __gnu_cxx::hash_map to std::tr1::unordered_map in gcc 4.5, and I am getting crashes that seem like it's not safe to store an iterator to objects and use it later (after many other insertions and erasures of other objects) to delete that object. Before I look deeper into the gcc implementation I just wanted to make sure it's the case, that according to the standard the iterators should remain valid. Thanks! -Lewis -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Pedro Lamarão on 11 Jul 2010 04:47 On 10 jul, 11:07, ldh <lhy...(a)gmail.com> wrote: > Can anyone please clarify for me, does tr1::unordered_map guarantee > that iterators remain valid always (except iterators to items that > have been removed?). This was definitely the case with the old SGI > implementation. I am trying to transition from gcc's > __gnu_cxx::hash_map to std::tr1::unordered_map in gcc 4.5, and I am > getting crashes that seem like it's not safe to store an iterator to > objects and use it later (after many other insertions and erasures of > other objects) to delete that object. Before I look deeper into the > gcc implementation I just wanted to make sure it's the case, that > according to the standard the iterators should remain valid. Thanks! I'm not sure about TR1, but the draft for ISO C++0x dated 2010-03-26 says in section [unord.req], paragraphs 13 and 14: 13. The insert members shall not affect the validity of references to container elements, but may invalidate all iterators to the container. The erase members shall invalidate only iterators and references to the erased elements. 14. The insert members shall not affect the validity of iterators if (N +n) < z * B, where N is the number of elements in the container prior to the insert operation, n is the number of elements inserted, B is the container�s bucket count, and z is the container�s maximum load factor. Note this section is actually about models of Unordered Associative Containers, which includes unordered_map, unordered_set etc. GCC's implementation for TR1 and C++0x is probably the same. -- P. -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Marek Vondrak on 11 Jul 2010 18:52 > > Can anyone please clarify for me, does tr1::unordered_map guarantee > > that iterators remain valid always (except iterators to items that > > have been removed?). This was definitely the case with the old SGI > > implementation. I posted exactly the same question two years ago and unfortunatelly did not get any satisfactory answer. Yes, TR1 iterators get invalidated also on inserts, but I do not see why it has to be so. SGI style hash (or the ones from Dinkumware) containers do not have these restrictions and, consequently, provide the same guarantees on iterators like std::map which is really convenient. -Marek -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: ldh on 12 Jul 2010 17:33 On Jul 12, 5:52 am, Marek Vondrak <marek.vond...(a)gmail.com> wrote: > > > Can anyone please clarify for me, does tr1::unordered_map guarantee > > > that iterators remain valid always (except iterators to items that > > > have been removed?). This was definitely the case with the old SGI > > > implementation. > > I posted exactly the same question two years ago and unfortunatelly > did not get any satisfactory answer. Yes, TR1 iterators get > invalidated also on inserts, but I do not see why it has to be so. SGI > style hash (or the ones from Dinkumware) containers do not have these > restrictions and, consequently, provide the same guarantees on > iterators like std::map which is really convenient. OK thanks for the responses. I think I understand the reason, it is much more flexible and perhaps more efficient not to require the iterator to remain always valid; in order to guarantee validity of existing iterators in the presence of rehashing, one must avoid storing a pointer to the element's bucket inside the iterator. The SGI implementation of an iterator just contains a pointer to the node, and a pointer to the hash table; when you increment it and it hits the end of the bucket, it has to re-calculate the hash value and look up the correct bucket so it can find the next one. This makes calling the form of erase() that takes an iterator less efficient than you might think -- you still have to hash the key to find out what bucket you are erasing from, so that you can re-link the other nodes in that bucket correctly, and you still have to iterate through the whole list unless it is doubly linked. But the main reason to force always-valid iterators is presumably to enable calling erase() with that iterator later, so it may end up giving you very little benefit for the price you pay. In any case, it seems clear what the newly standardized behavior will be, so that's what I needed. Thanks. -Lewis -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Marek Vondrak on 13 Jul 2010 11:51
> The SGI > implementation of an iterator just contains a pointer to the node, and > a pointer to the hash table; when you increment it and it hits the end > of the bucket, it has to re-calculate the hash value and look up the > correct bucket so it can find the next one. As I argued last year, it does not have to be done this way. Each iterator only needs to point to a node object. Node objects can be spliced into buckets using an intrusive list and at the same time they can be spliced to a global list of all nodes in the container. At any time instant, the global list equals some concatenation of bucket lists (in some random order). The global list is used to enumerate all objects in the container, bucket lists are used to find ranges of objects with the same key (subsets of bucket lists). Using the global list, one can hop from a bucket to bucket without doing any additional searches or evaluations of node's hash values and using the node's bucket lists, one can efficiently find places to insert new nodes to. I used to have an implementation of hash maps and multimaps that worked this way. The time complexities were as follows: insert: time to find a bucket O( 1 ) + time to find a place within bucket \approx number_of_items_in_container / number_of_buckets (when using a hash function that distributes elements uniformly) erase: O( 1 ), invalidates only iterators and pointers pointing to the deleted object find: time to find a bucket O( 1 ) + time to find an element within bucket \approx number_of_items_in_container / number_of_buckets find in multimap: as above iterator::operator++() and operator--(): O( 1 ) -Marek -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |