Prev: const is an overrated concept that is a source of extra typing and maintenance
Next: const is an overrated concept that is a source of extra typing and maintenan
From: gast128 on 2 Apr 2010 07:08 On 2 apr, 13:45, Goran <goran.pu...(a)gmail.com> wrote: > On Mar 31, 6:43 pm, "gast...(a)hotmail.com" <gast...(a)hotmail.com> wrote: > > > Hello all, > > > I am not sure if this is the correct newsgroup, but the information > > might be helpful. > > > The (stdext::) hash_map in combination with strings (used on VStudio > > 2003 / vc7.1) does not have a good performance. It's actual slower > > than std::map<string, ...>. I noticed however that changing the hash > > function (with boost::hash_value for example) does make it fast again. > > Perhaps there is an issue with the hash function? A small test with > > random strings (up to the a length of 2000) gave the following > > results: > > > stdext::hash_value with string: > > Mean: 3.73593e+009 > > Sum: 373592873859453 > > Min: 3735928607 > > Max: 3735928779 > > Var: 919.648 > > > boost::hash_value with string: > > Mean: 2.67717e+009 > > Sum: 267716741970601 > > Min: 1230214564 > > Max: 4217158627 > > Var: 2.11744e+018 > > > It seems that stdext::hash_value has a very narrow distribution, which > > may the root of the bad performance. > > > note: I am aware of the tr1 / boost unordered_map which will superceed > > it. However this still may be an important issue for developpers who > > dropped hash containers due to bad experience with above combination > > (like myself). > > Well, what is your use-case? If it's slower than std::map as it is, > why are you even bothering using a hash map (that is, what, if > anything, do you gain over a lowly map for your use-case with the > boost's hash)? > > Also, a reminder for others who might be reading this: > > 1. any algo is fast for a small value of n (related empirical fact: > optimization is the root of all evil) > 2. hardware likes linear searching due to better locality of > references > 2. think about your data first. Do you want/will want it ordered? If > so, use an ordered container (set or map). > 3. did you verify that hashing really is significantly more performant > than an ordered container? { edits: quoted blank line block, signature, banner and Dutch (?) line removed. please keep readers in mind when you quote. thank you. -mod } I think this is somewhat besides the case I am rying to make. In many books can be found that hash maps often out perform balanced trees. With that knowledge I used the hash_map with strings years ago (in some kind of q&d profiler which needed fast lookup). However I noticed that it had a bad performance, so I dropped the hash_map and used the standard map. Also I was wondering if this orginal claim was correct. Until recently I started to use the boost::unordered_map and noticed its fast perfromance. It even outperformed a sorted vector (with contiguous memory which should be fast). So I was wondering why this hash_map was slow. According to my obsevation now it isn't, only it is negatively affected by its default hasher. Maybe this is also something from Mr. Plauger to react on, but he may have already dropped the hash_map (with its default hasher) in favor of the tr1::unordered_map. And yes everything is dependent on the size of the containers. However bad performance is often caused by large containers and O(n^2) behavior. In typically small container contexts it doesn't make much difference what u use. Often the linear std::find with std::vector has then the best performance. That will change as soon as that container gets heavily loaded. -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |