From: bernard on 13 Apr 2010 01:42 Dear experts, I would like to create a custom string class to be used as the most efficient key possible in a map . Hash or tree, I'll have to bench to decide, so in fact I may have to create two custom string classes :) 1�) one tailored for fast compare (less) 2�) one tailored for fast hash and equality. However both will be based on the same principle of trading space for speed following two ideas : 1�) small string optimization : an array of (local_size) data will be stored directly in the object, resorting to dynamic allocation only if needed 2�) storing 8 bits chars (ASCII) in wider types to enable vector operations (As you can guess, I also intend to venture out of C++ standard into stronger alignment specifications to use vector builtin functions , but a first implementation could rely on the compiler generated vectorization of standard C++) Hence the subject : template<unsigned char local_size, typename wide_type> fast_string{...}; While I can't believe that I would be the first to implement such a class, I could not find an example on the net, nor by searching this newsgroup. Would you have some advices/examples wrt to implementation strat�gies ? (0 termination or storing length, union of wide_type local_data[local_size] and wide_type* far_data or always storing the first chars in local data, etc.) I'd also request advices on how best to handle the over stringent (wrt standard C++) requirements of vector instructions if this could be deemed not too off-topic for the list. Best regards, Bernard -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Ulrich Eckhardt on 13 Apr 2010 20:20 bernard wrote: > I would like to create a custom string class to be used as the most > efficient key possible in a map . > Hash or tree, I'll have to bench to decide, so in fact I may have to > create two custom string classes :) > 1°) one tailored for fast compare (less) > 2°) one tailored for fast hash and equality. > > However both will be based on the same principle of trading space for > speed following two ideas : > > 1°) small string optimization : an array of (local_size) data will be > stored directly in the object, resorting to dynamic allocation only if > needed Some implementations of std::string and std::wstring already do that, like e.g. STLport. > 2°) storing 8 bits chars (ASCII) in wider types to enable vector > operations (As you can guess, I also intend to venture out of C++ > standard into stronger alignment specifications to use vector > builtin functions , but a first implementation could rely on the > compiler generated vectorization of standard C++) ASCII is even just 7 bits. Anyhow, if you guarantee alignment suitable for e.g. 32 bits and also that the allocated size is always terminated by a zero 32 bit word, simple comparisons for equality can operate on such chunks of four 8 bit chars. Hashing can probably benefit from this approach, too. > Would you have some advices/examples wrt to implementation > stratégies ? > (0 termination or storing length, Both probably, you will not want to modify the string on calls to c_str(), provided you even want to supply a way to get at a zero-terminated version. Otherwise, you could simply sacrifice this legacy in favour of speed. However, I wouldn't store the length but rather a past-the-end pointer. > union of wide_type local_data[local_size] and wide_type* far_data > or always storing the first chars in local data, etc.) I'd always store beginning and end pointers and always use a contiguous storage area, i.e. not parts local and parts detached. > I'd also request advices on how best to handle the over stringent (wrt > standard C++) requirements of vector instructions if this could be > deemed not too off-topic for the list. I'd look for example code in computationally heavy math libraries. Also, how to best lay out the data in memory probably depends on which vector instructions you're trying to support. Anyhow, that is the last step probably, the first one is creating a set of benchmarks. Good luck! Uli -- Sator Laser GmbH Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932 [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
|
Pages: 1 Prev: [Help]An explicit instantiation of function template Next: type of a lambda expression |