From: Leigh Johnston on 31 May 2010 17:07

"Daniel T." <daniel_t(a)earthlink.net> wrote in message
news:daniel_t-F52648.16462631052010(a)70-3-168-216.pools.spcsdns.net...
> In article <j9OdnTjci7caiJnRnZ2dnUVZ8k-dnZ2d(a)giganews.com>,
> "Leigh Johnston" <leigh(a)i42.co.uk> wrote:
>
>> "Daniel T." <daniel_t(a)earthlink.net> wrote in message
>> news:daniel_t-C210FF.15502731052010(a)70-3-168-216.pools.spcsdns.net...
>> > "Leigh Johnston" <leigh(a)i42.co.uk> wrote:
>> >> "Daniel T." <daniel_t(a)earthlink.net> wrote:
>> >>
>> >> > CodePoint and UTF32[N] are two representations that both refer to
>> >> > the same piece of knowledge. Why the unnecessary duplication?
>> >>
>> >> It is not unnecessary *if* there is a noticeable performance
>> >> improvement. I agree however that premature optimization should be
>> >> avoided (obviously) which is why profiling should be performed
>> >
>> > I'm glad we agree that the code in question is probably an unnecessary
>> > optimization.
>>
>> I didn't say that, it is unclear if the optimization is necessary and
>> whether or not it is can be determined through profiling and/or examining
>> the compiler's assembler output.
>
> Fine, but you do agree that it is an optimization, the only doubt you
> hold here is whether or not it is necessary. Since no tests have been
> presented showing that code without the extra variable needs
> optimizing, this is by definition, a premature optimization.

No, using a temporary has no downsides (it is at worst harmless) and yet
could be beneficial from a performance standpoint, which makes it (in my
book) a win-win that doesn't deserve your criticism. I write such code quite
a lot (use a temporary to avoid multiple dereferences); call me biased if you
want.

/Leigh
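
A minimal sketch of the pattern Leigh describes: naming a temporary so the element is read once and then reused. Only the names `CodePoint`, `UTF32`, and `N` come from the posts; the function `utf8Length` and its byte-counting task are illustrative assumptions, not the code under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the thread): counts how many bytes a UTF-32
// sequence would occupy in UTF-8. Values are not validated; anything at or
// above 0x10000 is simply counted as 4 bytes.
std::size_t utf8Length(const std::vector<std::uint32_t>& UTF32)
{
    std::size_t bytes = 0;
    for (std::size_t N = 0; N < UTF32.size(); ++N) {
        // The element is indexed once and given a name; the comparisons
        // below reuse the named value instead of repeating UTF32[N].
        const std::uint32_t CodePoint = UTF32[N];
        if (CodePoint < 0x80)         bytes += 1;
        else if (CodePoint < 0x800)   bytes += 2;
        else if (CodePoint < 0x10000) bytes += 3;
        else                          bytes += 4;
    }
    return bytes;
}
```

Whether the temporary changes the generated code at all depends on the compiler and optimization level, so profiling or reading the assembler output remains the only way to settle the performance question.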
From: Öö Tiib on 31 May 2010 21:32

On 31 May, 23:46, "Daniel T." <danie...(a)earthlink.net> wrote:
> In article <j9OdnTjci7caiJnRnZ2dnUVZ8k-dn...(a)giganews.com>,
> "Leigh Johnston" <le...(a)i42.co.uk> wrote:
> > "Daniel T." <danie...(a)earthlink.net> wrote in message
> >news:daniel_t-C210FF.15502731052010(a)70-3-168-216.pools.spcsdns.net...
> > > "Leigh Johnston" <le...(a)i42.co.uk> wrote:
> > >> "Daniel T." <danie...(a)earthlink.net> wrote:
> > >> > CodePoint and UTF32[N] are two representations that both refer to
> > >> > the same piece of knowledge. Why the unnecessary duplication?
> > >> It is not unnecessary *if* there is a noticeable performance
> > >> improvement. I agree however that premature optimization should be
> > >> avoided (obviously) which is why profiling should be performed
> > > I'm glad we agree that the code in question is probably an unnecessary
> > > optimization.
> > I didn't say that, it is unclear if the optimization is necessary and
> > whether or not it is can be determined through profiling and/or examining
> > the compiler's assembler output.
>
> Fine, but you do agree that it is an optimization, the only doubt you
> hold here is whether or not it is necessary. Since no tests have been
> presented showing that code without the extra variable needs
> optimizing, this is by definition, a premature optimization.

For me the "CodePoint" temporary is easier to read than "UTF32[N]", so in
the context of the original code it is not just an optimization. Maybe that
is because "UTF32[N]" hurts my eyes (it feels like a "macro array"). Using an
iterator might make the temporary unnecessary, but no example that uses an
iterator has been posted. Isn't it fruitless to argue about the effect of
such a temporary variable on the readability or performance of code that no
one has written?
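
For reference, an iterator-based loop of the kind mentioned above might look like the following. This is a hypothetical sketch: the function name `utf8LengthIter` and the byte-counting task are assumptions chosen only to give the loop something to do, since the original code was not posted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the thread): counts the UTF-8 bytes needed
// for a UTF-32 sequence, walking the vector with an explicit const_iterator.
// Values are not validated; anything at or above 0x10000 counts as 4 bytes.
std::size_t utf8LengthIter(const std::vector<std::uint32_t>& UTF32)
{
    std::size_t bytes = 0;
    for (std::vector<std::uint32_t>::const_iterator it = UTF32.begin();
         it != UTF32.end(); ++it) {
        // *it stands in for both the index expression and the temporary.
        if (*it < 0x80)         bytes += 1;
        else if (*it < 0x800)   bytes += 2;
        else if (*it < 0x10000) bytes += 3;
        else                    bytes += 4;
    }
    return bytes;
}
```

Here `*it` replaces both `UTF32[N]` and `CodePoint`, so the readability question becomes whether `*it` reads better than a named value; that remains a matter of taste.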
From: Joseph M. Newcomer on 31 May 2010 22:16

See below...

On Mon, 31 May 2010 20:16:53 +0200, "Giovanni Dicanio"
<giovanniDOTdicanio(a)REMOVEMEgmail.com> wrote:

>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote:
>
>>> UTF8.reserve(UTF32.size() * 4); // worst case
>> ****
>> Note that this will call malloc(), which will involve setting a lock, then
>> searching for a block to allocate, then releasing the lock. Since you have
>> been a fanatic about performance, why is it you put a very expensive
>> operation like 'reserve' in your code?
>>
>> While it is perfectly reasonable, it seems inconsistent with your
>> previously-stated goals.
>
>Joe: I'm not sure if you are ironic or something :) ... but I believe that
>std::vector::reserve() with a proper capacity value, followed by several
>push_back()s, is very efficient.
>Sure, not as efficient as a static stack-allocated array, but very
>efficient.
****
But this code was written by someone who has been beating us nearly
insensible about how critical every single instruction is. The code shown
takes more instructions than other alternatives, and he's been telling us
that alternative implementations that take an extra instruction or two are
unacceptable. So this code is inconsistent with his previous concerns about
performance. If there is irony here, it is the fact that he violates his own
strongly-stated goals about performance; I could not help but point out the
inconsistency.
*****
>
>
>> No, the CORRECT way to write such code is to either throw an exception (if
>> you are in C++, which you clearly are) or return a value indicating the
>> error (for example, in C, an
>
>In this case, I'm for exception.
>Thanks to exception, you could use the precious function return value to
>actually return the resulting buffer (UTF8 string), instead of passing it as
>a reference to the function:
****
I'd probably choose to throw an exception, where the exception information
included the offset into the input vector, a pointer to the input vector so
the handler could decide what to do, etc.
****
>
> // Updated prototype:
> // - use 'const' correctness for utf32
> // - return resulting utf8
> // - may throw on error
> std::vector<uint8_t> toUTF8(const std::vector<uint32_t> & utf32);
>
>Note that thanks to the move semantics (i.e. the new "&&" thing of C++0x,
>available in VC10 a.k.a. VS2010), you don't pay for extra useless copies in
>returning potentially big objects.
****
Yep. But this did not state it was a 2010-compliant version.

Ultimately, there is a philosophical inconsistency between his
strongly-stated concerns about performance over the last several months, and
the actual implementation presented here. Since he loves picking nits with
us, I felt it was only fair to return the favor.

This does not change the fact that the printf is without a doubt a really
awful interface.
joe
****
>
>Giovanni
>
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
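
A hedged sketch of how the two suggestions above could fit together: Giovanni's prototype returning the buffer by value, and an exception that carries the offset and a pointer to the input, as Joe describes. The exception class `InvalidCodePoint`, its accessors, the helper `toByte`, and the validity rule (reject surrogates and values above 0x10FFFF) are assumptions made for illustration; only the `toUTF8` prototype and the `reserve` call appear in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical exception type: records which input failed and where.
// The stored pointer is only meaningful while the input vector is alive.
class InvalidCodePoint : public std::runtime_error {
public:
    InvalidCodePoint(const std::vector<std::uint32_t>* input, std::size_t offset)
        : std::runtime_error("invalid UTF-32 code point"),
          input_(input), offset_(offset) {}
    const std::vector<std::uint32_t>* input() const { return input_; }
    std::size_t offset() const { return offset_; }
private:
    const std::vector<std::uint32_t>* input_;
    std::size_t offset_;
};

// Hypothetical helper: narrows an already-masked value to one byte.
inline std::uint8_t toByte(std::uint32_t v)
{
    return static_cast<std::uint8_t>(v);
}

// Prototype as quoted in the thread: const input, result returned by value,
// may throw on error.
std::vector<std::uint8_t> toUTF8(const std::vector<std::uint32_t>& utf32)
{
    std::vector<std::uint8_t> utf8;
    utf8.reserve(utf32.size() * 4); // worst case: 4 bytes per code point
    for (std::size_t n = 0; n < utf32.size(); ++n) {
        const std::uint32_t cp = utf32[n];
        // Assumed validity rule: no surrogates, nothing above U+10FFFF.
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw InvalidCodePoint(&utf32, n);
        if (cp < 0x80) {
            utf8.push_back(toByte(cp));
        } else if (cp < 0x800) {
            utf8.push_back(toByte(0xC0 | (cp >> 6)));
            utf8.push_back(toByte(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            utf8.push_back(toByte(0xE0 | (cp >> 12)));
            utf8.push_back(toByte(0x80 | ((cp >> 6) & 0x3F)));
            utf8.push_back(toByte(0x80 | (cp & 0x3F)));
        } else {
            utf8.push_back(toByte(0xF0 | (cp >> 18)));
            utf8.push_back(toByte(0x80 | ((cp >> 12) & 0x3F)));
            utf8.push_back(toByte(0x80 | ((cp >> 6) & 0x3F)));
            utf8.push_back(toByte(0x80 | (cp & 0x3F)));
        }
    }
    return utf8; // moved or elided on return where the compiler supports it
}
```

A caller would wrap the call in `try`/`catch (const InvalidCodePoint& e)` and use `e.offset()` to decide whether to report, skip, or substitute the offending element, which is the decision a `printf` inside the converter takes away from the caller.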
From: Oliver Regenfelder on 1 Jun 2010 06:52

Hello,

Leigh Johnston wrote:
> Also printf sucks, this is a
> C++ newsgroup not a C newsgroup.

This is not even a general C++ newsgroup but an MFC one. So strictly
there is zero relevance of his posting to this newsgroup.

Best regards,

Oliver
From: Oliver Regenfelder on 1 Jun 2010 07:12

Hello Peter,

Peter Olcott wrote:
> const correctness requires the "extra" CodePoint variable.

That is wrong. And if you were deep into 'const correctness', you would
have declared the method itself const too.

Best regards,

Oliver
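
To illustrate Oliver's point with a hedged sketch: inside a `const` member function, `UTF32[N]` already resolves to the `const` overload of `operator[]`, so no extra variable is needed for const correctness. The class `Utf32Text` and the method `isAscii` are hypothetical; Peter Olcott's actual class is not shown in this part of the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical class, used only to show a const member function.
class Utf32Text {
public:
    explicit Utf32Text(const std::vector<std::uint32_t>& data) : UTF32(data) {}

    // Declared const: the method promises not to modify the object, and the
    // compiler enforces it. Within this body UTF32 is treated as const, so
    // UTF32[N] yields a const reference without any named temporary.
    bool isAscii() const
    {
        for (std::size_t N = 0; N < UTF32.size(); ++N) {
            if (UTF32[N] >= 0x80)
                return false;
        }
        return true;
    }

private:
    std::vector<std::uint32_t> UTF32;
};
```

A temporary such as `const uint32_t CodePoint = UTF32[N];` may still be preferred for readability, but that is a separate argument from const correctness.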