From: Andre Kaufmann on 15 Jun 2010 22:55

Peter Olcott wrote:
> On 6/15/2010 3:59 AM, Ulrich Eckhardt wrote:
> [...]
> Although it may be possible for the compiler to inline a function
> without inline being requested I am not sure that it typically does this.

It does.

Andre

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Ulrich Eckhardt on 15 Jun 2010 22:55

Peter Olcott wrote:
> On 6/15/2010 4:17 AM, Asher Langton wrote:
>> while (...)
>> {
>>     ...
>>     if (condition)
>>         doSomething();
>> }
>>
>> If doSomething() is inlined, then the body of the loop might no longer
>> fit in the instruction cache, which -- in the case where 'condition'
>
> If the body of the loop without the function call overhead does not fit
> in cache, then the body of the loop with the function call overhead will
> also not fit in cache because it requires even more memory. Adding
> function call overhead can not reduce memory requirements.

If it is inline, it will probably be loaded into the cache even if it is
not called, because the cache always loads contiguous chunks of memory. The
only exception is if the compiler aligns the code so that skipping the call
skips one or more whole cache lines. This is not impossible, but the
compiler would probably need to know that "condition" is not very likely.
Otherwise it would have to pad every conditionally executed block to a
multiple of a cache line, which increases memory consumption even further.

If it is not inline, you will only have the (hopefully small) function call
code in the cache when it is not called. When it is called, of course, the
memory overhead will be even larger. Paying a big price rarely can be
better than paying a small price regularly.

In any case, my summary of this remains "I don't know", and I would leave
picking the best way to the compiler.

Uli

--
Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
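
The trade-off described above (a small call site in the hot loop versus the whole function body sitting in it) can be sketched as follows. This is only an illustration under stated assumptions: `handleRareCase`, `sumSamples`, `RareStats` and the `SKETCH_*` macros are hypothetical names, and the `noinline`/`cold` attributes and `__builtin_expect` are GCC/Clang extensions rather than standard C++. Whether forcing the rare path out of line actually helps is something only a measurement on the target machine can tell.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compiler-specific hints (assumption: GCC or Clang). On other compilers the
// macros expand to nothing and the decision is left to the optimizer.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_COLD_NOINLINE __attribute__((noinline, cold))
#  define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define SKETCH_COLD_NOINLINE
#  define SKETCH_UNLIKELY(x) (x)
#endif

// Hypothetical bookkeeping for the rarely taken branch.
struct RareStats {
    std::size_t count = 0;  // how often the rare path ran
    int worst = 0;          // most negative sample seen
};

// Rare path, kept out of line: records the sample and clamps it to zero.
SKETCH_COLD_NOINLINE
int handleRareCase(int value, RareStats& stats) {
    ++stats.count;
    if (value < stats.worst) {
        stats.worst = value;
    }
    return 0;
}

// Hot loop: when the condition is false, only the compare and the call
// instruction of the rare path occupy space inside the loop body.
long sumSamples(const std::vector<int>& samples, RareStats& stats) {
    long total = 0;
    for (int s : samples) {
        if (SKETCH_UNLIKELY(s < 0)) {
            s = handleRareCase(s, stats);
        }
        total += s;
    }
    return total;
}

// Usage sketch: call from main() or a unit test.
void demoSumSamples() {
    RareStats stats;
    const std::vector<int> samples = {1, 2, -5, 3};
    assert(sumSamples(samples, stats) == 6);  // -5 is clamped to 0
    assert(stats.count == 1);
    assert(stats.worst == -5);
}
```

Removing the two macros leaves a program with the same behaviour; the attributes only influence code placement, which is why leaving the choice to the compiler remains a reasonable default.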

From: jwatts on 16 Jun 2010 02:44

On Jun 15, 3:01 pm, Peter Olcott <NoS...(a)OCR4Screen.com> wrote:
> On 6/15/2010 4:17 AM, Asher Langton wrote:
>
> > On Jun 13, 1:43 pm, Peter Olcott<NoS...(a)OCR4Screen.com> wrote:
> >> (1) The ONLY reason not to always inline everything (that can be
> >> in-lined) all the time is code bloat.
> >> (2) If a function is only called once then there can not possibly be any
> >> code bloat at all.
> >> (3) Therefore all functions that are only called once should always be
> >> in-lined.
>
> > Consider this snippet:
>
> > while (...)
> > {
> >     ...
> >     if (condition)
> >         doSomething();
> > }
>
> > If doSomething() is inlined, then the body of the loop might no longer
> > fit in the instruction cache, which -- in the case where 'condition'
>
> If the body of the loop without the function call overhead does not fit
> in cache, then the body of the loop with the function call overhead will
> also not fit in cache because it requires even more memory. Adding
> function call overhead can not reduce memory requirements.
>

Once you have entered the loop, the cost in time of the function call and
any memory requirements will have been paid. Since your entire program
will not fit in the cache, you will encounter cache flushes in the
vicinity of the inlined code anyway. The question devolves to "is the cost
of the function call significant?" I would tend to say no since normally
the cost of the loop will dwarf that of the function call.

As a general rule, I recommend that my developers avoid the temptation of
'premature optimization'. First, just make it work. Then, if performance
is not acceptable, use a profiler to find the problem areas. First try
simple changes in algorithms, and if that's not sufficient, perhaps
structural or architectural changes are necessary.

From: jwatts on 16 Jun 2010 02:46

On Jun 8, 4:54 am, Peter Olcott <NoS...(a)OCR4Screen.com> wrote:
> I am aiming to produce the best balance of speed and space efficiency
> against reliability and maintainability placing a higher weight on the
> latter criteria.
> http://www.ocr4screen.com/UTF8.h
>
> If there are any improvements that can be made within the criteria
> provided, or other improvements that do not detract from the above
> criteria input would be welcomed.
>
> This support file is required for compiling. Please do not critique the
> support file it is not in final form.
> http://www.ocr4screen.com/Array2D.h
>

1. I would recommend pre-generating the 'states' array so that it
   becomes a runtime constant;

2. In the constructor UnicodeEncodingConversion(), can you replace the
   loops used to initialize states[][] with calls to memset()? That will
   almost certainly be both faster and smaller;

3. In method toUTF32(), is it possible to produce a reasonable estimate
   of the amount of space required for the result vector UTF32? Each time
   a call to push_back() results in reallocation of a vector, you will
   pay a runtime cost, especially once the vectors reach significant size;

4. Be careful when assigning values of small types (e.g. uint8_t) to
   variables of a larger type (e.g. uint32_t), for example
   "CodePoint = Byte;". Should you ever move the code to a different
   compiler, the sign-extension behavior may change, causing unexpected,
   and likely undesired, behaviors;

5. Instead of using a for() loop to traverse the input vector, prefer
   using iterators;

6. I would have preferred to process each extended character at once
   rather than process each byte individually. That is, having determined
   that a byte is the first of three, go ahead and retrieve the next two
   bytes and process them immediately.
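
Points 3 to 6 above could look roughly like the following. This is a minimal sketch and not the UTF8.h under review: `toUTF32Sketch` and `demoToUTF32Sketch` are hypothetical names, malformed input is simply replaced by U+FFFD, and overlong forms and surrogates are not rejected for brevity. On point 4, note that widening a `uint8_t` to `uint32_t` always preserves the value; the portability concern arises when bytes are held in plain `char`, whose signedness is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes UTF-8 one whole sequence at a time.
std::vector<std::uint32_t> toUTF32Sketch(const std::vector<std::uint8_t>& utf8) {
    std::vector<std::uint32_t> out;
    // Point 3: every code point needs at least one input byte, so the input
    // size is an upper bound and no reallocation happens during decoding.
    out.reserve(utf8.size());

    // Point 5: traverse with iterators.
    std::vector<std::uint8_t>::const_iterator it = utf8.begin();
    const std::vector<std::uint8_t>::const_iterator end = utf8.end();
    while (it != end) {
        // Point 4: uint8_t is unsigned, so this widening zero-extends.
        const std::uint32_t lead = *it++;
        std::uint32_t codePoint = 0;
        int trailing = 0;
        if (lead < 0x80u)                  { codePoint = lead;         trailing = 0; }
        else if ((lead & 0xE0u) == 0xC0u)  { codePoint = lead & 0x1Fu; trailing = 1; }
        else if ((lead & 0xF0u) == 0xE0u)  { codePoint = lead & 0x0Fu; trailing = 2; }
        else if ((lead & 0xF8u) == 0xF0u)  { codePoint = lead & 0x07u; trailing = 3; }
        else {
            out.push_back(0xFFFDu);  // stray continuation byte or invalid lead
            continue;
        }

        // Point 6: fetch the rest of the sequence immediately.
        bool ok = true;
        for (int i = 0; i < trailing; ++i) {
            if (it == end || (*it & 0xC0u) != 0x80u) {
                ok = false;  // truncated sequence; the offending byte is re-read as a lead
                break;
            }
            codePoint = (codePoint << 6) | (*it++ & 0x3Fu);
        }
        out.push_back(ok ? codePoint : 0xFFFDu);
    }
    return out;
}

// Usage sketch: "A", U+00E9 and U+20AC encoded as UTF-8.
void demoToUTF32Sketch() {
    const std::vector<std::uint8_t> input = {0x41, 0xC3, 0xA9, 0xE2, 0x82, 0xAC};
    const std::vector<std::uint32_t> result = toUTF32Sketch(input);
    assert(result.size() == 3);
    assert(result[0] == 0x41u && result[1] == 0xE9u && result[2] == 0x20ACu);
}
```

Reserving `utf8.size()` elements over-allocates for non-ASCII text (by up to a factor of four in elements), which trades memory for the absence of reallocations; whether that is the right trade depends on the input.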

From: Joshua Maurice on 16 Jun 2010 07:02

On Jun 16, 10:46 am, jwatts <jwatt...(a)gmail.com> wrote:
> 3. In method toUTF32(), is it possible to produce a reasonable
> estimate of the amount of space required for the result vector UTF32?
> Each time a call to push_back() results in reallocation of a vector,
> you will pay a runtime cost, especially once the vectors reach
> significant size;

No. As I'm sure many other people will reply, ::std::vector::push_back is
specifically guaranteed to have amortized O(1) runtime. It accomplishes
this by not increasing the capacity by just 1 when it runs out of
capacity, but instead by increasing the capacity by a multiple when it
runs out of capacity. This is generally simplified in discussions to
"doubling the capacity each time push_back runs out of capacity". However,
I have seen reports from good people that doubling isn't ideal from real
world measurement. 2 isn't special; any constant multiplier greater than 1
will give amortized O(1) runtime.
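
The growth behaviour described above can be observed with a small experiment. This is a sketch under assumptions: `countReallocations` and `demoCountReallocations` are hypothetical names, and the growth factor is chosen by the standard library implementation, so the exact count differs between implementations. With a growth factor k > 1, the elements moved across all reallocations form a geometric series bounded by roughly n * k / (k - 1) for n appended elements, which is where the amortized O(1) cost per push_back comes from.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often capacity() changes while n elements are appended.
// A capacity change after push_back means the vector reallocated.
std::size_t countReallocations(std::size_t n, bool reserveFirst) {
    std::vector<int> v;
    if (reserveFirst) {
        v.reserve(n);  // one allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}

// Usage sketch: compare both strategies for 100000 elements.
void demoCountReallocations() {
    const std::size_t n = 100000;
    // After reserve(n), appending n elements must not reallocate.
    assert(countReallocations(n, true) == 0);
    // Without reserve the count grows logarithmically, not linearly, with n.
    // The bound of 100 assumes a geometric growth factor of about 1.5 or more,
    // which common implementations use but the standard does not spell out.
    const std::size_t grown = countReallocations(n, false);
    assert(grown > 0 && grown < 100);
}
```

Both observations are compatible: push_back stays amortized O(1) without any estimate, while a call to reserve() with a good estimate removes the remaining reallocations and their copies.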