From: Jonathan de Boyne Pollard on 15 Mar 2010 22:21

> Much to our surprise, we discovered that most embedded malloc()/free()
> operations are horrifically slow [and more buggy].

That doesn't surprise me. It's a special case of the more general principle being reiterated by several people elsewhere in this thread: Rolling one's own allocator is tricky to get right. The C/POSIX libraries for embedded systems sometimes haven't had as much work done on them as the C/POSIX libraries for "mainstream" (for want of a better word) systems. They are, in effect, roll-your-own efforts done by a library developer for the target platform.

This is not to say that the "mainstream" implementations are invariably better and bug-free. However they are, by their very natures, more extensively tested as general-purpose allocators. Although implementors do write unit tests, the application software using the library is in general still the most effective test corpus. I've encountered bugs in specialist implementations that have existed for years simply because none of the application software ever exercised the library in the particular way necessary to exhibit the bug.
From: Ersek, Laszlo on 15 Mar 2010 23:52

In article <P_WdnXGWZdnFRgPWnZ2dnUVZ_vWdnZ2d(a)posted.sasktel>, Chris Friesen <cbf123(a)mail.usask.ca> writes:

> Would this work portably? It's more overhead so it would likely only
> make sense if MAP_ANONYMOUS isn't supported.
>
> 1) open a scratch file
> 2) unlink it
> 3) ftruncate it to the desired size
> 4) mmap it
> 5) close it

I guess this could work, provided we generate system-wide non-clashing filenames for scratch files, create the file with O_RDONLY | O_CREAT | O_EXCL and access permission bits 0, block SIGINT and SIGTERM until after unlinking the file, and mmap() the file with read-write protection and MAP_PRIVATE (non)sharing.

Superficially reading up on thread cancellation, I think the cancellability of the thread executing the steps listed above should be set to PTHREAD_CANCEL_DISABLE, at least until after step 2. A cancellation point may occur in unlink().

... I think thread cancellation could really open a can of worms, so let's ignore it.

Perhaps let's replace steps 1 and 2 with shm_open() and shm_unlink(), if the Realtime XSI Option Group is supported as well (or at least the SHM POSIX Option).

> [snip]

Thank you,
lacos
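The five steps quoted above, together with the signal-blocking precaution, can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not code from any poster: the helper name scratch_map() and the /tmp template are hypothetical, mkstemp() is swapped in as a stand-in for "generate a non-clashing name and create it with O_CREAT | O_EXCL", and the descriptor is opened read-write (which mkstemp() does) because ftruncate() requires a descriptor open for writing. Thread cancellation is ignored, as suggested in the post, and sigprocmask() assumes a single-threaded caller.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for mkstemp(), ftruncate(), sigprocmask() */
#endif
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of private scratch memory backed by
 * an already-unlinked temporary file. Returns MAP_FAILED on any error. */
void *scratch_map(size_t len)
{
    char path[] = "/tmp/scratch-XXXXXX";  /* assumption: /tmp is writable */
    sigset_t block, saved;
    void *p = MAP_FAILED;
    int fd;

    assert(len > 0);                      /* mmap() rejects zero-length maps */

    /* Block SIGINT and SIGTERM so the file cannot be left behind between
     * creation and unlink. */
    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    sigaddset(&block, SIGTERM);
    sigprocmask(SIG_BLOCK, &block, &saved);

    fd = mkstemp(path);                   /* step 1: create a unique file */
    if (fd != -1)
        unlink(path);                     /* step 2: remove its name */

    sigprocmask(SIG_SETMASK, &saved, NULL);
    if (fd == -1)
        return MAP_FAILED;

    if (ftruncate(fd, (off_t)len) == 0)   /* step 3: set the size */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE, fd, 0);     /* step 4: map it */

    close(fd);                            /* step 5: mapping stays valid */
    return p;
}
```

A caller would compare the result against MAP_FAILED and later release the region with munmap(p, len).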
From: Rainer Weikusat on 16 Mar 2010 02:40

Jonathan de Boyne Pollard <J.deBoynePollard-newsgroups(a)NTLWorld.COM> writes:

[...]

>> Much to our surprise, we discovered that most embedded
>> malloc()/free() operations are horrifically slow [and more buggy].
>>
> That doesn't surprise me. It's a special case of the more general
> principle being reiterated by several people elsewhere in this thread:
> Rolling one's own allocator is tricky to get right.

As soon as this gets more general than 'implementing malloc such that it survives a lot of common benchmarks comfortably' it is wrong.
From: Rainer Weikusat on 16 Mar 2010 03:01

sfuerst <svfuerst(a)gmail.com> writes:

> On Mar 15, 7:36 am, Rainer Weikusat <rweiku...(a)mssgmbh.com> wrote:
>> sfuerst <svfue...(a)gmail.com> writes:
>>
>> [...]
>>
>> > If you have a wide size-range, then you should know that writing a
>> > general purpose allocator isn't a 200 line job. To get something
>> > better than the above allocators will probably require a few
>> > thousand lines or more.
>>
>> This should have been "implementing the malloc interface such that the
>> implementation performs reasonably well for the usual benchmark
>> cases isn't a 200 line job".

[...]

>> But it is really much better to not try to do
>> this to begin with. And for this case, a 'general purpose allocator'
>> which basically avoids external fragmentation and whose allocation and
>> deallocation operations are guaranteed to be fast and complete in
>> constant time can be implemented in less than 200 lines of code[*],

[...]

> You might want to make your 200 line userspace allocator - there is
> nothing preventing you from doing it. A simple power-of-two allocator
> is a nice place to start.

There is an obvious mismatch between my statement, the core of which is reproduced above, and this introduction (and your way of reusing my statement about the 'kmalloc' Linux allocator).
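For readers unfamiliar with the "simple power-of-two allocator" mentioned in the quoted text, a minimal sketch might look like the following. It is not the allocator either poster had in mind and not the Linux kmalloc code; the names p2_alloc() and p2_free(), the size limits, and the malloc()-backed arena are all assumptions made for illustration. Blocks are never split or merged, so there is no external fragmentation among the size classes, and both operations do a bounded amount of work; the price is internal fragmentation of up to roughly half of each block. The sketch is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MIN_SHIFT  5u                       /* smallest block: 32 bytes  */
#define MAX_SHIFT  12u                      /* largest block: 4096 bytes */
#define NCLASSES   (MAX_SHIFT - MIN_SHIFT + 1u)
#define ARENA_SIZE ((size_t)1 << 20)        /* assumed fixed 1 MiB arena */

/* Header stored in front of every block. While the block is allocated it
 * records the size class; while free it links the block into a list. */
typedef union header {
    size_t cls;
    union header *next;
    _Alignas(max_align_t) unsigned char pad; /* keeps payloads aligned */
} header;

static header *free_list[NCLASSES];         /* one free list per class */
static unsigned char *arena;                /* backing store */
static size_t arena_used;                   /* bump pointer into arena */

void *p2_alloc(size_t n)
{
    size_t max_block = (size_t)1 << MAX_SHIFT;
    if (n == 0 || n > max_block - sizeof(header))
        return NULL;

    /* Backing memory comes from malloc() here for portability; a real
     * allocator would obtain it from mmap() or sbrk(). */
    if (!arena && !(arena = malloc(ARENA_SIZE)))
        return NULL;

    /* Find the smallest class that fits; at most NCLASSES - 1 steps. */
    size_t need = n + sizeof(header);
    unsigned c = 0;
    while (((size_t)1 << (MIN_SHIFT + c)) < need)
        c++;

    header *h = free_list[c];
    if (h) {
        free_list[c] = h->next;             /* reuse a freed block */
    } else {
        size_t bsize = (size_t)1 << (MIN_SHIFT + c);
        if (ARENA_SIZE - arena_used < bsize)
            return NULL;                    /* arena exhausted */
        h = (header *)(arena + arena_used); /* carve a fresh block */
        arena_used += bsize;
    }
    h->cls = c;
    return h + 1;                           /* payload follows the header */
}

void p2_free(void *p)
{
    if (!p)
        return;
    header *h = (header *)p - 1;
    size_t c = h->cls;
    assert(c < NCLASSES);                   /* catches a corrupted header */
    h->next = free_list[c];                 /* push onto the class list */
    free_list[c] = h;
}
```

Freed blocks are reused in last-in, first-out order within their class, so a p2_free() followed by a p2_alloc() of a similar size returns the same block.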
From: Jonathan de Boyne Pollard on 20 Mar 2010 03:59
> > However, I wouldn't want all the dozens of apps on my desktop to all
> > refuse to return memory back to the system just because they might
> > want it again at some point in the future (thus unnecessarily forcing
> > swapping to disk or a memory upgrade). Unless there is a very specific
> > reason to think that performance is critical, my personal view is that
> > it's polite for a userspace app to return memory back to the
> > underlying system whenever it is reasonably possible.
>
> malloc may or may not return memory to the system. Usually, it won't,
> except in fringe cases (eg 'large allocations' done via mmap). Memory
> allocations which originally happened by calling brk/sbrk cannot easily
> be returned to the system, anyway, only if freeing them happens to
> release a chunk of memory just below the current break.

On the contrary, usually it will. I'm revising my estimate of the quality of your "50 - 300 lines of code" implementation downwards as a result of this statement, because you are erroneously conflating allocating address space with committing pages. Most implementations that I am familiar with were written by people that didn't make this mistake.

My implementation (more correctly, one of my implementations (-:) calls the OS/2 DosSetMem() function to commit partially used pages and de-commit wholly unused pages within the heap arena as necessary. Several Win32 implementations that I'm aware of call VirtualAlloc() to commit and de-commit pages within arenas. (For a good explanation of this process, see Matt Pietrek's dissection of the DOS-Windows 9x HeapAlloc() function in his Windows 95 System Programming Secrets book.) The GNU C library version 2.11.1 calls madvise() with the MADV_DONTNEED flag for wholly unused pages.

The OS/2 and Win32 implementations are returning unused heap memory to the operating system as a matter of course. The GNU C library is intending to do the same, and is doing the best that it can with the more limited system API that it has to work with, and the operating system bugs that it has to cope with. (See, for example, Linux kernel bug #6282 <http://bugzilla.kernel.org./show_bug.cgi?id=6282>, reported by Samuel Thibault in 2007.)
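The distinction drawn above, keeping the address space while giving back the pages behind it, can be illustrated with a small C sketch built on madvise() and MADV_DONTNEED. This is not the GNU C library's code: the helper names arena_reserve() and release_whole_pages() are hypothetical, the arena is assumed to be a private anonymous mapping, and the behaviour described in the comments is that of Linux.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes madvise() and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: obtain `len` bytes of address space as a private
 * anonymous mapping. Returns MAP_FAILED on error. */
void *arena_reserve(size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Hypothetical helper: tell the kernel that the pages lying wholly inside
 * [start, start + len) are no longer needed. The range stays mapped, so the
 * address space is kept; on Linux the next access to a released page simply
 * gets a fresh zero-filled page. Partially covered pages are left alone. */
int release_whole_pages(void *start, size_t len)
{
    long sz = sysconf(_SC_PAGESIZE);
    assert(sz > 0 && (sz & (sz - 1)) == 0);   /* page size: power of two */

    uintptr_t page = (uintptr_t)sz;
    uintptr_t lo = ((uintptr_t)start + page - 1) & ~(page - 1); /* round up */
    uintptr_t hi = ((uintptr_t)start + len) & ~(page - 1);      /* round down */

    if (hi <= lo)
        return 0;                             /* no whole page in range */
    return madvise((void *)lo, hi - lo, MADV_DONTNEED);
}
```

An allocator would call something like release_whole_pages() from its free path when a freed chunk spans at least one complete page, which corresponds to the "wholly unused pages" case described in the post.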