Finding memory leak in Solaris using libumem [Solaris]

Prev: Anyone install wireshark from Sunfreeware?
Next: Host power on is disabled due to:SCC is not valid or is not present

From: viswanath vellaiappan on 18 Dec 2008 01:20

Hello All,
I was given a project at my college, and the description of
the project is as follows.

Goal:
The idea of this whole thing is to find memory leaks, especially
in the application's user land. configd makes up a big chunk of
user land, and it is a daemon, so the impact of a memory leak is quite
large (unlike commands, where any unfreed memory is released when the
program exits). So, to begin with, cover configd, and then pay
attention to the commands.

Approach:
1. Get familiar with libumem.so on Solaris.
2. Machine: a Solaris machine, preferably Solaris 10, with a
minimum of 15 disks.
3. Write a wrapper for configd in /sbin/. This wrapper configd
essentially sets LD_LIBRARY_PATH and runs /usr/sbin/configd from the
wrapper. This way we don't have to change any tcs.
4. Whenever a memory leak is detected, configd dumps core, so make
sure to use coreadm to save the core file under a unique file name.
5. Before running the nightly tcs, you need a script that runs
every 10 seconds or so to see whether configd is down; if it is, check
whether there is a core dump and then restart configd.
6. Run the nightly tcs and collect all core dumps.
7. Analyse the core dumps and come up with fixes for the leaks.

FYI, configd is a binary in the application which will keep track of
the configuration of the whole application.

My Doubts:
Point 3.

* Why should we write the wrapper in /sbin? Why not in
other place?
* So does it mean we should create a /sbin/configd
file with content

export
LD_LIBRARY_PATH= "/usr/lib/libumem.so.1"
/usr/
sbin/configd $1 $2 $3 ...

* How do test cases play a role here?

Point 4.
* How do we find a memory leak? (Is it with the ::findleaks
dcmd in mdb?)
* How is the core dump formed? (Manually by gcore or
automatically)

Point 5.
* How is this done? (In a separate thread of the program?)

Can you please elaborate on the whole project?
Thanks in advance,
Viswanath Vellaiappan.

From: Giorgos Keramidas on 18 Dec 2008 17:02

On Wed, 17 Dec 2008 22:20:14 -0800 (PST), viswanath vellaiappan <viswanath.vellaiappan(a)gmail.com> wrote:
> Hello All,
> I was given a project at my college, and the description of
> the project is as follows.
>
> Goal:
> The idea of this whole thing is to find memory leaks, especially
> in the application's user land. configd makes up a big chunk of
> user land, and it is a daemon, so the impact of a memory leak is quite
> large (unlike commands, where any unfreed memory is released when the
> program exits). So, to begin with, cover configd, and then pay
> attention to the commands.
>
> Approach:
> 1. Get familiar with libumem.so on Solaris.

There are excellent guides for libumem at http://docs.sun.com

> 2. Machine: a Solaris machine, preferably Solaris 10, with a
> minimum of 15 disks.

The number of disks isn't relevant to `libumem.so.1' usage. It looks
like a project-specific constraint.

> 3. Write a wrapper for configd in /sbin/. This wrapper configd
> essentially sets LD_LIBRARY_PATH and runs /usr/sbin/configd from the
> wrapper. This way we don't have to change any tcs.

I don't know what ``tcs'' means in this context, but I think the main
idea behind putting the wrapper at `/sbin/configd' is to make sure that
libumem is preloaded even if you manually run:

# /sbin/configd

or if a system startup script tries to run `/sbin/configd'.

For a one-off troubleshooting session, this is probably ok. If someone
suggests replacing `/sbin' tools with custom shell wrappers on critical
or _production_ systems, though, I would be *very* worried. Whenever
hacks like this are in place, applying future Solaris patches may fail,
or a patch may silently overwrite the local hack & break all assumptions
about what the locally hacked `/sbin/configd' does.

> 4. Whenever a memory leak is detected, configd dumps core, so make
> sure to use `coreadm' to save the core file under a unique file name.

Manually tracking leaks with MDB and libumem.so.1 works fine if you set
a breakpoint at _exit(). Then libumem memory tracking information is
still available when you interactively type commands at an MDB prompt.

I am not sure if this works equally well with programs that *never*
terminate though (e.g. a daemon that runs forever in the background).
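
For the never-terminating case, one possible approach is to take a
snapshot of the live process with gcore(1) and run ::findleaks on the
snapshot instead of waiting for _exit(). What follows is only a sketch
under assumptions: the daemon was started with libumem.so.1 preloaded,
the function names (snapshot_name, find_leaks) and the default prefix
under /var/tmp are made up for illustration, and none of it comes from
the project description.

```shell
#!/bin/sh
# Sketch only: snapshot a running daemon and ask mdb for its leaks.
# Assumes the daemon was started with libumem.so.1 preloaded.

# "gcore -o PREFIX PID" writes its snapshot to PREFIX.PID
snapshot_name() {
    echo "$1.$2"
}

# find_leaks PID [PREFIX]: dump the process, then run ::findleaks on the dump.
# /var/tmp/configd.core is a hypothetical default prefix.
find_leaks() {
    fl_pid=$1
    fl_prefix=${2:-/var/tmp/configd.core}
    gcore -o "$fl_prefix" "$fl_pid" || return 1
    fl_core=`snapshot_name "$fl_prefix" "$fl_pid"`
    # mdb reads dcmds from standard input when it is not a terminal
    echo "::findleaks" | mdb "$fl_core"
}
```

After sourcing the file, something like `find_leaks 1234' (with the
real PID of configd) would print the leak report. Taking two snapshots
some time apart and comparing the reports helps to separate one-time
allocations from memory that keeps growing.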

> 5. Before running the nightly tcs, you need a script that runs
> every 10 seconds or so to see whether configd is down; if it is, check
> whether there is a core dump and then restart configd.
> 6. Run nightly tcs and collect all core dumps.
> 7. Analyse the core dumps and come up with fixes for these.
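
Points 5 and 6 do not need a thread inside the program; a small
separate polling script is enough. Here is a minimal sketch, with
several assumptions stated up front: core files are written as
/var/cores/core.%f.%p.%t (for example after running
`coreadm -g /var/cores/core.%f.%p.%t -e global'), configd puts itself
in the background when started, and all paths, variable names and
function names are placeholders. It sticks to old Bourne shell syntax
so that the Solaris 10 /bin/sh can run it.

```shell
#!/bin/sh
# Watchdog sketch; every path and name here is an assumption.
DAEMON_NAME=${DAEMON_NAME:-configd}
DAEMON_CMD=${DAEMON_CMD:-/sbin/configd}
CORE_DIR=${CORE_DIR:-/var/cores}
SAVE_DIR=${SAVE_DIR:-/var/cores/saved}
INTERVAL=${INTERVAL:-10}

# Succeeds if a process with exactly this name exists.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Move the daemon's core files into SAVE_DIR and print how many were moved.
collect_cores() {
    mkdir -p "$SAVE_DIR" || return 1
    cc_count=0
    for cc_file in "$CORE_DIR"/core."$DAEMON_NAME".*; do
        [ -f "$cc_file" ] || continue
        mv "$cc_file" "$SAVE_DIR"/ && cc_count=`expr $cc_count + 1`
    done
    echo "$cc_count"
}

# Poll forever: when the daemon is gone, save its cores and start it again.
watch_daemon() {
    while :; do
        if is_running "$DAEMON_NAME"; then
            :
        else
            wd_moved=`collect_cores`
            echo "`date`: $DAEMON_NAME was down, saved $wd_moved core file(s)"
            "$DAEMON_CMD" &
        fi
        sleep "$INTERVAL"
    done
}

# Run the loop only when invoked as: sh watchdog.sh run
if [ "${1:-}" = "run" ]; then
    watch_daemon
fi
```

Moving the core files aside before each restart keeps every crash tied
to one saved file, which makes step 7 easier.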

> * Why should we write the wrapper in /sbin? Why not in
> other place?
> * So does it mean we should create a /sbin/configd
> file with content
>
> export
> LD_LIBRARY_PATH= "/usr/lib/libumem.so.1"
> /usr/
> sbin/configd $1 $2 $3 ...

No. The `sbin/configd' path is a relative pathname. It is resolved
relative to the working directory of whatever program invokes the
`/sbin/configd' wrapper script.

What would run if you logged in as root and typed the following?

# cd /var/tmp
# /sbin/configd

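Another nit is that LD_LIBRARY_PATH is the wrong linker environment
option to set. It takes a list of directories to search, not the name
of a library. You probably want LD_PRELOAD there, which names a shared
object to load before any other.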

More importantly, if you intend to save a wrapper at `/sbin/configd' it
is a profoundly bad idea to recursively call `/sbin/configd' at the end
of that wrapper script. This will keep trying to run the wrapper,
within the wrapper, and a new wrapper within the wrapper of the wrapper,
ad infinitum.

You are likely to get better results by renaming the real binary (to
`/sbin/configd.save' here) and using a wrapper like:

#!/bin/sh

LD_PRELOAD="/usr/lib/libumem.so.1"
export LD_PRELOAD

/sbin/configd.save "$@"
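
Preloading the library by itself does not switch on libumem's
debugging features; those are controlled by the UMEM_DEBUG and
UMEM_LOGGING environment variables described in umem_debug(3MALLOC).
A variant of the wrapper that sets them too could look like the sketch
below. run_with_umem and UMEM_LIB are hypothetical names used only for
this illustration.

```shell
#!/bin/sh
# Sketch: start a command with libumem preloaded and its debugging enabled.
# UMEM_LIB is a hypothetical override for the library path.
UMEM_LIB=${UMEM_LIB-/usr/lib/libumem.so.1}

run_with_umem() {
    (
        # The subshell keeps these settings out of the caller's environment
        LD_PRELOAD=$UMEM_LIB
        UMEM_DEBUG=default          # turns on auditing, so allocations keep stack traces
        UMEM_LOGGING=transaction    # keep a log of allocations and frees
        export LD_PRELOAD UMEM_DEBUG UMEM_LOGGING
        exec "$@"
    )
}

# In the wrapper itself this would be called as:
#   run_with_umem /sbin/configd.save "$@"
```

The debugging settings cost memory and time, so they belong in a
troubleshooting wrapper and not in a permanent setup.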

HTH,
Giorgos
