[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6. Debugging

This chapter of the manual includes tips that are useful when debugging arla.

Arla and xfs contains logging facilities that is quite useful when debugging when something goes wrong. This and some kernel debugging tips are described.

6.1 Arlad  
6.2 Debugging LWP with GDB  
6.3 xfs  
6.4 xfs on linux  
6.5 Debugging techniques  
6.6 Kernel debuggers  
6.7 Darwin/MacOS X  


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.1 Arlad

If arlad is run without any arguments arlad will fork(2) and log to syslog(3). To disable forking use the --no-fork (-n) switch. In the current state of the code, arlad is allways to be started with the recover (-z) switch. This will invalidate your cache at startup. This restriction may be dropped in the future.

To enable more debuggning run arla with the switch --debug=module1,module2,... One useful combination is
 
   --debug=all,-cleaner 
The cleaner output is usully not that intresting and can be ignored.

A convenient way to debug arlad is to start it inside gdb.
 
datan:~# gdb /usr/arla/libexec/arlad
(gdb) run -z -n
This gives you the opportunity to examine a crashed arlad.
 
(gdb) bt
The arla team appreciates cut and paste information from the beginning to the end of the bt output from such a gdb run.

To set the debugging with a running arlad use fs arladeb as root.

 
datan:~# fs arladeb
arladebug is: none
datan:~# fs arladeb almost-all
datan:~#

By default, arlad logs through syslog if running as a daemon and to stderr when running in the foreground (with --no-fork).


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.2 Debugging LWP with GDB

For easy tracing of threads we have patch (http://www.stacken.kth.se/projekt/arla/gdb-4.18-backfrom.diff) for gdb 4.18 (a new command) and a gdb sequence (think script).

The sequence only works for i386, but its just matter of choosing different offset in topstack to find $fp and $pc in the lwp_ps_internal part of the sequence.

You should copy the `.gdbinit' (that you can find in the arlad directory in the source-code) to your home-directory, the directory from where you startat the patched gdb or use flag -x to gdb.

Your debugging session might look like this:

 
(gdb) lwp_ps
Runnable[0]
 name: IO MANAGER
  eventlist:
  fp: 0x806aac4
  pc: 0x806aac4
 name: producer
  eventlist: 8048b00
  fp: 0x8083b40
  pc: 0x8083b40
Runnable[1]
[...]
(gdb) help backfrom
Print backtrace of FRAMEPOINTER and PROGRAMCOUNTER.

(gdb) backfrom 0x8083b40 0x8083b40
#0  0x8083b40 in ?? ()
#1  0x8049e2f in LWP_MwaitProcess (wcount=1, evlist=0x8083b70)
    at /afs/e.kth.se/home/staff/lha/src/cvs/arla-foo/lwp/lwp.c:567
#2  0x8049eaf in LWP_WaitProcess (event=0x8048b00)
    at /afs/e.kth.se/home/staff/lha/src/cvs/arla-foo/lwp/lwp.c:585
#3  0x8048b12 in Producer (foo=0x0)
    at /afs/e.kth.se/home/staff/lha/src/cvs/arla-foo/lwp/testlwp.c:76
#4  0x804a00c in Create_Process_Part2 ()
    at /afs/e.kth.se/home/staff/lha/src/cvs/arla-foo/lwp/lwp.c:629
#5  0xfffefdfc in ?? ()
#6  0x8051980 in ?? ()

There also the possibility to run arla with pthreads (run configure with --with-pthreads).


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.3 xfs

XFS debugging does almost look the same on all platforms. They all share same debugging flags, but not all are enabled on all platforms.

Change the debugging with the fs xfsdebug command.

 
datan:~# fs xfsdebug
xfsdebug is: none
datan:~# fs xfsdebug almost-all
datan:~#

If it crashes before you have an opportunity to set the debug level, you will have to edit `xfs/your-os/xfs_deb.c' and recompile.

The logging of xfs ends up in your syslog. Syslog usully logs to /var/log or /var/adm (look in /etc/syslog.conf).


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.4 xfs on linux

There is a problem with klogd, it's too slow. Cat the `/proc/kmsg' file instead. Remember to kill klogd, since the reader will delete the text from the ring-bufer, and you will only get some of the message in your cat.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.5 Debugging techniques

Kernel debugging can sometimes force you to exercise your imagination. We have learned some different techniques that can be useful.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.5.1 Signals

On operatingsystems with kernel debugger that you can use probably find where in the kernel a user-program live, and thus deadlocks or trigger the bad event, that later will result in a bug. This is a problem, how do you some a process to find where it did the intresting thing when you can't set a kernel breakpoint ?

One way to be notified is to send a signal from the kernel module (psignal() on a BSD and force_sig() on linux). SIGABRT() is quite useful if you want to force a coredump. If you want to continue debugging, use SIGSTOP.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.5.2 Recreateable testcases

Make sure bugs don't get reintroduced.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.6 Kernel debuggers

Kernel debuggers are the most useful tool when you are trying to figure out what's wrong with xfs. Unfortunately they also seem to have their own life and do not always behave as expected.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.6.1 Using GDB

Kernel debugging on NetBSD, OpenBSD, FreeBSD and Darwin are almost the same. You get the idea from the NetBSD example below:

 
  (gdb) file netbsd.N
  (gdb) target kcore netbsd.N.core
  (gdb) symbol-file /sys/arch/i386/compile/NUTCRACKER/netbsd.gdb

This example loads the kernel symbols into gdb. But this doesn't show the xfs symbols, and that makes your life harder.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.6.2 Getting all symbols loaded at the same time

If you want to use the symbols of xfs, there is a gdb command called `add-symbol-file' that is useful. The symbol file is obtained by loading the kernel module xfs with `kmodload -o /tmp/xfs-sym' (Darwin) or `modload' (NetBSD and OpenBSD). FreeBSD has a linker in the kernel that does the linking instead of relying on `ld'. The symbol address where the module is loaded get be gotten from `modstat', `kldstat' or `kmodstat' (it's in the area field).

If you forgot the to run modstat/kldstat/kmodstat, you can extract the information from the kernel. In Darwin you look at the variable kmod (you might have to case it to a (kmod_info_t *). We have seen gdb loose the debugging info). kmod is the start of a linked list. Other BSDs have some variant of this.

You should also source the commands in /sys/gdbscripts (NetBSD), or System/xnu/osfmk/.gdbinit (Darwin) to get commands like ps inside gdb.

 
  datan:~# modstat Type Id Off Loadaddr Size Info Rev Module
  Name DEV 0 29 ce37f000 002c ce388fc0 1 xfs_mod [...]
  [...]
  (gdb) add-symbol-table xfs.sym ce37f000


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.6.3 Debugging processes, examine their stack, etc

One of diffrencies between the BSD's are the proc, a command that enables you do to a backtrace on all processes. On FreeBSD you give the proc command a `pid', but on NetBSD and OpenBSD you give a pointer to a struct proc.

After you have ran proc to set the current process, you can examine the backtrace with the regular backtrace command.

Darwin does't have a proc command. Instead you are supposed to use gdb sequences (System/xnu/osfmk/.gdbinit) to print process stacks, threads, activations, and other information.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.6.4 Debugging Linux

You can't get a crashdump for linux with patching the kernel. There are two projects have support for this. Mission Critical Linux http://www.missioncritiallinux.com and SGI http://oss.sgi.com/.

Remember save the context of /proc/ksyms before you crash, since this is needed to figure out where the xfs symbols are located in the kernel.

But you can still use the debugger (or objdump) to figure out where in the binary that you crashed.

ksymoops can be used to create a backtrace.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.6.5 Using adb

Adb is not a symbolic debugger, this means that you have to read the disassembled object-code to figure out where it made the wrong turn and died. You might be able to use GNU objdump to list the assembler and source-code intertwined (`objdump -S -d mod_xfs.o'). Remember that GNU binutils for sparc-v9 isn't that good yet.

You can find the script that use use for the adb command `$<' in `/usr/lib/adb' and `/usr/platform/PLATFORNAME/adb'.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.6.6 Debugging a live kernel

An important thing to know is that you can debug a live kernel too, this can be useful to find dead-locks. To attach to a kernel you use a command like this on a BSD system (that is using gdb):

 
  (gdb) file /netbsd
  (gdb) target kcore /dev/mem
  (gdb) symbol-file /sys/arch/i386/compile/NUTCRACKER/netbsd.gdb

And on Solaris:

 
  # adb -k /dev/ksyms /dev/mem


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.6.7 Other useful debugging tools

Most diagnosics tools like ps, dmesg, and pstat on BSD systems used to look in kernel memory to extract information (and thus earned the name kmem-groovlers). On some systems they have been replaced with other method of getting their data, like /proc and sysctl.

But due to their heritage they can still be used in with a kernel and coredump to extract information on some system.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.7 Darwin/MacOS X

You'll need two computers to debug arlad/xfs on darwin since the common way to debug is to use a remote kernel-debugger over IP/UDP.

First you need to publish the arp-address of the computer that you are going to crash.

We have not found any kernel symbols in MacOSX Public Beta, so you should probably build your own kernel. Use Darwin xnu kernel source with cvs-tag: Apple-103-0-1 (not xnu-103).

 
gdb xfs.out
target remote-kdp
add-symbol-table ...
attach <host>


[ << ] [ >> ]           [Top] [Contents] [Index] [ ? ]

This document was generated on January, 8 2001 using texi2html