Debug HOWTO
What is this?
This short HOWTO explains how to gather the information I need to fix the software after a crash. With the information requested here, I can figure out what went wrong and fix the problem. It takes some effort on your part, but in the end it helps to solve your problem.
What do I need?
You will need the Aquila source and gdb, the GNU Debugger. Please consult the documentation of your platform (Linux distribution) on how to install gdb.
What do I do?
First of all, configure the hub software with the extra --enable-debug option.
./configure --enable-debug
If you needed extra options the first time you built the hub, you will need them again now. Then build and install the new hub:
make install
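If you no longer remember the options of the first build, they can often be recovered. The following is a minimal sketch that assumes the source tree still contains the config.log written by an autoconf-generated configure script, which records the original invocation on a line starting with "  $ ". The function name previous_configure_options is hypothetical and not part of Aquila.

```shell
# Hypothetical helper: print the options passed to the previous ./configure run.
# Assumption: an autoconf-style config.log exists (default: ./config.log).
previous_configure_options() {
    log=${1:-config.log}
    # autoconf records the invocation as a line of the form "  $ ./configure ..."
    sed -n 's/^  \$ [^ ]*configure *//p' "$log" | head -n 1
}

# Example use, from the source directory:
#   previous_configure_options
#   ./configure <options printed above> --enable-debug
```

Copy the printed options by hand rather than substituting them unquoted into a new command line, since options that contain spaces would be split incorrectly.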
Then, in your configuration directory, create a file called "core" and make it writable for everyone.
touch core
chmod a+w core
Before you start the hub, do not forget to raise the maximum size of the core file. On most systems the default is 0 bytes, which means no core file is written.
ulimit -c 1000000
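The limit applies only to the current shell and the processes started from it, so it has to be set in the same shell that starts the hub. As a minimal sketch, the check below raises the limit only when it is too low and reports failure when the system does not allow the requested value. The function name ensure_core_limit is hypothetical, and the unit of the number depends on the shell.

```shell
# Hypothetical helper: make sure the soft core-file limit is at least WANT.
# Returns non-zero if the limit could not be raised (e.g. hard limit too low).
ensure_core_limit() {
    want=$1
    cur=$(ulimit -c)
    # Nothing to do if the limit is already unlimited or large enough.
    if [ "$cur" = "unlimited" ] || [ "$cur" -ge "$want" ]; then
        return 0
    fi
    ulimit -c "$want" 2>/dev/null
}

# 1000000 is the value used in this HOWTO.
ensure_core_limit 1000000 || echo "could not raise the core file limit" >&2
```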
Online debugging
Now, if you can afford to keep a shell open at all times, run the hub under the debugger:
# gdb aquila
GNU gdb 6.4
Copyright 2005 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
(gdb)
At the prompt, first set a breakpoint at CrashHandler to prevent debugger collisions, then enter "run" to start the hub.
(gdb) b CrashHandler
Breakpoint 1 at 0xXXXXXXXX: file stacktrace.c, line 694.
(gdb) run
Starting program: /usr/local/bin/aquila
...
Now wait until it crashes, or trigger the crash if you know how. When the hub has crashed, type bt, info locals and info args. If gdb replies with "No symbol table info available.", type up and repeat info locals and info args until it replies with something else.
...
Program received signal SIGSEGV, Segmentation fault.
0x00002b75f9ae9696 in strcpy () from /lib/libc.so.6
(gdb) bt
#0 0x00002b75f9ae9696 in strcpy () from /lib/libc.so.6
#1 0x000000000041113c in handler_bug (user=0x5998a0, output=0x5b81b0, priv=0x0, argc=1,
argv=0x7fffb182e040) at builtincmd.c:1382
#2 0x000000000040d8d1 in cmd_parser (user=0x5998a0, target=0x0, priv=0x0, event=2,
token=0x597b90) at commands.c:272
#3 0x000000000040dafa in cmd_parser_main (user=0x5998a0, priv=0x0, event=2, token=0x597b90)
at commands.c:335
#4 0x000000000040c736 in plugin_send_event (priv=0x599890, event=2, token=0x597b90)
at plugin.c:808
#5 0x0000000000415c18 in proto_nmdc_state_online (u=0x599560, tkn=0x7fffb182eaa0,
b=0x597b90) at nmdc_protocol.c:1583
#6 0x0000000000417c44 in proto_nmdc_handle_token (u=0x599560, b=0x597b90)
at nmdc_protocol.c:2370
#7 0x00000000004192de in proto_nmdc_handle_input (user=0x599560, buffers=0x599510)
at nmdc_protocol.c:2792
#8 0x0000000000407fb1 in server_handle_input (s=0x5997f0) at hub.c:432
#9 0x0000000000404dc5 in esocket_select (h=0x5979b0, to=0x7fffb182ed20) at esocket.c:891
#10 0x00000000004255e1 in main (argc=1, argv=0x7fffb182ee38) at main.c:131
(gdb) info locals
No symbol table info available.
(gdb) info args
No symbol table info available.
(gdb) up
#1 0x000000000041113c in handler_bug (user=0x5998a0, output=0x5b81b0, priv=0x0, argc=1,
argv=0x7fffbbbff410) at builtincmd.c:1382
1382 strcpy ((void *) 1L, "");
(gdb) info locals
No locals.
(gdb) info args
user = (plugin_user_t *) 0x5998a0
output = (buffer_t *) 0x5b81b0
priv = (void *) 0x0
argc = 1
argv = (unsigned char **) 0x7fffbbbff410
(gdb)
If you wish, you can repeat this a few times. The more information I get, the more likely I will be able to fix the problem.
(gdb) up
#2 0x000000000040d8d1 in cmd_parser (user=0x5998a0, target=0x0, priv=0x0, event=2,
token=0x597b90) at commands.c:272
272 cmd->handler (user, output, priv, argc, argv);
(gdb) info locals
output = (buffer_t *) 0x5b81b0
hash = 1239898521
i = 256
local = (buffer_t *) 0x5b6160
c = (unsigned char *) 0x5b61ab ""
t = 0 '\0'
argc = 1
argv = {0x5b61a8 "bug", 0x0 <repeats 255 times>}
cmd = (command_t *) 0x54f310
list = (command_t *) 0x538ef8
(gdb) info args
user = (plugin_user_t *) 0x5998a0
target = (plugin_user_t *) 0x0
priv = (void *) 0x0
event = 2
token = (buffer_t *) 0x597b90
(gdb)
Offline debugging
If you cannot afford to keep a shell open, just start the hub as you normally would and wait until it crashes. Then you can retrieve the information from the core file.
gdb aquila core
You get a slightly different output than above:
GNU gdb 6.4
Copyright 2005 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
Core was generated by `./aquila'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib64/libz.so.1...done.
Loaded symbols for /lib/libz.so.1
Reading symbols from /usr/lib64/liblua.so...done.
Loaded symbols for /usr/lib/liblua.so
Reading symbols from /usr/lib64/liblualib.so...done.
Loaded symbols for /usr/lib/liblualib.so
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib64/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib64/libreadline.so.5...done.
Loaded symbols for /lib/libreadline.so.5
Reading symbols from /lib64/libhistory.so.5...done.
Loaded symbols for /lib/libhistory.so.5
Reading symbols from /lib64/libncurses.so.5...done.
Loaded symbols for /lib/libncurses.so.5
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
#0 0x00002b21cbce1696 in strcpy () from /lib/libc.so.6
(gdb)
Now the commands work exactly as above. Type bt and info locals at the gdb prompt. You can type up and repeat the process a few times if you want.
(gdb) bt
#0 0x00002b21cbce1696 in strcpy () from /lib/libc.so.6
#1 0x0000000000412106 in handler_bug (user=0x5e7a70, output=0x5e8c60, priv=0x0, argc=1, argv=0x7fffdf632f20) at builtincmd.c:1430
#2 0x000000000040e5cd in cmd_parser (user=0x5e7a70, target=0x5af4b0, priv=0x0, event=4, token=0x5e3e60) at commands.c:272
#3 0x000000000040e7af in cmd_parser_priv (user=0x5af4b0, priv=0x0, event=4, token=0x5e3e60) at commands.c:327
#4 0x000000000040d37d in plugin_send_event (priv=0x5af4a0, event=4, token=0x5e3e60) at plugin.c:827
#5 0x00000000004186b3 in proto_nmdc_state_online (u=0x5d35d0, tkn=0x7fffdf6339b0, b=0x5e3e60) at nmdc_protocol.c:2189
#6 0x000000000041902c in proto_nmdc_handle_token (u=0x5d35d0, b=0x5e3e60) at nmdc_protocol.c:2399
#7 0x000000000041a6de in proto_nmdc_handle_input (user=0x5d35d0, buffers=0x5b23d0) at nmdc_protocol.c:2825
#8 0x000000000040873e in server_handle_input (s=0x5e3fe0) at hub.c:491
#9 0x00000000004052ed in esocket_select (h=0x5aefe0, to=0x7fffdf636b90) at esocket.c:1019
#10 0x0000000000427a20 in main (argc=1, argv=0x7fffdf636ca8) at main.c:252
...
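The same bt, info locals, info args and up sequence can also be collected without typing at the prompt. This is a minimal sketch, not part of Aquila: it writes a gdb command file and relies on gdb's -batch and -x options to run it against the core file. The function name write_gdb_commands and the file names crash.gdb and crash-report.txt are hypothetical.

```shell
# Hypothetical helper: write a gdb command file that prints the backtrace and
# then the locals and arguments of the innermost FRAMES stack frames.
# Usage: write_gdb_commands FRAMES OUTFILE
write_gdb_commands() {
    frames=$1
    out=$2
    {
        echo "bt"
        i=0
        while [ "$i" -lt "$frames" ]; do
            echo "info locals"
            echo "info args"
            echo "up"
            i=$((i + 1))
        done
    } > "$out"
}

# Example use, with gdb installed and a core file in the current directory:
#   write_gdb_commands 3 crash.gdb
#   gdb -batch -x crash.gdb aquila core > crash-report.txt 2>&1
```

Keep the frame count below the depth of the backtrace: gdb stops executing a command file at the first error, and up fails once the outermost frame is reached.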
If the hub's debug output is a problem, you can solve that in two ways. (Please note that daemontools users will not have any problems with the output.) First, start the hub in daemon mode (preferred; all output will be discarded):
aquila -d
Second, configure the hub without debugging features but with debugging information. This method is better if you have CPU problems or if the bug doesn't seem to happen on a debug build.
CFLAGS="-ggdb -O0" ./configure
The examples are taken from the command "bug" (which will crash the hub).
What do I do with all this then?
You send it to me! Post this data on the forum with a description of how you triggered the problem (if you know). If the core file is non-empty, save a copy of it and of the aquila executable you used; I might ask for them if the backtrace is not enough.
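Saving the two files together makes sure the core file stays matched with the exact executable that produced it. The following is a minimal sketch under that assumption. The function name save_crash_files and the archive name are hypothetical, and the paths are whatever applies to your installation.

```shell
# Hypothetical helper: archive the core file together with the executable.
# Usage: save_crash_files CORE BINARY ARCHIVE
save_crash_files() {
    core=$1
    binary=$2
    archive=$3
    # -s is true only if the file exists and is larger than zero bytes.
    if [ ! -s "$core" ]; then
        echo "core file '$core' is missing or empty, nothing to save" >&2
        return 1
    fi
    tar -czf "$archive" "$core" "$binary"
}

# Example use, from the directory that contains both files:
#   save_crash_files core aquila aquila-crash.tar.gz
```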

