
Apache - Troubleshooting - Determine why an Apache process hung

Sometimes a web server hangs, crashes hard, or returns malformed content.

Finding the reason for this typically requires reviewing each stack frame that led to the current frame, and selectively dumping server data structures.

Debugging in this manner takes time, and may not be appropriate for sites that require constant availability.

To determine the cause of this issue, view the stack backtrace of the hung process.


NOTE: Stack Backtrace

There are several tools that can be used to get a stack backtrace, including:

  • pstack: attaches to the active processes named by the pids on the command line, and prints out an execution stack trace, including a hint at what the function arguments are.
  • gcore: generates a core file from a hung web server process.
  • gdb: the GNU Project debugger, which lets you see what is going on 'inside' another program while it executes.

The results from these methods can be checked against the Apache bug database to see whether the problem is caused by a well-known issue.


Use pstack to print a stack backtrace for a process ID

pstack 42367

returns:

42367:  bin/httpd -k start
 ff040628 accept   (3, 11c560, 11c54c, 1)
 0004c3c4 unixd_accept (ffbff904, 7d490, 11c3a0, 0, 2710, 0) + 10
 0004a3c0 child_main (7d490, 74400, 4e2e, 74000, 0, 74000) + 2ec
 0004a6c8 make_child (4a000, 0, 1, 5, 72c00, 74000) + ec
 0004b0e8 ap_mpm_run (72c00, 74000, 74000, 74000, 74000, 74400) + 934
 000272d8 main     (7ef18, 71c00, 73800, 73800, 0, 0) + 710
 00026618 _start   (0, 0, 0, 0, 0, 0) + 5c

NOTE: This shows:

  • accept: Apache was in the accept() system call, waiting for a new connection, when the backtrace was taken.
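
To survey several server processes at once, the innermost frame can be pulled out of each pstack listing. The following is a minimal sketch, assuming output in the format shown above (address, function name, arguments) and a single-threaded process. pstack_top_frame is a hypothetical helper name, and other pstack implementations may print a different layout.

```shell
# Print the function name of the innermost frame from pstack output on stdin.
# Assumption: frame lines start with a hexadecimal address, as in the listing above.
pstack_top_frame() {
    awk '$1 ~ /^[0-9a-fA-F]+$/ && NF >= 2 { print $2; exit }'
}

# Usage (assumes pgrep is available and the server binary is named httpd):
#   for pid in $(pgrep -x httpd); do
#       printf '%s %s\n' "$pid" "$(pstack "$pid" | pstack_top_frame)"
#   done
```

A child that is idle normally reports accept here, so processes that report some other function are the ones worth a closer look.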

Some systems do not have the pstack utility. In these cases try using the gdb and gcore utilities to get a stack backtrace from a process.
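
Which of these tools is installed can be checked before an incident happens. The following is a minimal sketch using the POSIX command -v builtin. first_available is a hypothetical helper name, not part of any of these tools.

```shell
# Print the first of the named commands that is found on PATH.
# Returns 1 if none of them is installed.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Usage: prefer pstack, fall back to gdb
#   first_available pstack gdb
```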


Attach directly to a process with gdb and retrieve a stack backtrace

Run gdb with the -p option and a process ID to attach to the running process.

Then run the backtrace command at the (gdb) prompt:

gdb -q -p 3472

returns:

(gdb) backtrace
#0  0x0046e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0063b681 in accept () from /lib/tls/libpthread.so.0
#2  0x00b14814 in apr_socket_accept (new=0xbff85740, sock=0x9671538,
    connection_context=0x97115d8) at network_io/unix/sockets.c:187
#3  0x080819ce in unixd_accept (accepted=0xbff85774, lr=0x9671518, ptrans=0x97115d8) at unixd.c:466
#4  0x0807fd2e in child_main (child_num_arg=Variable "child_num_arg" is not available.) at prefork.c:621
#5  0x0807ffc2 in make_child (s=Variable "s" is not available.) at prefork.c:736
#6  0x08080050 in startup_children (number_to_start=5) at prefork.c:754
#7  0x0808089b in ap_mpm_run (_pconf=0x96730a8, plog=0x96a1160, s=0x9674f48) at prefork.c:975
#8  0x08061b08 in main (argc=3, argv=0xbff85a84) at main.c:717

NOTE: This shows:

  • accept (): Apache was in the accept() system call when gdb attached to the process.
    • accept() was called by the Apache Portable Runtime (APR) function apr_socket_accept().
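
When the interactive shell is not needed, gdb can run the backtrace command and exit on its own. The following is a minimal sketch, assuming gdb is installed and the caller is allowed to attach to the process (usually root or the process owner). gdb_backtrace is a hypothetical wrapper name.

```shell
# Print a backtrace for a running process without entering the gdb shell.
gdb_backtrace() {
    # Reject anything that is not a plain decimal PID before calling gdb.
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: gdb_backtrace PID" >&2
            return 2
            ;;
    esac
    # -batch makes gdb exit after running the -ex commands,
    # which detaches from the process and lets it continue.
    gdb -q -batch -p "$1" -ex 'backtrace'
}

# Usage: gdb_backtrace 3472
```

Batch mode keeps the time the process spends stopped under the debugger short, which matters on a server that is still handling requests.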

Use the gcore utility to write a core file for a hung process without terminating it

gcore 4932

Use the gdb utility to retrieve a stack backtrace from the core file

gdb -q /usr/sbin/httpd core.4932

returns:

(gdb) backtrace
#0  0x0046e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0063b681 in accept () from /lib/tls/libpthread.so.0
#2  0x00b14814 in apr_socket_accept (new=0xbff85740, sock=0x9671538, 
    connection_context=0x97115d8) at network_io/unix/sockets.c:187
#3  0x080819ce in unixd_accept (accepted=0xbff85774, lr=0x9671518, ptrans=0x97115d8) at unixd.c:466
#4  0x0807fd2e in child_main (child_num_arg=Variable "child_num_arg" is not available.) at prefork.c:621
#5  0x0807ffc2 in make_child (s=Variable "s" is not available.) at prefork.c:736
#6  0x08080050 in startup_children (number_to_start=5) at prefork.c:754
#7  0x0808089b in ap_mpm_run (_pconf=0x96730a8, plog=0x96a1160, s=0x9674f48) at prefork.c:975
#8  0x08061b08 in main (argc=3, argv=0xbff85a84) at main.c:717

NOTE: This shows:

  • accept (): Apache was in the accept() system call when the core file was generated.
    • accept() was called by the Apache Portable Runtime (APR) function apr_socket_accept().
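
A backtrace like the ones above can be reduced to its call chain for pasting into a bug search. The following is a minimal sketch, assuming gdb output in the "#N  address in function (...)" layout shown above. gdb_frame_names is a hypothetical helper name.

```shell
# List the function name of each frame in gdb backtrace output read from stdin.
# Continuation lines (wrapped arguments) do not start with '#' and are skipped.
gdb_frame_names() {
    awk '/^#[0-9]+/ { if ($3 == "in") print $4; else print $2 }'
}

# Usage (PID 4932 and /usr/sbin/httpd are the example values from the text):
#   gcore 4932
#   gdb -q -batch -ex backtrace /usr/sbin/httpd core.4932 | gdb_frame_names
```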