A few years ago, I was taking a few classes from a very up and coming computer science university and during the course of one projects class came to discuss with the instructor the differences between the average students and those excelling. We determined the main differences to be, three in number:
- Time and dedication
- The ability to tinker and figure things out on one’s own
- The art of debugging.
The art of debugging is really quite simple, it is mostly knowing the tools available and how to use them. On the linux platform, surely one of the most powerful and certainly most thorough debugging tool one can use is strace.
Strace has been around since the early 90s and yet it seems not many know much about it. Basically, strace is a small program that traces, follows, and displays the system calls made by any program. You can find out what any program is doing at any time using Strace. How cool is that? And it’s very detailed, follows child threads if desired, and makes an extra pot of coffee, which you’ll need when looking through everything it generates.
In the spirit of tinkering and figuring things out, let’s run it on an already running process…
[root@main ~]# strace -v -s 10000 -f -p 17161 Process 17161 attached - interrupt to quit select(0, NULL, NULL, NULL, {0, 331000}) = 0 (Timeout) time(NULL) = 1213802387 waitpid(-1, 0xbfe3aeac, WNOHANG) = 0 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) time(NULL) = 1213802388 waitpid(-1, 0xbfe3aeac, WNOHANG) = 0 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) time(NULL) = 1213802389 waitpid(-1, 0xbfe3aeac, WNOHANG) = 0 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) time(NULL) = 1213802390 waitpid(-1, 0xbfe3aeac, WNOHANG) = 0 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) time(NULL) = 1213802391 waitpid(-1, 0xbfe3aeac, WNOHANG) = 0 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) time(NULL) = 1213802392 waitpid(-1, 0xbfe3aeac, WNOHANG) = 0 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) time(NULL) = 1213802393 waitpid(-1, 0xbfe3aeac, WNOHANG) = 0 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) time(NULL) = 1213802394 waitpid(-1, 0xbfe3aeac, WNOHANG) = 0 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) time(NULL) = 1213802395 waitpid(-1, 0xbfe3aeac, WNOHANG) = 0 select(0, NULL, NULL, NULL, {1, 0} Process 17161 detached
Alright, cool. First, the parameters I sent to the strace command were these:
- -v : verbose mode, print out as much as possible
- -s 10000 : print up to 10000 character strings (default is 32 and so you get some cut off strings)
- -f : Follow forked processes, ie children processes and threads
- -p 17161 : attach to process with the id of 17161. Great for tracing processes like apache or cpanel’s processes to see what they are doing that is causing problems or doing something you want to duplicate programatically.
Since I was just tracing the apache process, it didn’t do much except wait. You can also just specify a process to run instead of a pid, like so :
[root@main ~]# strace -v -s 10000 -f perl -e 'print "hi!n";' execve("/usr/bin/perl", ["perl", "-e", "print "hi!\n";"], ["HOSTNAME=main.highspeedweb.net", "SHELL=/bin/bash", "TERM=linux", "HISTSIZE=500", "USER=root", "LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:", "MAIL=/var/spool/mail/root", "PATH=/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/root/bin", "INPUTRC=/etc/inputrc", "PWD=/root", "LANG=en_US", "SHLVL=1", "HOME=/root", "LOGNAME=root", "LESSOPEN=|/usr/bin/lesspipe.sh %s", "G_BROKEN_FILENAMES=1", "_=/usr/bin/strace"]) = 0 uname({sysname="Linux", nodename="main.highspeedweb.net", release="2.6.9-55.0.12.ELsmp", version="#1 SMP Fri Nov 2 11:19:08 EDT 2007", machine="i686"}) = 0 brk(0) = 0x9757000 ... a bunch of other stuff... ... write(1, "hi!n", 4hi! ) = 4 exit_group(0) = ? Process 28721 detached
As you can see, it does quite a few things before getting perl to just print “hi!n” and strace shows you it all! Tinker around with it and in a later tutorial I will show you how to take strace output from cpanel to programatically run it’s functions in perl.
Have fun with it! Jay
Hmmm, this is the first time I’ve heard of Strace. I am currently on a huge Linux knowledge quest, as the time for upgrading operating systems is drawing nearer. Linux is really something special, and the work that has been put into it is quite staggering. It is soooo close to being accepted in the mainstream. I feel it wont be long for that tipping point to get reached now. Maybe that is what 2012 will all be about *chuckles*