Sending signals
There's more than one way to kill a process.
Which should you use?
Summary
This month we explain some basic Unix signals, specifically SIGTERM,
SIGQUIT, SIGINT, and SIGHUP. These signals can be used in conjunction with
the trap and kill commands to end processes in a variety of different ways. We show you how they work. (1,700 words)
In Unix, signals are sent to running processes to indicate that an event,
exterior to the process, has occurred and that the process must
respond.
The simplest example of this is the hang up signal, or SIGHUP in
Unix-speak. When a user is logged on from a remote terminal, the
line can hang up for a number of reasons: phone outages, modem
problems, or a power loss at the remote terminal. All of these
conditions cause a program to go "out of control." In other words,
a program or process that was being run from a terminal is no
longer under the control of that terminal. The program itself
needs to know what to do in that case. The Unix operating system
keeps track of which processes are being run by which terminal,
and when the terminal hangs up (drops the connection) the
operating system sends a SIGHUP signal to all the programs that
were launched from that terminal.
The process has three options when it receives the SIGHUP signal.
- The process does the default action, which is to stop
executing immediately.
- The process can be programmed to catch the signal (called
trapping the signal) and ignore it. The process continues
running.
- The process can be programmed to catch (trap) the signal and
do something sensible such as close all open files and exit.
All three options have their place depending on the type of
program that is being run.
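The three options can be sketched with the shell's trap command. This is a minimal illustration rather than one of the article's listings: each child shell sends SIGHUP to itself with kill -HUP $$ so that the result does not depend on timing, and the variable names are invented for the example.

```shell
#!/bin/sh
# Option 1, default action: the child is terminated by SIGHUP, so the
# exit status seen by the parent is greater than 128.
default_status=0
sh -c 'kill -HUP $$; echo "still running"' || default_status=$?

# Option 2, ignore: an empty trap action discards the signal.
ignore_output=$(sh -c 'trap "" HUP; kill -HUP $$; echo "still running"')

# Option 3, catch and clean up: run a handler, then exit on purpose.
cleanup_output=$(sh -c 'trap "echo cleaning up; exit 0" HUP; kill -HUP $$; echo "still running"')

echo "default: exit status $default_status"
echo "ignore:  $ignore_output"
echo "cleanup: $cleanup_output"
```

Only the second child reaches its final echo. The first is stopped by the default action, and the third exits from inside its handler.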
The kill command
The kill command can be used to send a signal to a running process.
You can use ps -ef to locate a process ID, and then use
kill to send a signal to that process. But before you start killing
processes, it is necessary to understand a little about what signals
do and how programs handle them.
In order to see some practical results, let's try a few simple
examples. First switch into the Korn shell by typing ksh and
pressing enter. For these examples I will be using a short script
and the kill command. The kill command lets you send a specific
signal to a process. The syntax for kill is:
kill -SignalNumber ProcessID
The hang up signal is 1 (one), so the command kill -1 1234
would send the hang up signal to process 1234 just as if the phone
line had been hung up.
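Most shells also accept the signal by name, which can be easier to read than the number. The sketch below assumes a throwaway sleep process as the target (it is a stand-in, not a real application) and uses wait to show how the shell reports a process that was ended by a signal.

```shell
#!/bin/sh
# A throwaway background process stands in for a real program.
sleep 60 &
target=$!

# Send the hang up signal by name. kill -1 "$target" would be the
# numeric equivalent.
kill -HUP "$target"

# wait reports how the target ended; a status above 128 means it was
# terminated by a signal rather than exiting on its own.
status=0
wait "$target" || status=$?
echo "target ended with status $status"
```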
Listing 1 is a short script that sleeps for five seconds, wakes up,
prints a message that it is awake and then goes back to sleep.
Type this script in and save it as waiter , and then change its
mode so that it can be executed by typing:
chmod a+x waiter
Listing 1: waiter
while true
do
sleep 5
echo "Awake"
done
You can run Listing 1 in the background. To do this, type waiter
followed by an ampersand (&) on the command line. After you press
Enter, something like the following will appear:
waiter &
[1] 4567
Make a note of the second number, as it will vary. The number in
brackets is the shell's job number, and the second number is the
process ID of the waiter process. This job is now running as a
background process, but because it is echoing the word "Awake"
every five seconds, Awake will appear on your terminal at
five-second intervals.
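Inside a script there is no screen to copy the number from, but the shell keeps the process ID of the most recent background command in the special parameter $!. The following sketch, again using sleep as a stand-in for waiter, saves that value and uses kill -0 (which sends no signal and only reports whether the process can be signalled) to check on the process before and after stopping it.

```shell
#!/bin/sh
sleep 60 &
pid=$!    # process ID of the most recent background command

# kill -0 delivers nothing; it succeeds only if the process exists.
if kill -0 "$pid" 2>/dev/null; then alive_before=yes; else alive_before=no; fi

kill -1 "$pid"                      # send the hang up signal
wait "$pid" 2>/dev/null || true     # collect the finished process

if kill -0 "$pid" 2>/dev/null; then alive_after=yes; else alive_after=no; fi
echo "alive before: $alive_before, alive after: $alive_after"
```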
You can stop the job by typing the following command, but use the
actual process ID number that appeared when you started the job:
kill -1 4567
Don't worry if the word "Awake" butts into the middle of your
typing. It won't affect the command you are typing. Finish typing
the command and press Enter. The kill command sends the hang up
signal to waiter and waiter simply does the default action of
dying. At this point Awake stops appearing at your terminal.
If Awake continues to appear on your terminal, make sure that you
have noted the process ID correctly and try the kill command
again. If that doesn't work, repeat the kill command but change
the -1 to -9. There will be more about -9 in just a moment.
In Listing 2, waiter has been modified to trap the hang up
signal. First a function is created that prints the fact that a
hang up signal has been received. Then the trap command is used
to set up a trap for the hang up signal. Instead of quitting, the
program executes the function echo01 and then continues.
Listing 2: Waiter with a trap
function echo01 {
echo "Received signal 1 (SIGHUP)"
}
trap echo01 1
while true
do
sleep 5
echo "Awake"
done
Now if you start waiter with an ampersand and try to kill it with
-1, your screen will look something like this:
$ waiter &
[1] 951
$ Awake
Awake
kill -1 951
$ Received signal 1 (SIGHUP)
Awake
Awake
The trap in the program now catches signal 1, displays a message,
and continues running. You can stop the program by sending a
different signal, for example with kill -2 or kill -9.
This technique is used in large applications. There the program is
frequently in the middle of important or complicated actions, like
closing open files, that should not be handled in "drop dead"
fashion.
Listing 3 is a closer approximation of how a large application
would handle a hang up signal.
Listing 3: Waiter with a bigger trap
function echo01 {
echo "Received signal 1 (SIGHUP)"
echo "Now I would close files if I had any open."
exit
}
trap echo01 1
while true
do
sleep 5
echo "Awake"
done
Start this latest version of waiter with an ampersand and try to
kill it with -1. Your screen will look something like this, and the
program will stop executing:
$ waiter &
[1] 951
$ Awake
Awake
kill -1 951
$ Received signal 1 (SIGHUP)
Now I would close files if I had any open.
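To make the "close files" step concrete, here is a hedged sketch in which the trap actually removes a scratch file before exiting. The file, the cleanup function, and the worker name are all hypothetical, and the example assumes the mktemp command is available. The worker sends SIGHUP to itself so the example finishes without any typing.

```shell
#!/bin/sh
# Hypothetical scratch file standing in for an application's open files.
scratch=$(mktemp)

output=$(sh -c '
    scratch=$1
    cleanup() {
        rm -f "$scratch"
        echo "closed and removed scratch file"
        exit 0
    }
    trap cleanup HUP
    echo "work in progress" > "$scratch"
    kill -HUP $$          # simulate the terminal hanging up
    echo "not reached"
' worker "$scratch")

echo "$output"
if [ -e "$scratch" ]; then echo "scratch file left behind"; fi
```

Because the handler ends with exit, the line after the kill command never runs, which is the same behavior Listing 3 shows.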
SIGKILL: The signal that cannot be ignored
So what does kill -9 do? Signal 9 is SIGKILL, and it
cannot be trapped. If you send a signal 9 to a process you are
telling the operating system to cut it off at the knees -- drop dead
now. The advantage of signal 9 is that the program cannot trap it
and ignore it. The disadvantage of signal 9 is that the program
cannot intercept it and perform an orderly shut down even if it
needs to.
Listing 4 is waiter, modified with a trap for signal 9.
Listing 4: Waiter trying to trap signal 9
function echo09 {
echo "Received signal 9 (SIGKILL)"
echo "Now I would close files if I had any open."
exit
}
trap echo09 9
while true
do
sleep 5
echo "Awake"
done
Start version 4 of waiter with an ampersand and try to kill it
with -9. Your screen will look something like this, and the
program will stop executing:
$ waiter &
[1] 1151
$ Awake
Awake
kill -9 1151
$
There is no friendly message, no "now I am trying to close files"
information on the screen. The process died where it stood on
receipt of a signal 9 even though a trap was prepared for it.
Using kill -9 on a process that controls a database application
or a program that updates files can be disastrous.
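The difference is easy to measure. The sketch below starts two copies of a hypothetical worker that records when its cleanup routine runs, then stops one with signal 15 and the other with signal 9. The marker files and names are assumptions made for this example, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Scratch directory for marker files (assumes mktemp is available).
marker_dir=$(mktemp -d)

# Hypothetical worker: announces when its trap is set, then idles.
# On SIGTERM it records that cleanup ran. $1 is the marker file prefix.
worker='
    marker=$1
    cleanup() { touch "$marker.cleaned"; exit 0; }
    trap cleanup TERM
    touch "$marker.ready"
    while true; do sleep 1; done
'
sh -c "$worker" worker "$marker_dir/term" &
term_pid=$!
sh -c "$worker" worker "$marker_dir/kill" &
kill_pid=$!

# Wait until both workers have installed their traps.
for name in term kill; do
    until [ -e "$marker_dir/$name.ready" ]; do sleep 1; done
done

kill -15 "$term_pid"    # SIGTERM: the trap gets a chance to run
kill -9 "$kill_pid"     # SIGKILL: the process dies where it stands
wait "$term_pid" || true
wait "$kill_pid" || true

if [ -e "$marker_dir/term.cleaned" ]; then term_cleaned=yes; else term_cleaned=no; fi
if [ -e "$marker_dir/kill.cleaned" ]; then kill_cleaned=yes; else kill_cleaned=no; fi
echo "cleanup after signal 15: $term_cleaned"
echo "cleanup after signal 9:  $kill_cleaned"
rm -rf "$marker_dir"
```

Only the worker that received signal 15 leaves a "cleaned" marker. The other has the same trap installed but never gets to run it.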
Most well-behaved processes are written to allow an orderly shut
down when some signal other than 9 is received. Signal 15, SIGTERM,
is the conventional request for an orderly shut down, and signal 1,
SIGHUP, is also widely handled. Many applications intercept and
shut down correctly for most signals, so try 15, 1, and the others
below before you try 9.
Signals you should know
Here are some of the other common signals and what causes them to
be generated.
1 SIGHUP, hang up -- Caused by the phone line or terminal
connection being dropped.
2 SIGINT, interrupt -- Generated from the keyboard, usually by
Ctrl-C, Backspace, or Delete. To find out which, type stty -a and
press Enter. In the listing you will find intr = DEL, or
intr = ^C, or intr = ^H, or something similar.
3 SIGQUIT, quit -- Also generated from the keyboard, usually by
Ctrl-\ or Ctrl-Y. To find out which, type stty -a and press Enter.
In the listing you will find quit = ^\, or quit = ^Y, or something
similar. A SIGQUIT often causes a core file to be created. This is
a copy of the process's memory at the moment it was stopped.
15 SIGTERM, software terminate -- This is usually generated by
another program. In fact, the kill command uses 15 as the default.
If you type kill followed by a process ID, with no signal number,
kill sends signal 15 to that process. Using kill without a signal
number is usually a good place to start when killing a process.
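If you forget which number goes with which name, the kill command itself can tell you: kill -l followed by a number prints the matching signal name (without the SIG prefix), and kill -l on its own lists every signal the system knows. A small sketch, with variable names chosen only for illustration:

```shell
#!/bin/sh
# Ask kill for the name behind each signal number discussed above.
listing=""
for number in 1 2 3 9 15; do
    name=$(kill -l "$number")
    listing="$listing$number=SIG$name "
done
echo "$listing"
```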
It requires extra time and coding to write a trap for a signal
into a program. When a trap has been written into a program, it
is usually for good reason. If the program can simply die without
doing any cleanup, then why go to the trouble of including a
trap? That's why it is a good idea to try 15 and 1 and 2 before
ever resorting to 9.
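That advice can be written down as a small helper. The function below, stop_gently, is hypothetical and only a sketch: it sends 15, 1, and 2 in turn, pauses for an arbitrary one second after each, and falls back to 9 only if the process is still there. The stubborn test process, which ignores signal 15, is also invented for the demonstration.

```shell
#!/bin/sh
# stop_gently PID: try signals 15, 1, and 2 before resorting to 9.
# Returns success once the process can no longer be signalled.
stop_gently() {
    pid=$1
    for sig in TERM HUP INT KILL; do
        kill -0 "$pid" 2>/dev/null || return 0    # already gone
        kill -"$sig" "$pid" 2>/dev/null || return 0
        last_signal=$sig
        sleep 1                                   # give it time to exit
    done
    ! kill -0 "$pid" 2>/dev/null
}

# A stubborn stand-in process that ignores signal 15.
sh -c 'trap "" TERM; while true; do sleep 1; done' &
stubborn=$!
sleep 1    # let it install its trap

if stop_gently "$stubborn"; then result=stopped; else result=survived; fi
wait "$stubborn" 2>/dev/null || true
echo "$result (last signal sent: SIG$last_signal)"
```

A real application may need longer than one second to close its files, so the pause is a value to tune rather than a recommendation.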