The language of shells

Making sense of shell commands

Summary
Working with shells can be difficult, as they require unusual and specific combinations of words and punctuation. This month, Mo Budlong helps you out by explaining some basic commands, such as ls, echo, and man. Also, Mo corrects a problem from May's Unix 101 in a sidebar. (1,300 words)

From the end user's perspective, the shell is the most important program on the Unix system because it is the user's interface to the Unix system kernel. The shell reads and interprets strings of characters and words.

The shells operate in a simple loop:

Accept a command
Interpret the command
Execute the command
Wait for another command
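
The loop above can be sketched in the shell language itself. This is only a toy, assuming commands arrive on standard input rather than from a keyboard. The function name toy_shell is made up for illustration, and a real shell does far more work at the interpret step.

```shell
# toy_shell: a hypothetical, minimal command loop (not how any real shell is built).
toy_shell() {
    # Accept: read one command line at a time from standard input.
    while IFS= read -r line; do
        # Ignore empty lines, as a real shell does.
        if [ -n "$line" ]; then
            # Interpret and execute: eval parses the line and runs it.
            eval "$line"
        fi
        # Wait: control returns to read only after the command has finished.
    done
}

# Feed the loop two commands and capture what they print.
output=$(printf 'echo first\necho second\n' | toy_shell)
echo "$output"
```

The two echo commands run in order, one after the other, because the loop does not read the second line until the first command is done.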

The shell displays a prompt, notifying the user that it is ready to accept a command. It would be nice if you could speak or type instructions into the computer in some form of natural language.

OK, Hal. Sort out my correspondence, throw out anything
that is too old, and archive the rest.

Unfortunately, the shell recognizes a very limited set of command words, so the user must offer commands in a way that it understands. This means learning to string odd words and punctuation together.

Each shell command consists of a command name, followed, if desired, by command options and arguments. The command name, options, and arguments are separated by blank space.

The shell is one of many programs that the Unix kernel can run for you. When the kernel is running a program, that program is called a process. The kernel can run the same program many times (one shell for each user), and each running copy of the program is a separate process. Because each user runs a separate copy of the shell, each user is running in his or her own process space.

Many basic shell commands are subroutines that are built in to the shell program. The echo command is almost always built in to a shell.

$ echo "Hello, Hal"
Hello, Hal
$

Commands not built in to the shell require that the kernel start another process in order to run.

When you execute a command that is not built in to a shell, the shell asks the kernel to create a new subprocess (or child process) to perform the command. The child process exists just long enough to execute the command. The shell waits for the child process to finish before accepting the next command.
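
You can usually ask the shell how it would run a given name, and you can see that a child really is a separate process by comparing process IDs. The following is a sketch, assuming a Bourne-style shell with sh on your PATH. The wording that type prints differs from shell to shell, and the variable names parent and child are chosen only for this example.

```shell
# type reports whether a name is built in or a separate program.
# The exact wording of its output depends on your shell.
type echo
type ls

# $$ holds the process ID of the shell that expands it.
parent=$$

# sh -c starts a child shell; inside the single quotes, $$ is
# expanded by that child, so it reports the child's own ID.
child=$(sh -c 'echo $$')

echo "this shell is process $parent; the child shell was process $child"
```

The two numbers differ because the kernel created a new process to run the child shell.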

The basic form of a Unix command is:

command name [-options] [arguments]

The square brackets signify parts of the command that may be omitted.

The command name is the name of a built-in command or a separate program you want the shell to execute. The command options, usually indicated by a dash, allow you to alter the behavior of the command. The arguments are the names of files, directories, or programs that the command needs to access.

ls -l /home/mjb

The ls command is usually a separate program rather than a built-in command. The command above will get you a long listing of the contents of the /home/mjb directory. In this example, ls is the command name, -l is an option that tells ls to create a long, detailed output, and /home/mjb is an argument naming the directory that ls is to list.
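
To see how a command line breaks down into these parts, here is a small sketch. The function show_words is hypothetical and uses a simplified rule (anything starting with a dash is treated as an option). Individual commands are free to interpret their own words differently.

```shell
# show_words: label each word of a command line (illustrative only).
show_words() {
    # The first word is the command name.
    echo "command name: $1"
    shift
    # Every remaining word is either an option or an argument.
    for word in "$@"; do
        case $word in
            -*) echo "option: $word" ;;
            *)  echo "argument: $word" ;;
        esac
    done
}

show_words ls -l /home/mjb
```

This prints one line for each of the three words: ls as the command name, -l as an option, and /home/mjb as an argument.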

The Unix shell is case sensitive, and most Unix commands are lower case.

Some of the more popular shells are sh (the Bourne shell), ksh (the Korn shell), csh (the C shell), bash (the Bourne Again shell), pdksh (the Public Domain Korn shell), and tcsh (the TENEX C shell).

You can frequently identify your shell by typing:

echo $SHELL
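
The reason this works only "frequently" is that $SHELL normally names your login shell, which is not necessarily the shell you are typing into right now. As a rough cross-check, assuming your system has a ps that accepts the POSIX -p and -o options, you can ask for the name of the current process. The variable name current is just for this example.

```shell
# $SHELL is usually set at login to the user's login shell.
echo "login shell from \$SHELL: ${SHELL:-unset}"

# Ask ps for the command name of this very process ($$).
# If ps is missing or rejects these options, fall back to "unknown".
current=$(ps -p $$ -o comm= 2>/dev/null) || current=""
current=${current:-unknown}

echo "shell running these commands: $current"
```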

Unix recognizes certain special characters as command directives. If you use a special character in a command, make sure you understand what it does. The special characters are / < > ! $ % ^ & * | { } ~ and ;. When naming files and directories on Unix, it is safest to only use numerals, upper and lower case letters, and the period, dash, and underscore characters.
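
The naming advice above can be turned into a quick check. This is a sketch only: is_safe_name is a hypothetical helper, and it assumes the usual behavior of bracket patterns in a Bourne-style shell.

```shell
# is_safe_name: succeed only if the name is non-empty and made up
# entirely of letters, numerals, period, dash, and underscore.
is_safe_name() {
    case $1 in
        # Empty, or contains any character outside the safe set.
        ''|*[!A-Za-z0-9._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Single quotes keep the shell from acting on the special characters.
for name in report_1.txt 'my file*' 'cost$5'; do
    if is_safe_name "$name"; then
        echo "safe:  $name"
    else
        echo "risky: $name"
    fi
done
```

Only report_1.txt passes. The other two names contain a space, an asterisk, or a dollar sign, each of which the shell would otherwise treat specially.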

A Unix command line is a sequence of characters in the syntax of the target shell language. Of the characters in a command line, some are known as metacharacters. Metacharacters have a special meaning to the shell. The metacharacters in the Korn shell are:

; -- Separates multiple commands on a command line

& -- Causes the preceding command to execute asynchronously (as its own separate process so that the next one does not wait for it to complete)

() -- Enclose commands that are to be launched in a separate shell

| -- Pipes the output of the command to the left of the pipe to the input of the command on the right of the pipe

> -- Redirects output to a file or device

>> -- Redirects output to a file or device and appends to it instead of overwriting it

< -- Redirects input from a file or device

newline -- Ends a command or set of commands

space -- Separates command words

tab -- Separates command words

Some metacharacters can be used in combinations, such as ||, &&, and >>. With these metacharacters you can define a command-line word, which is a sequence of characters separated by one or more nonquoted metacharacters.
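
Here is a short sketch that exercises most of the metacharacters listed above. It assumes the common mktemp, wc, tr, sort, and head utilities are installed, and all variable names are chosen only for this example.

```shell
tmp=$(mktemp)                  # scratch file with a generated name

echo banana > "$tmp"           # >  creates or overwrites the file
echo apple >> "$tmp"           # >> appends a second line

# <  feeds the file to wc; |  passes wc's output on to tr,
# which strips any padding spaces around the count.
count=$(wc -l < "$tmp" | tr -d ' ')

# |  again: sort the file, then keep only the first line.
first=$(sort "$tmp" | head -n 1)

# ;  separates commands; ( ) runs cd in a separate shell,
# so the current directory of this shell does not change.
before=$(pwd); (cd / && pwd) > /dev/null; after=$(pwd)

# && runs the next command only on success, || only on failure.
true && ok="yes"
false || fallback="yes"

sleep 1 &                      # &  starts sleep without waiting for it
wait                           # now wait for the background process

echo "lines=$count first=$first"
rm -f "$tmp"
```

The final echo reports two lines in the file, with apple sorting ahead of banana.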

To access the online manuals, use the man command, followed by the name of the command you need help with. For instance, to see the manual for the ls command, enter:

man ls

End of article.

A note to my readers

I would like to note a correction to the May edition of Unix 101, in which I said:

"Once a shell variable has been exported and becomes an environment variable, it can be modified by a subshell. The modification affects the environment variable at all levels where the environment variable has scope."

Several sharp-eyed readers picked up on this and sent comments ranging from, "Oh, no, you can't" to "Gee, whiz, which shell are you using? It doesn't work for me."

They are right. A subshell cannot modify an environment variable and return it to the parent. It can modify an environment variable and pass it on to a child process, but it cannot return the new value to a higher level. To illustrate this correctly, create the following three script files and grant them execute privileges using chmod a+x script*.

# script1
myvar="Hello" ; export myvar
echo "script1:myvar=" $myvar
./script2
echo "Back from script 1 and 2"
echo "script1:myvar=" $myvar

# script2
myvar="Goodbye"
echo "script2:myvar=" $myvar
./script3

# script3
echo "script3:myvar=" $myvar

If you run this sequence, the results show that $myvar exists in all three scripts (and, consequently, in all three processes), but modifying it in script2 changes its value only in script2 and in its child, script3. The value in script1 is untouched.

$ ./script1
script1:myvar= Hello
script2:myvar= Goodbye
script3:myvar= Goodbye
Back from script 1 and 2
script1:myvar= Hello
$
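
If you would rather not create three files, the same behavior can be seen in a single script. This is a sketch, assuming sh is on your PATH. It uses sh -c to stand in for script2 and script3, and the variable name grandchild_view is made up for the example.

```shell
# Plays the part of script1: set and export the variable.
myvar="Hello"; export myvar

# The outer sh -c plays script2: it changes myvar, then starts
# another sh -c (playing script3) that reports what it inherited.
grandchild_view=$(sh -c 'myvar="Goodbye"; sh -c "echo \$myvar"')

echo "innermost shell saw: $grandchild_view"
echo "original shell has:  $myvar"
```

The innermost shell sees Goodbye, while the original shell still has Hello. The change travels down to child processes but never back up to the parent.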

My apologies to those of you who tried to make the example in the May issue work.
