The shell's parser performs several operations on your commands before finally executing them.
Understanding how your original command will be transformed by the shell is of paramount importance in writing robust scripts.
NOTE: Shell commands execute some program with a specific set of arguments (as well as setting up environment variables, opening file descriptors, etc.).
Word splitting is performed on the results of almost all unquoted expansions.
From the bash man page:
The order of expansions is: brace expansion, tilde expansion, parameter, variable and arithmetic expansion and command substitution (done in a left-to-right fashion), word splitting, and pathname expansion.
For additional information on word splitting and argument handling in Bash, consider reading Arguments.
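One consequence of this ordering can be observed directly: tilde expansion runs before variable expansion, so a tilde that arrives through a variable is never expanded. The following is a minimal sketch, assuming a POSIX shell; the function name show_tilde is made up for this illustration.

```shell
# show_tilde is a hypothetical helper used only for this illustration.
show_tilde() {
    dir='~'        # the quotes keep the tilde literal at assignment time
    echo $dir      # tilde expansion has already run by the time $dir is expanded
}

show_tilde         # prints: ~
```

Word splitting and pathname expansion, by contrast, come after variable expansion, which is why they do act on the contents of unquoted variables.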
Let's write a little helper script, test.sh, that will show us the arguments as passed by the shell:
#!/bin/sh -
printf "%d args:" "$#"
printf " <%s>" "$@"
echo
chmod +x test.sh
./test.sh hello world "how are you?"
returns:
3 args: <hello> <world> <how are you?>
NOTE: The helper program above receives the argument list as constructed by the shell, and shows it to us.
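Word splitting breaks the result of an expansion into separate words at each character found in the IFS variable. If IFS is not set, splitting is performed as if IFS contained a space, a tab, and a newline.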
For example:
var="This is a variable"; ./test.sh $var
returns:
4 args: <This> <is> <a> <variable>
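The default behaviour can also be checked without the helper script. This is a minimal sketch, assuming a POSIX shell; count_words is a hypothetical function introduced here, and the value deliberately mixes a space, a tab, and a newline as separators.

```shell
# count_words is a hypothetical helper: it reports how many arguments it received.
count_words() {
    echo "$#"
}

# A value whose words are separated by a space, a tab, and a newline.
var=$(printf 'one two\tthree\nfour')

unset IFS            # an unset IFS behaves like space + tab + newline
count_words $var     # unquoted: split into 4 words, prints 4
count_words "$var"   # quoted: passed as a single word, prints 1
```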
An example using IFS:
log=/var/log/qmail/current; IFS=/; ./test.sh $log
returns:
5 args: <> <var> <log> <qmail> <current>
Unset IFS
unset IFS
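Forgetting to restore IFS after an example like this one is an easy mistake. One way to avoid it, sketched here under the assumption of a POSIX shell, is to confine the change to a subshell. The function split_on_slash is hypothetical and exists only for this illustration.

```shell
# split_on_slash is a hypothetical helper. Its body is wrapped in ( ... ),
# so it runs in a subshell and the IFS change never reaches the caller.
split_on_slash() (
    IFS=/
    set -- $1                 # unquoted on purpose: split $1 on "/"
    printf '%d args:' "$#"
    printf ' <%s>' "$@"
    echo
)

split_on_slash /var/log/qmail/current
# prints: 5 args: <> <var> <log> <qmail> <current>
```

Because the function body is a subshell, no unset IFS is needed afterwards.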
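Word splitting also applies to the output of command substitution. Suppose the current directory holds a single file:
ls -l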
returns:
total 2864
-rw-r--r-- 1 greg greg 2919154 2001-05-23 00:48 Yello - Oh Yeah.mp3
Now run against the test script:
./test.sh $(ls -l)
returns:
13 args: <total> <2864> <-rw-r--r--> <1> <greg> <greg> <2919154> <2001-05-23> <00:48> <Yello> <-> <Oh> <Yeah.mp3>
As you can see above, we usually do not want to let word splitting occur when filenames are involved.
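A common way to sidestep the problem is to let the shell generate the filenames with a glob rather than splitting the output of ls. The sketch below assumes a POSIX shell with mktemp available; list_mp3 is a hypothetical function, and the scratch directory exists only so the example is self-contained.

```shell
# Work in a scratch directory so the example does not touch real files.
dir=$(mktemp -d) || exit 1
: > "$dir/Yello - Oh Yeah.mp3"

# list_mp3 is a hypothetical helper: it prints each .mp3 name in a directory.
list_mp3() {
    for f in "$1"/*.mp3; do
        [ -e "$f" ] || continue      # glob matched nothing: skip the literal pattern
        printf '<%s>\n' "${f##*/}"   # strip the directory part; the name stays one word
    done
}

list_mp3 "$dir"                      # prints: <Yello - Oh Yeah.mp3>
rm -rf "$dir"
```

The results of pathname expansion are not split again, so each filename reaches the loop body as a single word regardless of the spaces it contains.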
Double quoting an expansion suppresses word splitting, except in the special cases of "$@" and "${array[@]}":
var="This is a variable"; ./test.sh "$var"
returns:
1 args: <This is a variable>
array=(testing, testing, "1 2 3"); ./test.sh "${array[@]}"
returns:
3 args: <testing,> <testing,> <1 2 3>
NOTE: "$@" causes each positional parameter to be expanded to a separate word; its array equivalent likewise causes each element of the array to be expanded to a separate word.
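The difference matters most in wrapper functions that pass their arguments along. Here is a minimal sketch, assuming a POSIX shell with the default IFS; all three function names are hypothetical.

```shell
# count_args is a hypothetical helper: it reports how many arguments it received.
count_args() { echo "$#"; }

# Two hypothetical wrappers that forward their arguments to count_args.
forward_quoted()   { count_args "$@"; }   # each parameter stays one word
forward_unquoted() { count_args $@; }     # each parameter is split again

forward_quoted   "how are you" hello      # prints 2
forward_unquoted "how are you" hello      # prints 4
```

Only the quoted form preserves the argument list exactly as the caller supplied it.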
There are very complicated rules involving whitespace characters in IFS; they are spelled out in the word splitting section of the bash man page.
We won't explore those rules in depth here, except to note the part about sequences of non-whitespace characters.
If IFS contains non-whitespace characters, then empty words can be generated:
getent passwd sshd
returns:
sshd:x:100:65534::/var/run/sshd:/usr/sbin/nologin
Set IFS
IFS=:; ./test.sh $(getent passwd sshd)
returns:
7 args: <sshd> <x> <100> <65534> <> </var/run/sshd> </usr/sbin/nologin>
Unset IFS
unset IFS
NOTE: There was another empty word generated in one of our previous examples, where IFS was set to /.
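Whitespace IFS characters, by contrast, get consolidated: a run of consecutive whitespace characters acts as a single delimiter and produces no empty words.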
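The two behaviours can be compared side by side. This is a minimal sketch, assuming a POSIX shell; count_fields is a hypothetical helper whose subshell body keeps the IFS change local.

```shell
# count_fields VALUE SEPARATORS: hypothetical helper that splits VALUE
# using SEPARATORS as IFS and prints the number of resulting words.
count_fields() (
    IFS=$2
    set -- $1           # unquoted on purpose: split VALUE on the given separators
    echo "$#"
)

count_fields 'a   b' ' '    # prints 2: the run of spaces is one delimiter
count_fields 'a:::b' ':'    # prints 4: a, two empty words, b
```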
Pathname expansion happens after word splitting, and can produce some very shocking results:
getent passwd qmaild
returns:
qmaild:*:994:998::/var/qmail:/sbin/nologin
Set IFS
IFS=:; ./test.sh $(getent passwd qmaild)
returns:
737 args: <qmaild> <00INDEX.lsof> <03> <037_ftpd.patch> ...
Unset IFS
unset IFS
The * word, produced by the shell's word splitting, was then expanded as a glob, resulting in several hundred new and exciting words.
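The same thing happens to glob patterns stored in an unquoted variable:
files='*.mp3 *.ogg'; ./test.sh $files
returns:
2 args: <Yello - Oh Yeah.mp3> <*.ogg>
Here *.mp3 matched the one file, whose name arrived as a single word because the results of pathname expansion are not split again, while *.ogg matched nothing and was passed through unchanged.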
NOTE: Pathname expansion can be disabled with set -f or set -o noglob, though this can lead to surprising and confusing code.
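One way to limit the surprise is to confine set -f to a subshell, so globbing is only disabled for the one split that needs it. The sketch below assumes a POSIX shell; split_passwd_line is a hypothetical function written for this illustration, fed the qmaild line from the example above.

```shell
# split_passwd_line is a hypothetical helper. The ( ... ) body is a subshell,
# so neither set -f nor the IFS change leaks into the calling shell.
split_passwd_line() (
    set -f                    # disable pathname expansion
    IFS=:
    set -- $1                 # split on ":" only; "*" stays literal
    printf '%d args:' "$#"
    printf ' <%s>' "$@"
    echo
)

split_passwd_line 'qmaild:*:994:998::/var/qmail:/sbin/nologin'
# prints: 7 args: <qmaild> <*> <994> <998> <> </var/qmail> </sbin/nologin>
```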