Bash shell scripting

In the first part of this series on shell scripting we covered the basics: what shell scripting is and how we can use it, like other programming languages, to automate our work. Now we are going to cover some more advanced techniques, such as arrays, functions and networking, which make shell scripts much more than just a bunch of commands.

A second look at shell scripting part-1

In the last article we saw the basics of Bash variables: how to create them, assign values to them and read those values back. Bash also supports a number of operations on these variables. Here are some of them, all applied to the following string.

var="I am test string. You can run many operations on me"

String Length

$ echo ${#var}

Output: 51

Substring Operations

From given index

$ echo ${var:4}

Output: test string. You can run many operations on me

From given index of given length

$ echo ${var:4:5}

Output: test

Remove Suffix Pattern

$ echo ${var%%.*}

Output: I am test string

Remove Prefix Pattern

$ echo ${var##*.}

Output: You can run many operations on me

Substitution

$ echo ${var/on me/on this string}

Output: I am test string. You can run many operations on this string

Case Modification

These operators need Bash 4 or later. A pattern can follow the operator (^^, ,, or ~~) to restrict the change to the characters that match it.

To Uppercase

$ echo ${var^^}

Output: I AM TEST STRING. YOU CAN RUN MANY OPERATIONS ON ME

To Lowercase

$ echo ${var,,}

Output: i am test string. you can run many operations on me

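Toggle Case

$ echo ${var~}

Output: i am test string. You can run many operations on me

A single ~ reverses the case of the first character only; ${var~~} reverses the case of every character. In the same way, ${var^} and ${var,} change only the first character.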
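A common use of the prefix and suffix operators is taking a file path apart. The following is a minimal sketch; the path is made up for illustration, and it also uses the single # and % forms, which remove the shortest match instead of the longest.

```shell
#!/bin/bash
# Hypothetical path, used only for illustration.
path="/var/log/app/server.log.gz"

file=${path##*/}    # longest prefix ending in "/" removed
dir=${path%/*}      # shortest suffix starting with "/" removed
base=${file%%.*}    # longest suffix starting with "." removed
ext=${file#*.}      # shortest prefix ending in "." removed

echo "dir=$dir file=$file base=$base ext=$ext"
```

Output: dir=/var/log/app file=server.log.gz base=server ext=log.gz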

Parsing Command Line Options

We have seen how to pass command line arguments to our scripts. Almost all Linux tools also accept POSIX-style command line options: the character '-' followed by one or more option letters, which can be clubbed together or passed separately.

e.g.

$ ls -la /home
$ ls -l -a /home

Both commands produce the same result: they tell ls to use the long listing format and to include all files, hidden ones too, in the output.

Bash provides the built-in command getopts to parse these POSIX-style options.

It is used as follows.

getopts OPTSTRING NAME [ARGS...]

OPTSTRING : The list of all options your script supports. An option that requires a value is followed by a colon, e.g. "vhf:r". Here getopts will parse the options -v, -h and -r, and the option -f will need a value.

NAME : The variable in which getopts stores the option it found.

ARGS : The arguments to parse. If not given, the positional parameters ("$@") are parsed.

Here is the list of variables used by getopts while parsing.

OPTIND : The index of the next argument to be processed. We can set it to 1 to start parsing again.

OPTARG : The value of the option, e.g. for -f file.txt, file.txt is saved in OPTARG.

OPTERR : It can have two values. If it is 0, error messages are not displayed. If it is 1, error messages are shown.

Error Reporting

If the first character of OPTSTRING is a colon, getopts uses silent error reporting; otherwise it uses verbose error reporting.

Silent Mode : No error messages are printed. If an invalid option is seen, getopts places '?' into NAME and the option character found into OPTARG. If a required argument is not found, getopts places ':' into NAME and sets OPTARG to the option character found.

Verbose Mode : If an invalid option is seen, getopts places '?' into NAME and unsets OPTARG. If a required argument is not found, '?' is placed in NAME, OPTARG is unset, and a diagnostic message is printed.

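#!/bin/bash
# cmd.sh -v -r -f <text>
# The leading colon in ":vf:r" selects silent error reporting
while getopts ":vf:r" opt;do
    case $opt in
        v)
            echo "Option v found"
        ;;
        f)
            echo "Option f found with value $OPTARG"
        ;;
        r)
            echo "Option r found"
        ;;
        :)
            # A required argument is missing; OPTARG holds the option letter
            echo "Option -$OPTARG requires an argument." >&2
        ;;
        \?)
            # Unknown option; OPTARG holds the option letter
            echo "Invalid option: -$OPTARG" >&2
        ;;
    esac
done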
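After the options are parsed, a script usually wants the remaining, non-option arguments. A minimal sketch of that step follows; the function name parse_args and the sample file name are hypothetical. It relies on OPTIND pointing at the first unparsed argument once getopts has finished.

```shell
#!/bin/bash
# Hypothetical helper: parse -v, -r and -f <file>, then report what is left.
parse_args() {
    # A local OPTIND set to 1 lets the function be called more than once.
    local opt verbose=0 file="" OPTIND=1
    while getopts ":vf:r" opt; do
        case $opt in
            v) verbose=1 ;;
            f) file=$OPTARG ;;
            r) ;;  # accepted but unused in this sketch
            :) echo "Option -$OPTARG requires an argument." >&2; return 1 ;;
            \?) echo "Invalid option: -$OPTARG" >&2; return 1 ;;
        esac
    done
    # Drop the parsed options; "$@" now holds only the remaining arguments.
    shift $((OPTIND - 1))
    echo "verbose=$verbose file=$file rest=$*"
}

parse_args -v -f notes.txt one two
```

Output: verbose=1 file=notes.txt rest=one two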

getopts has one limitation: it cannot handle long option names such as --option. To handle long option names there is another tool, getopt (note the missing 's'), which is an external program.

Arrays

Arrays are variables that hold multiple values. Bash supports one-dimensional arrays whose elements may be of the same or of different types. There is no limit on the length of an array, and the indexes do not need to be contiguous: there can be missing indexes, and the first index does not need to be 0.

Creating An Array

There is more than one way to create an array in Bash.

Creating Array beforehand

We can create an array using (). The elements are separated by spaces, not commas; to store a value that contains spaces, enclose it in quotes.

month_names=("None" "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec")

The first element is placed at index 0 and the rest follow, so there are no holes in the array. Notice that we put None at position 0 so that each month sits at the index equal to its number.

Creating array on the run

You can directly assign a value to any index in the array. This way we can store a value at any index and skip any index we want. There must be no space on either side of the = sign.

month_names[1]="Jan"
month_names[2]="Feb"
month_names[3]="Mar"; month_names[4]="Apr"; month_names[5]="May"; month_names[6]="Jun"
month_names[7]="Jul"; month_names[8]="Aug"; month_names[9]="Sep"; month_names[10]="Oct"
month_names[11]="Nov"; month_names[12]="Dec"

Notice that we skipped index 0.

Declaring Array

We can also declare an empty array using the keyword declare.

declare -a months_array

This creates an empty array with no values in it.

Accessing Array Elements

To access array use ${array[index]}

Print Whole array

$ echo ${month_names[@]}

Print one value

$ echo ${month_names[1]}

Length of Array

$ echo ${#month_names[@]}

Traversing an array

Traverse each value

for i in ${month_names[@]};do
    echo $i 
done

With this loop, a value that contains a space is split into separate words. Quoting the expansion, as in "${month_names[@]}", keeps each element intact.

Traverse each index

for ((i=0;i<${#month_names[@]};i++));do
    echo ${month_names[$i]}
done
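The index loop above assumes that the indexes run from 0 without gaps. For an array with missing indexes, a safer sketch is to iterate over "${!array[@]}", which expands to the indexes that are actually set. The array names and its values below are made up for illustration.

```shell
#!/bin/bash
# A sparse array: index 0 is unused and one value contains a space.
names=()
names[1]="Jan"
names[2]="Late Feb"
names[5]="May"

# Quoted expansion: one loop iteration per element, spaces preserved.
for value in "${names[@]}"; do
    echo "value: $value"
done

# "${!names[@]}" lists only the indexes that exist (1, 2 and 5 here).
for idx in "${!names[@]}"; do
    echo "$idx => ${names[$idx]}"
done
```

The second loop prints 1 => Jan, 2 => Late Feb and 5 => May, whereas a counter running from 0 to ${#names[@]}-1 would visit indexes 0, 1 and 2 only.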

Functions

In higher-level programming languages, functions or subroutines are named sets of statements; the name can be used in the code in place of the statements themselves. They help reduce redundancy and make the code maintainable. In shell scripting we can think of functions as small shell scripts that run in the same shell. They can access environment variables and create new variables; overall they behave like the shell's internal commands.

Create Functions

Because scripts are interpreted rather than compiled, a function must be defined before it is used. We can use functions in a script as well as on the command line. In Bash we can create a function with the keyword function.

function print() { echo -ne "$*"; }

This function prints its arguments without appending a line break (the -n option) and interprets backslash escapes (the -e option). You can see commonly used functions defined in .bashrc or .bash_functions.

Using Functions

Once we have created these functions in our script we can use them like any other command.

$ print 1 2 3 4
$ print "Test"

Command Line Arguments

Functions would not be very useful if we could not pass arguments to them. As with other commands, there is no limit on the number of arguments passed to a function; we can pass as many as we like. Inside the function we handle them the same way as command line arguments in a script, using $1, $2, $3 and so on, with one difference: $0 remains the command you used to call your script, not the name of your function.

Returning From Function

Functions work on the given input and return an output. There are two ways in which a Bash function can return a value to its caller.

Return Statement

Like the return statement of other programming languages, return hands a value back to the caller. It sets the exit status of the function, so the value is limited to an integer from 0 to 255. The caller reads it from the special variable $?.

Echo or Printf

The other way is the one scripts themselves use: print the value with echo or printf and capture it in the caller with the $() construct or back-ticks.
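The two methods can be sketched side by side. The function names is_even and square are hypothetical; the first reports through its exit status, the second prints its result for the caller to capture.

```shell
#!/bin/bash
# Returns status 0 (success) when the argument is even, 1 otherwise.
is_even() {
    if (( $1 % 2 == 0 )); then
        return 0
    fi
    return 1
}

# Prints the result; the caller captures it with $().
square() {
    echo $(( $1 * $1 ))
}

if is_even 4; then
    echo "4 is even"
fi

result=$(square 7)
echo "7 squared is $result"
```

The exit status suits yes/no answers and error codes, while printing suits strings and numbers outside the 0 to 255 range.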

Stack Trace

If $0 is the name of the script, how do we get the name of the function we are in? The built-in Bash array FUNCNAME holds the stack of functions called. The value at index 0 is the name of the current function. We can even get the depth of the stack we are in by using ${#FUNCNAME[@]}.
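A small sketch of FUNCNAME in use, with two hypothetical functions named inner and outer. The depth that is printed depends on how the code is run (as a script, sourced, or typed at the prompt), so no fixed number is shown here.

```shell
#!/bin/bash
inner() {
    # Index 0 is the current function, index 1 is whoever called it.
    echo "current: ${FUNCNAME[0]}, caller: ${FUNCNAME[1]}, depth: ${#FUNCNAME[@]}"
}

outer() {
    inner
}

outer
```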

Let's implement a stack using functions and an array; we can add these functions to .bashrc.

# Bash Stack Implementation using functions and array

declare -a __STACK__

__STACK_TOP__=0

function push()
{
    __STACK__[$__STACK_TOP__]="$*"
    __STACK_TOP__=$((__STACK_TOP__+1))
    return 0
}

function pop()
{
    if [ $__STACK_TOP__ -gt 0  ];then
        __STACK_TOP__=$((__STACK_TOP__-1))
        echo "${__STACK__[$__STACK_TOP__]}"
        unset __STACK__[$__STACK_TOP__]
        return 0
    fi
    return 1
}

function peek()
{
    if [ $__STACK_TOP__ -gt 0  ];then
        echo "${__STACK__[$((__STACK_TOP__-1))]}"
        return 0
    fi
    return 1
}

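We can put these functions in our .bashrc file and then use them as regular commands. Apart from performing the push, pop and peek operations, these functions also return a success or failure status, which can be read from $?. Call pop directly rather than inside $(): a command substitution runs in a subshell, so the stack in the calling shell would be left unchanged.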

Local Variables

Unlike in most other programming languages, a variable created inside a function can be accessed outside of it. By default all variables are available to the whole script. If we need a variable that is not accessible outside of the function, we can use the local keyword.

 local var

This statement inside a function creates a function-local variable.
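The difference can be seen in a short sketch; the variable and function names are made up for illustration.

```shell
#!/bin/bash
counter=1

# No "local": the assignment changes the script-wide variable.
bump_global() {
    counter=$((counter + 1))
}

# "local" creates a separate variable that disappears when the function returns.
bump_local() {
    local counter=100
    counter=$((counter + 1))
}

bump_global
bump_local
echo "counter is $counter"
```

Output: counter is 2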

Exceptions or Traps

Most of the time our scripts work with open files that need to be flushed before the script exits, or with temporary files that need to be deleted; in short, scripts sometimes need to do some cleanup work before they exit. To handle these cases, the trap command installs a signal handler that is invoked when a particular signal is received.

$ trap command list of signals

command can be a function, a script or a simple command.

#!/bin/bash
#trap example

function runme
{
    echo "Signal received"
    exit 0
}

trap runme SIGINT SIGTERM

while true; do
    sleep 1
done
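For the temporary-file case mentioned above, a minimal sketch can use EXIT, a pseudo-signal that Bash accepts in trap and that fires whenever the shell exits. This goes beyond the SIGINT and SIGTERM example; the function name work_with_temp is hypothetical, and its body is written in parentheses so that it runs in a subshell with its own EXIT trap.

```shell
#!/bin/bash
# The ( ... ) body runs in a subshell; the EXIT trap fires when it ends.
work_with_temp() (
    tmp_file=$(mktemp) || exit 1
    trap 'rm -f "$tmp_file"' EXIT
    echo "scratch data" > "$tmp_file"
    # Report the path so the caller can check that the file is gone.
    echo "$tmp_file"
)

path=$(work_with_temp)
if [ ! -e "$path" ]; then
    echo "temporary file was removed on exit"
fi
```

In a standalone script the same trap line can be placed at the top level, where it runs when the script itself exits.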

Input Output

We have used read, echo and printf for input and output in our Bash scripts. We have also used redirection to change the default input and output streams. Let's see some more advanced input/output methods provided by Bash. Bash lets us open new file descriptors, redirect one into another, and read from or write to a file through a given file descriptor. By default three file descriptors are open for every program in Linux.

File descriptor 0 is stdin or standard input
File descriptor 1 is stdout which is standard output
File descriptor 2 is stderr which is standard error device

By default all three are connected to the terminal for terminal programs. When we use redirection, the shell opens the file and redirects the input or output by duplicating the file descriptors. In the same way we can open other files and manually redirect output, input and/or errors to them.

Open and Close File descriptors

The exec command is used to open and close file descriptors.

$ exec 5<>file.txt

This command opens file.txt for reading and writing and assigns it file descriptor 5. There must be no space between the descriptor number and <>.

$ exec 5>&-

This command will close the file descriptor.

Redirection

To handle redirection we can use the following formats.

m>&n : Redirect the output of file descriptor m (m defaults to 1) to file descriptor n.

m<&n : Redirect the input of file descriptor m (m defaults to 0) from file descriptor n.

Redirecting Standard Output to file descriptor

$ date >&5

This command will redirect the output of date command to file descriptor 5.

Redirecting Standard Input from file descriptor

$ read line <&5

This command reads a line from file descriptor 5.

Redirecting Standard Error
As standard error is just an output stream with file descriptor 2, we can easily redirect it to some other file descriptor.

$ ls file.txt 2>&5

This command will redirect the standard error of ls to file descriptor 5.
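Putting these pieces together, the following sketch writes two lines through one descriptor and reads them back through another. The descriptor numbers 5 and 6 and the use of mktemp for a scratch file are choices made for this example.

```shell
#!/bin/bash
log_file=$(mktemp)

# Open descriptor 5 for writing, send two lines to it, then close it.
exec 5>"$log_file"
echo "first line" >&5
echo "second line" >&5
exec 5>&-

# Open descriptor 6 for reading; each read continues where the last one stopped.
exec 6<"$log_file"
read -r first <&6
read -r second <&6
exec 6<&-

echo "$first / $second"
rm -f "$log_file"
```

Output: first line / second line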

Read Command Revisited

We have used the read command to take input from the terminal or to read a file using redirection. Let's see what else can be done with this command.

Read from file descriptor

The read command can read from a given file descriptor. Options must come before the variable names.

$ read -u 5 line

Reading Multiple Values

We can use read command to read multiple variables at once

$ read a b c

Read Array From stdin

Read command can also be used to read an array from stdin and assign the values to successive indices in an array

$ read -a array
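A short sketch of both forms, fed from here-strings so that it runs without typing. The sample record and the choice of ':' as separator are made up for illustration; read splits its input on the characters in IFS.

```shell
#!/bin/bash
# Split one colon-separated record into three variables.
IFS=: read -r user home shell <<< "alice:/home/alice:/bin/bash"

# Split a line on whitespace into successive array indexes.
read -r -a words <<< "one two three"

echo "$user $shell ${#words[@]} ${words[2]}"
```

Output: alice /bin/bash 3 three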

Inter Process Communication(IPC) with Named Pipe or FIFO

IPC, or inter process communication, is essential when we want to distribute jobs among several processes. We have already seen the pipe, which is used to communicate between a parent and a child process. If the processes are not related we can still communicate by using a named pipe, or fifo. Pipes and fifos are nothing but redirection of data streams.
We can create a named pipe or fifo using the command mknod or mkfifo.

$ mkfifo /tmp/test

This command will create a fifo file in /tmp directory.

We can see the file by using the command

$ ls -la /tmp/test
prw-r--r-- 1 sherill hadmin 0 Dec 10 16:24 /tmp/test|

The initial p character shows that it is a fifo file.

Now open two terminals. In the first one type

$ read line < /tmp/test

This read will block until something is written to the fifo.

In the other terminal enter the command

$ echo "Fifo Example" > /tmp/test

This command writes the text to the fifo, and the read command in the first terminal completes. Now if you type echo $line in the first terminal you will see that the data has been read successfully.

If you run the same read command again it will wait for data, as the fifo is empty after the last read.

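We can use a timeout with read to avoid waiting forever for input. Note that a timeout of 0 only checks whether input is available and reads nothing, so we use a timeout of one second here.

For example

exec 7<> /tmp/test
while true;do
    # Wait up to one second for a line on file descriptor 7
    if read -t 1 -u 7 line;then
        echo "$line is read"
        # Do something
    else
        echo "No Data."
        # Do something else
    fi
done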
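The two-terminal experiment can also be reproduced in a single script by putting the writer in the background. This is only a sketch: the fifo is created inside a scratch directory from mktemp -d rather than at /tmp/test, and the five-second timeout is an arbitrary choice.

```shell
#!/bin/bash
work_dir=$(mktemp -d)
fifo="$work_dir/test"
mkfifo "$fifo"

# Writer: runs in the background and blocks until a reader opens the fifo.
( echo "Fifo Example" > "$fifo" ) &

# Reader: gives up if no complete line arrives within 5 seconds.
if read -r -t 5 line < "$fifo"; then
    echo "received: $line"
else
    echo "no data"
fi

wait
rm -rf "$work_dir"
```

Output: received: Fifo Example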

Use of fifo

We have seen how a pipe redirects the output of one program to the input of another. But this works only in one direction. If we also need the output of the second program to be fed back to the first, we can do this using a fifo.
For example, here we want to send the output of netcat to a shell and then send the output of this shell back to netcat.
So we create a fifo and redirect the output of the netcat command to it. Then we read this fifo using cat and pipe it to the shell, which processes the data, and the shell's output is piped to netcat again.

$ mkfifo /tmp/myfifo
$ cat /tmp/myfifo | /bin/sh  | nc -l 1356 >  /tmp/myfifo

This simple command produces a netcat server running a shell, which can be accessed using the command

$ netcat localhost 1356

If you want to access the shell from a remote machine, just change localhost to the IP address of the server and you get yourself a remote shell. The shell asks for no password, so try this only on a trusted network.

Indirect Reference

We can easily reference a variable's value with the $variable syntax, but sometimes we need the value of a value, i.e. an indirect reference. Bash supports indirect references with the help of the eval command and the \$$variable syntax, e.g.

x=10
y=x

echo $(eval echo \$$y)
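Bash also has a built-in form of indirect expansion, ${!name}, which reads the variable whose name is stored in name without calling eval. The sketch below reuses x and y from the example above and adds printf -v, which assigns to a variable chosen at run time.

```shell
#!/bin/bash
x=10
y=x

# ${!y} expands to the value of the variable named by $y, i.e. $x.
echo "${!y}"

# printf -v stores its output in the variable whose name is given, here x.
printf -v "$y" '%s' 20
echo "$x"
```

This prints 10 and then 20. Because no eval is involved, the stored name is never executed as code.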

We can use indirect references to create a lookup table.

Let's create a program which prints the frequency of each word.

#!/bin/bash
# word_freq <file>
function usage
{
    echo "usage: $0 <file>"
}

# Check that an argument is given and that it is a file
if [ -z "$1" ] || [ ! -f "$1" ];then
    usage
    exit 1
fi
# Declare array
declare -a map
# Open file for reading
exec 7< "$1"
index=0
while read -r -a array -u 7 ;do
    count=${#array[@]}
    # Traverse each word
    for((i=0;i<count;i++));do
        elem=${array[$i]}
        # Each word becomes a variable that stores its frequency,
        # so it can not contain special characters or be a reserved word.
        # A word that matches a variable used in this script will not work either,
        # e.g. if the input contains "map" it will corrupt the output.
        # Add all checks here.

        # Check word is not empty
        if [ -n "$elem" ];then
            ecount=$(eval "echo \$$elem")
            if [ -z "$ecount" ];then
                # Add the word in the map
                map[$index]=$elem
                index=$((index+1))
            fi
            # Frequency of that word
            eval "$elem=$((ecount+1))"
        fi
    done
    # Unset array
    unset array
done
# Close file
exec 7<&-
echo "Total $index words Found"
count=$index
# Print frequency
for((i=0;i<count;i++));do
    echo -n "${map[$i]}:"
    elem=${map[$i]}
    # Indirect reference to word's frequency
    echo $(eval "echo \$$elem")
done

I ran this script with following input.

$ cat input.txt
This is an input to word frequency script
This script will occurrence the frequency of each word and report back
This script uses bash indirect references to create a Map and uses it to save occurrence

$ ./word_freq input.txt
Total 25 words Found
This:3
is:1
an:1
input:1
to:3
word:2
frequency:2
script:3
will:1
occurrence:2
the:1
of:1
each:1
and:2
report:1
back:1
uses:2
bash:1
indirect:1
references:1
create:1
a:1
Map:1
it:1
save:1

This script is just an example of indirect reference. It is very simple and will not handle every word: for example, if any of the variable names ecount, map, count etc. appear in the input it will not work correctly, and it is not able to handle special characters. With a little effort it can be made to handle all these cases, which can be a good learning exercise.
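One way to approach that exercise is to swap the eval-based table for an associative array, which Bash 4 and later provide through declare -A (or local -A inside a function). This is a different technique from the indirect references used above, sketched under that version assumption; the function name count_words is hypothetical and it reads from standard input instead of a file argument.

```shell
#!/bin/bash
# Requires Bash 4 or later for associative arrays.
count_words() {
    local -A freq=()    # word -> number of occurrences
    local -a order=()   # words in the order they were first seen
    local -a words
    local word
    while read -r -a words; do
        for word in "${words[@]}"; do
            # ${freq[$word]+set} is empty only if the key does not exist yet.
            if [[ -z ${freq[$word]+set} ]]; then
                order+=("$word")
            fi
            freq[$word]=$(( ${freq[$word]:-0} + 1 ))
        done
    done
    for word in "${order[@]}"; do
        echo "$word:${freq[$word]}"
    done
}

printf '%s\n' "to be or not" "to be" | count_words
```

This prints to:2, be:2, or:1 and not:1, one per line. Words are used as array keys rather than as variable names, so input such as map or count no longer clashes with the script's own variables.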

Networking

There are two ways to use networking in your scripts.

Bash internal socket handling

As we know, all devices in Linux are files; we can open them with the open system call and do basic input and output. Bash offers the same interface for sockets through the special file names /dev/tcp and /dev/udp, which Bash handles itself when they appear in a redirection.

From the bash manpage:

/dev/tcp/host/port
If host is a valid host name or Internet address, and port is an integer port number or service name, bash attempts to open a TCP connection to the corresponding socket.

/dev/udp/host/port
If host is a valid host name or Internet address, and port is an integer port number or service name, bash attempts to open a UDP connection to the corresponding socket.

For example, to get the home page of a site in your script without using any other tool, we can use the following commands.

$ exec 7<>/dev/tcp/mylinuxbook.com/80
$ echo -en "GET / HTTP/1.0\r\nHost: mylinuxbook.com\r\nConnection: close\r\n\r\n" >&7
$ cat <&7
$ exec 7>&-

In the first line we open a connection to the server mylinuxbook.com at port 80 on file descriptor 7, then we send a GET request for the home page using the HTTP protocol. Then we read the server response using the cat command. Finally, when it is done, we close the connection using the exec command.

Using External Tools

Depending on the need, there are numerous tools available on Linux to choose from. One of them is Netcat, also called the "Swiss-army knife" of networking, which can be used to create TCP as well as UDP connections.

Let's create a simple time server using netcat.

At server (172.31.100.7) enter command

$ while true; do date | nc -u -l 13 ; done

This command creates a UDP server listening on port 13. When a machine connects, it returns the output of the date command and then exits. The while loop starts the server again.

Now to get time at any machine enter command

$ nc -u 172.31.100.7 13

This command connects to the server 172.31.100.7 at port 13 and prints the data received from it.

To know more, read the netcat documentation. Apart from netcat, wget and curl are also great tools that are often used inside scripts to handle networking.

Single Instance Programs (.lock File)

.lock files are used to constrain a script to run as a single instance, especially when multiple instances could corrupt the system state, as with aptd installing or updating software. If multiple aptd instances were allowed to run in parallel they could corrupt the package database. Lock files are usually kept in the /var/lock or /tmp directory.

Let's create a script which allows only a single instance to run.

#!/bin/bash
# single_instance [-n]
# If the -n option is given, kill the running instance and take its place
APP_NAME=single_instance
DIR=/tmp
LOCK_FILE=${DIR}/${APP_NAME}.lock
PID_FILE=${DIR}/${APP_NAME}.pid

# An array for temporary files
declare -a temp_files

# Add temporary files to the global array
function add_file()
{
    if [ -n "$1" ] && [ -f "$1" ];then
        # Append at the next free index
        temp_files[${#temp_files[@]}]="$1"
    fi
}

# Perform the cleanup and exit
function cleanup_and_exit()
{
    local i count=${#temp_files[@]}
    for((i=0;i<count;i++));do
        rm -f "${temp_files[$i]}" >/dev/null 2>&1
    done
    exit 0
}

# Return 0 if no other instance holds the lock, 1 otherwise
function check_single_instance()
{
    # Check lock file
    if [ -f "${LOCK_FILE}" ];then
        return 1
    fi
    return 0
}

# Add signal handler for SIGTERM
trap cleanup_and_exit SIGTERM

# Check if an instance of this script is already running
if ! check_single_instance ;then
    echo "Instance is running"
    if [ "$1" = "-n" ];then
        # Kill the running instance
        read -r pid < "${PID_FILE}"
        if [ -n "$pid" ];then
            # Send SIGTERM and wait until the process is gone
            kill "$pid"
            while kill -0 "$pid" 2>/dev/null;do
                echo "Waiting for $pid to exit"
                sleep 1
            done
        fi
    else
        exit 0
    fi
fi

# Create lock file
touch "${LOCK_FILE}"

# Create PID file
echo $$ > "${PID_FILE}"

# Add files for cleanup
add_file "${LOCK_FILE}"
add_file "${PID_FILE}"

# Add your script functionality here
while true;do
    echo "This is an example of single Instance Script"
    sleep 1
done
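Checking for the lock file and creating it are two separate steps in the script above, so two instances started at the same moment could both pass the check. One possible refinement, sketched here as an assumption rather than as part of the script, uses the shell's noclobber option: with it set, the > redirection fails if the file already exists, so the test and the creation happen in one step. The function name acquire_lock and the scratch directory are hypothetical.

```shell
#!/bin/bash
lock_dir=$(mktemp -d)
lock_file="$lock_dir/demo.lock"

# Succeeds only for the first caller; noclobber makes ">" fail on an existing file.
# The subshell keeps the noclobber setting from leaking into the rest of the script.
acquire_lock() {
    ( set -o noclobber; echo "$$" > "$1" ) 2>/dev/null
}

if acquire_lock "$lock_file"; then
    echo "lock acquired"
fi

if ! acquire_lock "$lock_file"; then
    echo "already locked"
fi

rm -rf "$lock_dir"
```

The first call prints lock acquired and the second prints already locked.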

In the next part of this tutorial series we will use these techniques and create some working programs.