A love song for Bash

There isn’t a day when I don’t see some kind of DevOps article advocating a higher-order programming language for shell tasks, most recently one that uses Python to issue a POST request to a Discord webhook API whenever somebody logs into your server.

SSH Monitoring

The article, A SSH monitoring platform with Discord, makes the assumption that issuing an HTTP API request with curl and bash is somehow worse than pulling in several tens of MBs of dependencies for a programming language you might not even know or use otherwise.


But why don’t people just use bash? The steps are easy:

  1. produce a JSON payload,
  2. issue a curl request

Producing a JSON payload

There’s not much you need to know about bash in order to do this. Since it’s easier to work with files, all we need to do is read in a file with bash and replace some tokens afterwards. Let’s create this ssh-login.json file:

{
	"username": "SSH Login",
	"content": "User $USER just logged in on $HOSTNAME",
	"avatar_url": "https://placekitten.com/512/512"
}

As you can see from the JSON, all we need to do is replace $HOSTNAME and $USER. Let’s start with reading the file. As somebody familiar with bash, you might think that cat is a good way to read a file, but we can use a built-in instead:

CONTENTS=$(<ssh-login.json)

If you resorted to cat, bash would spawn a sub-process. This way it’s more efficient, and it will read the file directly into the variable.
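
For comparison, both of these lines fill the same variable, but only the second one forks a process to run cat:

CONTENTS=$(<ssh-login.json)      # built-in redirection, no external command
CONTENTS=$(cat ssh-login.json)   # spawns a sub-process running cat

With this, we can move on to the replacements.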

CONTENTS=${CONTENTS/\$HOSTNAME/$HOSTNAME}
CONTENTS=${CONTENTS/\$USER/$USER}
echo "$CONTENTS"

We need to quote the variable when printing it, so the whitespace in the output is preserved. Running the above gives us the following payload:

{
        "username": "SSH Login",
        "content": "User root just logged in on docker-s-2vcpu-2gb-nyc3-01",
        "avatar_url": "https://placekitten.com/512/512"
}
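
To see why the quoting matters, compare the two:

echo $CONTENTS     # unquoted: newlines and indentation collapse into single spaces
echo "$CONTENTS"   # quoted: the payload prints exactly as read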

Note: The replacements shown replace only the first occurrence. If you want to replace them all, add another / after the first slash:

-CONTENTS=${CONTENTS/\$HOSTNAME/$HOSTNAME}
+CONTENTS=${CONTENTS//\$HOSTNAME/$HOSTNAME}
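
A quick toy example of the difference:

s="one two two"
echo "${s/two/2}"    # one 2 two - first match only
echo "${s//two/2}"   # one 2 2 - all matches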

Issuing a request to a Discord webhook

When you have the contents, issuing the request is brief to say the least. Discord expects a JSON content type, so we pass the appropriate header:

curl -X POST -H "Content-Type: application/json" -d "$CONTENTS" "$WEBHOOK_URL"

In order to run it from sshrc, we’ll need to read the JSON from the correct path. There are two ways to get the full script path: readlink -f "$0" or realpath -s "$0", depending on which exists on your system (readlink is more common). The full script comes to:

#!/bin/bash

# change to the directory the script lives in
cd "$(dirname "$(readlink -f "$0")")"

# WEBHOOK_URL is expected to be set in the environment (or defined here)
CONTENTS=$(<ssh-login.json)
CONTENTS=${CONTENTS//\$HOSTNAME/$HOSTNAME}
CONTENTS=${CONTENTS//\$USER/$USER}

curl -X POST -H "Content-Type: application/json" -d "$CONTENTS" "$WEBHOOK_URL"

To hook it into your SSH logins, issue the following commands (as root, or under sudo su -):

echo '/path/to/ssh-login.sh &' >> /etc/ssh/sshrc
service sshd reload
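
You can also trigger the script by hand to verify the webhook works before a real login does (the URL here is a placeholder for your own):

WEBHOOK_URL='https://discord.com/api/webhooks/<id>/<token>' /path/to/ssh-login.sh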

The next time you make an SSH login to your server, it will trigger a webhook notification on Discord. Without Python, and most likely without any additional software, as most hosts come with bash and curl already installed.

Pipes and standard utilities

When you think about what people usually do on servers, you might remember all those times you were tailing some log files and filtering them for relevant information. Knowing how to transform your data so that it fits your use case is great, and for that you need to know at least the basic shell commands that allow you to do it:

  • grep, egrep - match or exclude specific patterns from input,
  • sort - sort your input based on sort rules,
  • uniq - produce unique lines from input, count frequency,
  • awk - filter input by columns,
  • sed - replacements of input texts,
  • find - find files or folders,
  • xargs - pipe input into other commands
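
For instance, a classic log-filtering task chains several of these together (the access.log here is a stand-in for whatever you’re tailing):

# top 10 client IPs hitting the API, by request count
grep 'POST /api' access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head -n 10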

Let’s give a few examples that will cover some of these:

Listing docker images

# docker images | grep '[0-9][0-9][0-9]MB' | awk '{print $NF " " $3 " " $1 ":" $2}' | sort -r | head -n 5
492MB f4189bb77f4f titpetric/nginx-php:<none>
475MB 9acfd3db2b7f titpetric/nginx-php:latest
458MB 6a3849d67915 titpetric/percona-xtrabackup:latest
449MB 3a0ee8ee86e1 nazarpc/phpmyadmin:latest
446MB e807aa9963b2 titpetric/nginx:latest
  • use grep with a regular expression to filter only 100-999MB images,
  • use awk to print the relevant columns in the wanted order,
  • use sort to order the filtered data in reverse (descending) order,
  • use head to print only the first 5 lines

Of course, the snippet can be improved with better knowledge of the docker and sort commands:

# docker images --format '{{.Size}} {{.ID}} {{.Repository}}:{{.Tag}}' | sort -rh | head -n 5
492MB f4189bb77f4f titpetric/nginx-php:<none>
475MB 9acfd3db2b7f titpetric/nginx-php:latest
458MB 6a3849d67915 titpetric/percona-xtrabackup:latest
449MB 3a0ee8ee86e1 nazarpc/phpmyadmin:latest
446MB e807aa9963b2 titpetric/nginx:latest
  • use docker with --format to produce the wanted output (replaces awk),
  • use sort with the additional -h flag to compare human readable numbers (e.g., 2K, 1G) - removes grep
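
To see what the -h comparison does:

# printf '2K\n1G\n512M\n' | sort -h
2K
512M
1G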

Thanks to Sergey Fedchenko for keeping me honest.

Deleting older files based on disk usage

I have some servers which are used to host static content. Part of this content is generated into a cache folder based on active use, and deleted often to keep some 10GB free on the servers. The clean-up script shows one approach to doing that:

#!/bin/bash
date
DAYS=70
while (( DAYS > 45 )); do
        echo "Deleting older than $DAYS"
        find /var/www/static/default/public_html/cache/_up/ -type f -mtime +$DAYS -delete

        space=$(df / | grep sda | awk '{print $4}')
        echo "Available after delete $space"
        if (( space > 10000000 )); then
                break
        fi

        DAYS=$(($DAYS-1))
done

There are a few notable parts of this script:

  • find is used to find files older than 70 down to 46 days and delete them
  • grep and awk to filter available space from df output (see the aside below)
  • some basic math expressions from bash
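
As an aside, GNU df can print just the column we need, which would drop the grep and awk pair; a sketch, assuming a coreutils build with --output support:

space=$(df --output=avail / | tail -n 1)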

Now, bash wasn’t really built for math. It doesn’t really have a standard library which would give you math functions, but it’s usable for simple counters like the one the example uses.
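
Integer arithmetic with $(( )) covers those simple cases:

count=0
(( count++ ))
echo $((count * 2))   # 2
echo $((10 / 3))      # 3 - integer division truncates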

Timers

For those using crontab (and not something else), you might have a need to run scripts more often than once per minute. You can use bash to measure how long the first command run took, and then sleep for the required number of seconds before running the command again.

Now, bash doesn’t have a standard library that would give you date/time formatting functions. What it does have is access to the filesystem, and the ability to execute commands that can produce them. Since we’re trying to be efficient, that means getting the system time from /proc.

The first idea is to use /proc/uptime.

#!/bin/bash
read -a start <<< $(</proc/uptime)
echo $start

sleep 3

read -a stop <<< $(</proc/uptime)
echo $(($stop - $start))

But, after running the test, we realize that bash, while not great with math to begin with, doesn’t have any concept of floating point math:

# ./test.sh
2918389.35
./test.sh: line 9: 2918392.36 - 2918389.35: syntax error: invalid arithmetic operator (error token is ".36 - 2918389.35")

But we can filter out the decimal point and fraction with sed.

#!/bin/bash
read -a start <<< $(</proc/uptime)
start=$(echo $start | sed -e 's/\..*//')
echo $start

sleep 3

read -a stop <<< $(</proc/uptime)
stop=$(echo $stop | sed 's/\..*//')

echo $(($stop - $start))

That’s just about the ugliest piece of bash code you can put together. If things are looking like that, there’s a good chance that this can be cleaned up and optimized. And sure enough, it can be:

#!/bin/bash
SECONDS=0
sleep 3
echo $SECONDS

What do you think this prints? Does it print 0 or 3?

What’s amazing (and completely understandable) is that a LOT of people don’t know about this little-known bash feature. A poll I did on Twitter had 80% of people answer incorrectly as to what the above prints, so don’t feel bad if you expected a zero there.
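
The mechanics are simple: SECONDS counts the seconds since the shell started, and assigning to it just re-bases the counter:

echo $SECONDS   # seconds since this shell started
SECONDS=100     # assignment re-bases the counter
sleep 2
echo $SECONDS   # prints 102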

What I ended up using was:

#!/bin/bash

# named "run" so it doesn't shadow the "command" builtin
function run {
	docker exec -i some-container bash -c "cd $PWD; php index.php $(uname -n)"
}

SECONDS=0

run

# sleep out the remainder of a ~29 second window before the second run
DURATION=$((29 - SECONDS))
if [ $DURATION -gt 0 ]; then
        sleep $DURATION
fi

run
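
A single crontab entry then runs the command twice a minute (the path is a placeholder):

* * * * * /path/to/twice-a-minute.sh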

In conclusion

If you’re doing mathematical operations, then bash won’t be the right tool for the job. But if you’re doing plain HTTP API requests, orchestration, or even parallelisation, then bash is a really powerful toolset. You don’t need to import python, php or node if all you need to do is replace some variables in a JSON payload and send it along to a web service.
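
Parallelisation, for instance, needs nothing more than & and wait; a minimal sketch, with placeholder URLs:

for url in "$URL1" "$URL2" "$URL3"; do
        curl -s -o /dev/null "$url" &
done
wait   # blocks until all background requests finish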

The lack of standard library features one might expect from more general-purpose programming languages is definitely a pitfall of bash. But honestly, do you really need to do everything with it? No. But should you do everything in your language of choice? Most likely also no. If you ask yourself “Can I do this in bash?”, the answer most likely is yes. It’s easier to add dependencies than to cut them out, so the next time you’re scripting something, give bash a chance. If it runs under crontab, most likely it can be done in bash.

While I have you here...

It would be great if you bought one of my books.

I promise you'll learn a lot more if you buy one. Buying a copy supports me writing more about similar topics. Say thank you and buy my books.

Feel free to send me an email if you want to book my time for consultancy/freelance services. I'm great at APIs, Go, Docker, VueJS and scaling services, among many other things.