A love song for Bash
There isn’t a day when I don’t see some DevOps article advocating a higher-level programming language for shell tasks. Most recently, one used Python to issue a POST request to a Discord webhook API whenever somebody logs into your server.
SSH Monitoring
The article, A SSH monitoring platform with Discord, assumes that issuing an HTTP API request with curl and bash is somehow worse than pulling in tens of megabytes of dependencies for a programming language you might not even know or use otherwise.
To give some figures for space efficiency here:
- Bash and curl together come to 4MB (spotify/alpine)
- Python 3 comes to 29MB (python:3-alpine)
So why don’t people just use bash? The steps are easy:
- produce a JSON payload,
- issue a curl request
Producing a JSON payload
There’s not much you need to know in bash in order to do this. As it’s easier to work with files, all we need to do is read in a file with bash, and replace some tokens afterwards. Let’s create this ssh-login.json file here:
{
"username": "SSH Login",
"content": "User $USER just logged in on $HOSTNAME",
"avatar_url": "https://placekitten.com/512/512"
}
As you can see from the JSON, all we need to do is replace $HOSTNAME and $USER. Let’s start with reading the file. As somebody familiar with bash, you might think that using cat is a good way to read a file, but we can use a built-in instead:
CONTENTS=$(<ssh-login.json)
If you resorted to cat, bash would spawn a sub-process. This way is more efficient: bash reads the file directly into the variable. With this, we can move on to the replacements.
CONTENTS=${CONTENTS/\$HOSTNAME/$HOSTNAME}
CONTENTS=${CONTENTS/\$USER/$USER}
echo "$CONTENTS"
We need to quote the variable when printing it, so it preserves white space in the output. Running the above gives us the following payload:
{
"username": "SSH Login",
"content": "User root just logged in on docker-s-2vcpu-2gb-nyc3-01",
"avatar_url": "https://placekitten.com/512/512"
}
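To see why the quoting matters, here's a minimal sketch comparing quoted and unquoted expansion:

```shell
#!/bin/bash
# Unquoted expansion undergoes word splitting, collapsing all whitespace
# to single spaces; quoting preserves the variable's contents verbatim.
CONTENTS=$'line one\n  indented line two'
echo $CONTENTS     # whitespace collapsed: line one indented line two
echo "$CONTENTS"   # printed as stored, across two lines
```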
Note: The replacements shown replace only a single occurrence. If you want to replace them all, add another / after the first slash:
-CONTENTS=${CONTENTS/\$HOSTNAME/$HOSTNAME}
+CONTENTS=${CONTENTS//\$HOSTNAME/$HOSTNAME}
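A quick sketch of the difference between the two forms:

```shell
#!/bin/bash
# One slash replaces only the first match; two slashes replace all matches.
S="foo foo foo"
echo "${S/foo/bar}"    # bar foo foo
echo "${S//foo/bar}"   # bar bar bar
```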
Issuing a request to a Discord webhook
Once you have the contents, issuing the request is brief, to say the least:
curl -X POST -H "Content-Type: application/json" -d "$CONTENTS" "$WEBHOOK_URL"
The Content-Type header tells Discord the payload is JSON rather than form data.
In order to run it from sshrc, we’ll need to read the JSON from the correct path. There are two ways to get the full script path: readlink -f $0 or realpath -s $0, depending on which exists on your system (readlink is more common). The full script comes to:
#!/bin/bash
# change to the script's directory (quoted to survive paths with spaces)
cd "$(dirname "$(readlink -f "$0")")"
CONTENTS=$(<ssh-login.json)
CONTENTS=${CONTENTS//\$HOSTNAME/$HOSTNAME}
CONTENTS=${CONTENTS//\$USER/$USER}
curl -X POST -H "Content-Type: application/json" -d "$CONTENTS" "$WEBHOOK_URL"
To add it to your ssh, issue the following commands (as root, or under sudo su -):
echo '/path/to/ssh-login.sh &' >> /etc/ssh/sshrc
service sshd reload
The next time you make an SSH login to your server, it will trigger a webhook notification on Discord. Without Python, and most likely without any additional software, as most hosts come with bash and curl already installed.
Pipes and standard utilities
When you think about what people usually do on servers, you might remember all those times where you were tailing some log files and filtering it for relevant information. Knowing how to transform your data so that it fits your use cases is great, and for that you do need to handle at least the basic shell commands that allow you to do that:
- grep, egrep - match or exclude specific patterns from input,
- sort - sort your input based on sort rules,
- uniq - produce unique lines from input, count frequency,
- awk - filter input by columns,
- sed - replacements of input texts,
- find - find files or folders,
- xargs - pipe input into other commands
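To illustrate how a few of these combine, here's a sketch that counts failed SSH logins per user. The log lines are inlined for the example; in practice the input would come from something like /var/log/auth.log:

```shell
#!/bin/bash
# grep filters the relevant lines, awk picks the username column,
# sort + uniq -c count occurrences, and sort -rn orders by frequency.
printf '%s\n' \
  'sshd: Failed password for root' \
  'sshd: Failed password for admin' \
  'sshd: Failed password for root' |
  grep 'Failed password' |
  awk '{print $NF}' |
  sort | uniq -c | sort -rn
```

This prints root with a count of 2 first, then admin with a count of 1.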
Let’s give a few examples that will cover some of these:
Listing docker images
# docker images | grep '[0-9][0-9][0-9]MB' | awk '{print $NF " " $3 " " $1 ":" $2}' | sort -r | head -n 5
492MB f4189bb77f4f titpetric/nginx-php:<none>
475MB 9acfd3db2b7f titpetric/nginx-php:latest
458MB 6a3849d67915 titpetric/percona-xtrabackup:latest
449MB 3a0ee8ee86e1 nazarpc/phpmyadmin:latest
446MB e807aa9963b2 titpetric/nginx:latest
- use grep with a regular expression to filter only 100-999MB images,
- use awk to print the relevant columns in the wanted order,
- use sort to order the filtered data in reverse (descending) order,
- use head to print only the first 5 lines
Of course, the snippet can be improved with better knowledge of the docker and sort commands:
# docker images --format '{{.Size}} {{.ID}} {{.Repository}}:{{.Tag}}' | sort -rh | head -n 5
492MB f4189bb77f4f titpetric/nginx-php:<none>
475MB 9acfd3db2b7f titpetric/nginx-php:latest
458MB 6a3849d67915 titpetric/percona-xtrabackup:latest
449MB 3a0ee8ee86e1 nazarpc/phpmyadmin:latest
446MB e807aa9963b2 titpetric/nginx:latest
- use docker with --format to produce the wanted output (replaces awk),
- use sort with the additional -h flag to compare human-readable numbers (e.g., 2K 1G), removing the need for grep
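A quick sketch of what sort -h does with human-readable sizes:

```shell
#!/bin/bash
# sort -h understands size suffixes, so 2K sorts below 512M, which sorts below 1G.
printf '1G\n2K\n512M\n' | sort -h
# 2K
# 512M
# 1G
```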
Thanks to Sergey Fedchenko for keeping me honest.
Deleting older files based on disk usage
I have some servers which are used to host static content. Some of this content is generated into a cache folder based on active use, and deleted often to keep some 10GB free on the servers. The clean-up script shows one approach to doing that:
#!/bin/bash
date
DAYS=70
while (( DAYS > 45 )); do
echo "Deleting older than $DAYS"
find /var/www/static/default/public_html/cache/_up/ -type f -mtime +$DAYS -delete
space=$(df / | grep sda | awk '{print $4}')
echo "Available after delete $space"
if (( space > 10000000 )); then
break
fi
DAYS=$(($DAYS-1))
done
There are a few notable parts of this script:
- find is used to find files older than 70..45 days and delete them,
- grep and awk filter the available space from df output,
- some basic math expressions from bash
Now, bash wasn’t really built for math. It doesn’t have a standard library which would give you math functions, but it’s usable for simple counters like the one in the example.
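For integer work the built-in arithmetic expansion is enough, and floating point can be delegated to an external tool. A small sketch:

```shell
#!/bin/bash
# Built-in arithmetic expansion handles integers only.
echo $(( (70 - 45) * 2 ))                # 50
# Floating point needs outside help, e.g. awk:
awk 'BEGIN { printf "%.2f\n", 10 / 3 }'  # 3.33
```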
Timers
For those using crontab (and not something else), you might need to run scripts more often than once per minute. You can use bash to measure how long the first command run has taken, and then sleep for the required number of seconds before running the command again.
Now, bash doesn’t have a standard library that would give you date/time formatting functions. What it has is access to the filesystem and the ability to execute commands that can produce that. Since we’re trying to be efficient, this means we could get the system time from /proc. The first idea is to use /proc/uptime.
#!/bin/bash
read -a start <<< $(</proc/uptime)
echo $start
sleep 3
read -a stop <<< $(</proc/uptime)
echo $(($stop - $start))
But after running the test, we realize that bash, while not great with math to begin with, doesn’t have any concept of floating point math:
# ./test.sh
2918389.35
./test.sh: line 9: 2918392.36 - 2918389.35: syntax error: invalid arithmetic operator (error token is ".36 - 2918389.35")
But we can filter out the decimal point and fraction with sed.
#!/bin/bash
read -a start <<< $(</proc/uptime)
start=$(echo $start | sed -e 's/\..*//')
echo $start
sleep 3
read -a stop <<< $(</proc/uptime)
stop=$(echo $stop | sed 's/\..*//')
echo $(($stop - $start))
That’s just about the ugliest piece of bash code you can put together. If things are looking like that, there’s a good chance that this can be cleaned up and optimized. And sure enough, it can be:
#!/bin/bash
SECONDS=0
sleep 3
echo $SECONDS
What do you think this prints? Does it print 0 or 3?
What’s amazing (and completely understandable) is that a LOT of people don’t know about this little-known bash feature. A poll I ran on Twitter had 80% of people answer incorrectly as to what the above prints, so don’t feel bad if you expected a zero there.
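SECONDS behaves like a regular variable that ticks up once per second, and an assignment resets or even seeds it. A small sketch:

```shell
#!/bin/bash
# SECONDS counts seconds since shell startup, or since its last assignment.
SECONDS=0
sleep 2
echo "$SECONDS"   # 2: time elapsed since the reset
SECONDS=100       # seeding also works; it keeps counting up from 100
```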
What I ended up using was:
#!/bin/bash
# named "run" so it doesn't shadow the shell builtin "command"
run() {
	docker exec -i some-container bash -c "cd $PWD; php index.php $(uname -n)"
}
SECONDS=0
run
# sleep out the remainder of a ~30 second window before the second run
DURATION=$((29 - SECONDS))
if [ $DURATION -gt 0 ]; then
	sleep $DURATION
fi
run
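Paired with a crontab entry like the following (the script path is illustrative), the script runs at the start of each minute and again roughly 30 seconds in:

```
* * * * * /path/to/twice-per-minute.sh
```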
In conclusion
If you’re doing mathematical operations, bash won’t be the right tool for the job. But if you’re doing plain HTTP API requests, orchestration, or even parallelisation, then bash is a really powerful toolset. You don’t need to pull in Python, PHP or Node if all you need to do is replace some variables in a JSON payload and send it along to a web service.
The lack of standard library features one might expect from more general-purpose programming languages is definitely a pitfall of bash. But honestly, do you really need to do everything with it? No. But should you do everything in your language of choice? Most likely also no. If you ask yourself “Can I do this thing in bash?”, the answer most likely is yes. It’s easier to add dependencies than to cut them out, so the next time you’re scripting something, give bash a chance. If it runs in crontab, most likely it can be done in bash.
While I have you here...
It would be great if you bought one of my books:
- Go with Databases
- Advent of Go Microservices
- API Foundations in Go
- 12 Factor Apps with Docker and Go
Feel free to send me an email if you want to book my time for consultancy/freelance services. I'm great at APIs, Go, Docker, VueJS and scaling services, among many other things.
Want to stay up to date with new posts?
Stay up to date with new posts about Docker, Go, JavaScript and my thoughts on Technology. I post about twice per month, and notify you when I post. You can also follow me on my Twitter if you prefer.