Bypassing docker network isolation (hack)
I am the author of the netdata docker image. I created
the image before netdata became the popular real-time performance
monitoring software it is today - over 15 thousand stars on GitHub! Wow! Due to
network isolation, however, you have to run it in --net=host
mode to monitor the network devices of the host.
The problem with network isolation
A positive feature of network isolation is that the attack surface stays very small
if somebody manages to exploit a weakness in a running container. For example, if netdata was running
on the host network and someone managed to exploit it, they would be able to join
botnets and disrupt the operations of other people, and in the more destructive cases they could literally
turn off your network interfaces with ifconfig [interface] down. I don’t know, do we still have
passwords? I’ve used an SSH key to log into every server for at least a decade now and I’d be hard pressed
to guess what kind of password I set on any local user.
I guess in part it’s the principle of single responsibility that makes me avoid putting netdata on the host
network. It’s meant to collect what it can from the /sys and /proc filesystems and, with read-only access
to those, provide real-time insight into how your system is operating.
How the proc filesystem works
In effect, the proc filesystem is not an actual file system - its bits and bytes are not stored anywhere but
are retrieved from internal Linux kernel structures whenever somebody opens a file. So when you open and
read /proc/net/dev, you’re really interfacing with the kernel, which returns the
data as it stands at that point. So, when netdata reads from the /proc filesystem, it uses
the kernel to retrieve data from any allowed endpoints.
So, plainly put, we can read which processes are running on the host with SYS_PTRACE and by traversing
the /proc/{pid} structures, but when netdata tries to read anything under /proc/net/* the read will be
rejected due to network isolation. It’s either a “you can do nothing” or “you can do everything” type of
world in this sense.
Workaround test 1
My first test with providing netdata with /proc/net contents was optimistic. If you look closely, when
you list files under the proc filesystem they are reported with a size of 0, but when you actually read
them you get data which is most likely longer than 0 bytes.
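The mismatch between the reported size and the data you get back is easy to see for yourself. The snippet below is a minimal sketch, assuming a Linux host with GNU coreutils; the function name reported_vs_read is made up for illustration. It prints the size that the file metadata reports next to the number of bytes that come back from actually reading the file.

```shell
#!/usr/bin/env bash
# Compare the size reported by the file metadata with the bytes actually read.
# Assumes GNU stat; reported_vs_read is an illustrative name.
reported_vs_read() {
    local file="$1"
    local reported actual
    reported=$(stat -c %s "$file")    # size according to the file metadata
    actual=$(cat "$file" | wc -c)     # cat forces a real read through a pipe
    echo "$file reported=$reported read=$actual"
}

# Only try the proc file where it exists and is readable (Linux hosts).
if [ -r /proc/net/dev ]; then
    reported_vs_read /proc/net/dev
fi
```

On a regular file both numbers agree; on a proc file the reported size is typically 0 while the read returns real data.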
My first idea was just to run rsync on the /proc/net location and copy the files into another folder.
It works on paper, but it doesn’t actually work. You get a folder with a bunch of files which are size 0.
You have to actually read a file, but rsync, I suppose, just sees “oh it’s size zero, I will read 0 bytes”.
I didn’t actually check what rsync does, but either way, I guessed it would be prohibitive from a system
viewpoint to exec a new rsync process several times per second.
Workaround test 2
Obviously, you can do cat /proc/net/dev > /fakenet/dev, as long as you create the fakenet folder beforehand.
We can find out what files are under /proc/net with find -type f, and then traverse them and cat/pipe them
into the new fake location.
Well, I did try it but there’s still the same problem as with rsync - any kind of fake proc filesystem copy
would be load heavy, spawning a process multiple times per second.
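For reference, the find-and-cat approach could look roughly like the sketch below. This is not the code I actually ran, just an illustration under assumptions: copy_tree is a hypothetical helper name, and the source and destination are passed as arguments. Note how every file costs at least a dirname, a mkdir and a cat process, which is exactly the overhead that makes this approach heavy.

```shell
#!/usr/bin/env bash
# Sketch of the "find, then cat" copy. copy_tree is a hypothetical helper.
copy_tree() {
    local src="$1" dst="$2" file rel
    # find is one external process for the whole run...
    find "$src" -type f | while read -r file; do
        rel="${file#"$src"/}"                 # path relative to the source
        # ...but dirname, mkdir and cat are spawned again for every file.
        mkdir -p "$dst/$(dirname "$rel")"
        cat "$file" > "$dst/$rel" 2>/dev/null # unreadable files are left empty
    done
}

# Example (needs a writable /fakenet):
# copy_tree /proc/net /fakenet
```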
Final full-bash solution
I created the final bash script that does a few things to optimize speed. Here are all the tricks explained:
OUTPUT="/dev/shm/fakenet/";
We are writing to the /dev/shm/fakenet location to optimize for input/output. SHM stands for shared memory, which
is where the contents are kept. Every time we write a file there, we’re using only RAM.
SOURCES="/proc/net/dev
...
I list individual sources, sure, to avoid an exec call for find, but also because the list of files which
netdata monitors is known from its source code. Because we know which files netdata expects, we can
copy only those files instead of the complete /proc/net location and subfolders. Yay for using only what we need.
OUTFILE="${NETFILE:10}"
echo "$(<$NETFILE)" > $OUTPUT$OUTFILE
These are the two lines which I’m most proud of in the whole script.
- ${NETFILE:10} skips the first 10 characters of /proc/net/dev, leaving $OUTFILE set as dev,
- $(<$NETFILE) reads /proc/net/dev, and the double quotes around it keep the newline characters as-is,
- as the echo command is internal to bash (and not equal to /bin/echo), there are no exec calls in the code.
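If you want to convince yourself of the three points above, here is a small standalone sketch. It uses a temporary file instead of the proc filesystem so it can run anywhere bash is available; the variable names with_quotes and without_quotes are only for illustration.

```shell
#!/usr/bin/env bash
# 1) Substring expansion: "/proc/net/" is exactly 10 characters long.
NETFILE="/proc/net/dev"
OUTFILE="${NETFILE:10}"
echo "$OUTFILE"                          # prints: dev

# 2) $(<file) reads a file without spawning cat; the quotes keep the newlines.
tmp=$(mktemp)
printf 'line one\nline two\n' > "$tmp"
with_quotes=$(echo "$(<"$tmp")")         # newline between the lines survives
without_quotes=$(echo $(<"$tmp"))        # word splitting joins them with a space
echo "$with_quotes"
echo "$without_quotes"                   # prints: line one line two
rm -f "$tmp"

# 3) echo is a shell builtin in bash, so calling it does not exec /bin/echo.
type -t echo                             # prints: builtin
```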
I am amazed at how much bash code I write lately. And I’m fine with that. There’s an additional sleep 0.23,
which pauses the syncing for that many seconds, so the proc filesystem will be copied about 4 times
per second. Of course, this means that the accuracy of reading from the fake proc filesystem is not the same,
and the data may be delayed by up to 250ms in the worst case. This will result in some jagged
graphs, but if it’s
a problem for you and you don’t mind some extra CPU cycles, you can decrease the sleep interval.
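Put together, the pieces described above form a loop along these lines. Treat this as a simplified sketch and not as the actual fakenet.sh: the source list is cut down to a single file, and PREFIX_LEN, INTERVAL, sync_once and sync_loop are names introduced here for illustration. The only external process per pass is sleep.

```shell
#!/usr/bin/env bash
# Simplified sketch of the copy loop, NOT the real fakenet.sh.
OUTPUT="${OUTPUT:-/dev/shm/fakenet/}"     # RAM-backed target directory
SOURCES="${SOURCES:-/proc/net/dev}"       # illustrative; the real list is longer
PREFIX_LEN="${PREFIX_LEN:-10}"            # length of "/proc/net/"
INTERVAL="${INTERVAL:-0.23}"              # seconds between passes, about 4 per second

# Copy every listed source once, using only bash builtins.
sync_once() {
    local netfile
    for netfile in $SOURCES; do           # unquoted on purpose: split the list
        [ -r "$netfile" ] || continue     # skip files this kernel does not have
        echo "$(<"$netfile")" > "$OUTPUT${netfile:$PREFIX_LEN}"
    done
}

# Repeat forever; sleep is the only process spawned per pass.
sync_loop() {
    mkdir -p "$OUTPUT"
    while true; do
        sync_once
        sleep "$INTERVAL"
    done
}

# Uncomment to start copying:
# sync_loop
```

Lowering INTERVAL trades CPU cycles for fresher data, as described above.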
Putting it all together
As I saved the fakenet.sh script to GitHub, we can download it and run it ourselves:
wget https://raw.githubusercontent.com/titpetric/netdata/master/fakenet.sh
chmod a+x fakenet.sh
nohup ./fakenet.sh >/dev/null 2>&1 &
We download it, set it as executable, and run it in the background (&), adding nohup to keep it running
after we log out of the system. It’s a bit of a lowest common denominator; you can run the script under screen
if you like. Once the above is running, all that’s left to run is netdata:
docker run --cap-add SYS_PTRACE \
-v /proc:/host/proc:ro \
-v /sys:/host/sys:ro \
-v /dev/shm/fakenet:/fakenet/proc/net \
-p 19999:19999 --name netdata -d \
titpetric/netdata
You can then visit netdata by going to http://your-ip-or-host.name:19999/. For more information about netdata
itself, there’s the longer readme on titpetric/netdata which you can
check out for more wisdom.
Caveat emptor
Obviously, reading from a partial copy of the proc filesystem is not exactly the same as reading from the actual
filesystem. Some data might still be missing (I’m missing net under individual docker containers, for example).
But at least I know what’s up with my eth0 without exposing too much. For the more initiated, I’d recommend
sticking with --net=host in LAN networks, and at least thinking about how to protect netdata in the DMZ if running
on the host network. I guess it’s not very responsible to have it wide open to the internet, where everybody is screaming.