System administration is in a sad state. It is a mess.
I’m not complaining about old-school sysadmins. They know how to keep systems running and how to manage update and upgrade paths.
This rant is about containers, prebuilt VMs, and the incredible mess they cause because their concept lacks notions of “trust” and “upgrades”.
Consider for example Hadoop. Nobody seems to know how to build Hadoop from scratch. It’s an incredible mess of dependencies, version requirements and build tools.
None of these “fancy” tools still builds by a traditional make command. Every tool has to come up with its own, incompatible, and non-portable “method of the day” of building.
And since nobody is still able to compile things from scratch, everybody just downloads precompiled binaries from random websites. Often without any authentication or signature.
NSA and virus heaven. You don’t need to exploit any security hole anymore. Just make an “app” or “VM” or “Docker” image, and have people load your malicious binary to their network.
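The verification step those downloads skip is not hard. A minimal sketch of it, with made-up file names (a real release would publish its checksum over a separate, trusted channel, and ideally a GPG signature as well):

```shell
# Sketch of the verification step that "just download the binary" skips.
# File names are made up for illustration.
echo "pretend this is a downloaded release" > release.tar.gz

# Publisher side: compute and publish a checksum (over a trusted channel).
sha256sum release.tar.gz > release.tar.gz.sha256

# Consumer side: refuse to use the download unless the checksum matches.
sha256sum -c release.tar.gz.sha256 && echo "verified"
```

A checksum only helps if it travels over a channel the attacker can’t also control; that is what GPG-signed release files add on top.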
The Hadoop wiki page for Debian is a typical example: essentially, people gave up in 2010 on building Hadoop from source for Debian and offering proper packages.
To build Apache Bigtop, you apparently first have to install puppet3 and let it download magic data from the internet. Then it tries to run sudo puppet to enable the NSA backdoors (for example, it will download and install an outdated precompiled JDK, because it considers you too stupid to install Java yourself). And then you hope the gradle build doesn’t throw a 200-line useless backtrace.
I am not joking. It will try to execute commands such as
/bin/bash -c "wget http://www.scala-lang.org/files/archive/scala-2.10.3.deb ; dpkg -x ./scala-2.10.3.deb /"
Note that it doesn’t even install the package properly, but extracts it into your root directory. The download is not verified by any signature, and it isn’t even fetched over SSL. (Source: the Bigtop puppet manifests)
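To make concrete what that dpkg -x does: it unpacks the archive’s files straight into the target directory without ever registering the package, so the package manager can never upgrade, remove, or audit it. A simulated sketch, using tar and a scratch directory instead of a real .deb and the real root:

```shell
# Simulation of what "dpkg -x pkg.deb /" amounts to: files dumped into the
# filesystem with no package-database entry. tar and a scratch directory
# stand in for a real .deb and the real root.
fakeroot_dir=./fakeroot                    # stand-in for "/"
mkdir -p "$fakeroot_dir" payload/usr/bin
printf '#!/bin/sh\necho scala\n' > payload/usr/bin/scala   # stand-in binary
tar -C payload -cf pkg.tar .
tar -C "$fakeroot_dir" -xf pkg.tar         # the moral equivalent of dpkg -x
ls "$fakeroot_dir/usr/bin"                 # "scala" is now "installed"...
# ...but dpkg -l knows nothing about it: no upgrades, no removal, no audit.
```

Compare dpkg -i, which records the package in the dpkg database so that later security updates actually reach it.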
Even if your build did work, it would involve Maven downloading unsigned binary code from the internet and using that for building.
Instead of writing clean, modular architecture, everything these days morphs into a huge mess of interlocked dependencies. Last I checked, the Hadoop classpath was already over 100 jars. I bet it is now 150, without even using any of the HBaseGiraphFlumeCrunchPigHiveMahoutSolrSparkElasticsearch (or any other of the Apache chaos) mess yet.
Stack is the new term for “I have no idea what I’m actually using”.
Maven, ivy and sbt are the go-to tools for having your system download unsigned binary data from the internet and run it on your computer.
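To be fair, repositories like Maven Central do publish .sha1 checksum files alongside the jars. But a checksum served by the same untrusted mirror adds no trust at all, which is easy to see:

```shell
# Sketch: why a checksum fetched from the same untrusted server proves nothing.
# An attacker who can replace the jar can replace its checksum file, too.
echo "malicious bytecode" > evil.jar                 # the swapped artifact
sha1sum evil.jar | cut -d' ' -f1 > evil.jar.sha1     # the "mirror" serves a matching .sha1
downloaded=$(sha1sum evil.jar | cut -d' ' -f1)
expected=$(cat evil.jar.sha1)
[ "$downloaded" = "$expected" ] && echo "checksum OK"
# The build tool is satisfied; the code is still malicious. Only a signature
# from a key you already trust breaks this circle.
```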
And with containers, this mess gets even worse.
Ever tried to security-update a container?
Essentially, the Docker approach boils down to downloading an unsigned binary, running it, and hoping it doesn’t contain a backdoor into your company’s network.
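The update problem can be simulated in a few lines: an image freezes a copy of every library at build time, so security updates applied on the host never reach it. A toy sketch, with plain files standing in for a shared library and an image layer:

```shell
# Toy simulation: why patching the host does not patch a container image.
# Plain files stand in for a shared library and an image layer.
echo "openssl 1.0.1 (vulnerable)" > host_libssl
cp host_libssl image_libssl                      # "docker build": library baked into the image
echo "openssl 1.0.1 (patched)" > host_libssl     # host receives its security update
cat image_libssl                                 # the image still ships the vulnerable copy
# The only fix is to rebuild and redeploy every image -- and hope the base
# image you rebuild from was itself refreshed.
```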
Feels like downloading Windows shareware in the 90s to me.
When will the first Docker image appear that contains the Ask toolbar? The first internet worm spreading via flawed Docker images?
Back then, years ago, Linux distributions were trying to provide you with a safe operating system. With signed packages, built from a web of trust. Some even work on reproducible builds.
But then, everything got Windows-ized. “Apps” were the rage, which you download and run, without being concerned about security, or the ability to upgrade the application to the next version. Because “you only live once”.
Update: it was pointed out that this started way before Docker: »Docker is the new ‘curl | sudo bash’«. That’s right, but it’s now pretty much mainstream to download and run untrusted software in your “datacenter”. That is bad, really bad. Before, admins would try hard to prevent security holes; now they call themselves “devops” and happily introduce them to the network themselves!