I strongly believe in automating every single process that’s possible to automate. Humans doing the work of machines is a waste of precious resources. But recently I realized that, as with all things, timing is everything. You can’t just set out to build an automated process from the get-go – to be successful, first you must iterate. When it comes to process, this means documenting it first, then implementing it manually from that documentation.
As obvious as this sounds, I’ve lost count of how many times I’ve seen people rush to choose (and debate) tools before a process is even written down. I’m just as guilty – to professionals, process is often self-evident and easy to take for granted.
This is especially true when you are crafting a technical process. Recently, my team developed a way to bootstrap heterogeneous cloud-based supercomputing assets at massive scale. This is typically a prime candidate for “scripting”, given it’s a repetitive programmatic process requiring little or no human intervention. There is very little unique configuration for each of these assets, and much of it can even be determined by querying the computer hardware itself. Rather than jump right into scripting, we documented the process first (in a “HOWTO” format), then implemented it manually again and again. We proved repeatability – one of the key goals of automation – testing, fixing, tweaking, and polishing as we went along. Finally, we arrived at the correct set of steps to script. Transcribing these steps into a machine-readable format was trivial, but determining and hardening them was not.
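To make that last point concrete, here is a minimal Python sketch of what “transcribing” hardened steps might look like. It is an illustration only, not our actual bootstrap procedure: the step descriptions, the commands, and the names `Step`, `HOWTO`, `render_howto`, and `automate` are all hypothetical. The idea is simply that the same list of steps can be printed as a HOWTO for a person to follow, and later handed to a machine unchanged.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    """One documented HOWTO step: what to do, and the command that does it."""
    description: str
    command: List[str]


# Hypothetical steps for illustration only; a real procedure would list
# whatever was proven by hand first.
HOWTO = [
    Step("Record the machine's hostname", ["hostname"]),
    Step("Record the kernel and architecture", ["uname", "-a"]),
]


def render_howto(steps: List[Step]) -> str:
    """Render the steps as the numbered HOWTO a person follows manually."""
    return "\n".join(
        f"{number}. {step.description}: {' '.join(step.command)}"
        for number, step in enumerate(steps, start=1)
    )


def run_command(command: List[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def automate(steps: List[Step],
             execute: Callable[[List[str]], int] = run_command) -> int:
    """Run the same steps unattended, stopping at the first failure.

    Returns the number of steps that completed successfully.
    """
    completed = 0
    for step in steps:
        if execute(step.command) != 0:
            break
        completed += 1
    return completed
```

Printing `render_howto(HOWTO)` gives the document you iterate on by hand; calling `automate(HOWTO)` only becomes worthwhile once that document has stopped changing.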
What did we learn and achieve? First, we really know what we’re doing, even though the computer will eventually do it for us. We made all the adjustments necessary for every permutation we could think of, and are confident that the automation will “just work”. As a massive added bonus, we documented the process for posterity. No matter how good your automation is, it has limited potential without accurate documentation. In the blink of an eye, “tribal knowledge” can erode or disappear completely, and making future enhancements can easily mean starting from scratch. Equally daunting is troubleshooting future failures without really understanding how the process works.
Could we have arrived at the same result had we jumped right into scripting? Isn’t testing and fixing a script much the same thing as iterating through a manual process? While scripting first can be effective, the additional overhead of building elaborate mechanisms specifically for automation can easily complicate what should be the simple task of verifying a series of steps. And in any case, we still would have had to document the process and its adjustments in order to preserve the knowledge for the long run. The way we did it, we even tested the documentation for accuracy as we went along!
Knowing what to automate goes hand in hand with knowing when to automate. Any time you develop a process, especially a technical one, it’s best to prove and mature it before you hand it off to the machines to execute. Not only will you end up with higher quality, but in many cases, you will also get there considerably faster.