Configuration management is the management of the running state of your servers. At least, it is supposed to be. With cloud computing becoming a bit more widespread, a revitalization of the “golden image” concept has occurred. Let’s take a look at what configuration management is supposed to be, how it has evolved, and how it has gone horribly wrong.
When configuration management is done properly, your servers can be remarkably stable. With a proper configuration management system, changes are first tested on development servers. Then, if everything goes well, the changes are promoted to production servers without any manual intervention, and this is key. The same settings, not subject to human interaction, need to propagate unadulterated. This is easily accomplished with tools like Puppet for Linux/Unix based systems, but is impossible elsewhere.
Linux/Unix Server Utopia
Long ago, people used to write scripts on the fly, on the command line, to ssh (or rsh, if we’re being literal with “long ago”) into a group of servers and run some commands. Things would get done, but a problem emerged. As changes were performed on various servers, nobody really knew what state those servers were in. An operating system or application update would not make its way to every server in the same predictable way, because each server was different. The solution is what ITIL broadly refers to as the Configuration Management Database (CMDB).
Fast forward to, say, the year 2000, when the world really began experimenting with Cfengine. It was pretty brute force: you told it what configuration files to push out to what servers, and spent inordinate amounts of time updating and maintaining it. As cumbersome as Cfengine is, it solved the problem. Any unauthorized change would be automatically reverted, it was self-documenting, and you could ensure your entire infrastructure was running what you thought it was.
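The "unauthorized changes get reverted" behavior described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not how Cfengine is implemented: the `converge` function, the `desired` mapping, and the file names are all hypothetical, and a real tool would also handle ownership, permissions, and service restarts.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def converge(desired: Dict[str, str], root: Path) -> List[str]:
    """Make files under root match the desired content.

    Returns the relative paths that had to be created or corrected.
    """
    corrected = []
    for relative_path, content in sorted(desired.items()):
        target = root / relative_path
        current = target.read_text() if target.exists() else None
        if current != content:
            # Drift (or a missing file): overwrite with the known-good copy.
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content)
            corrected.append(relative_path)
    return corrected


if __name__ == "__main__":
    # Hypothetical demo in a scratch directory, not a real server root.
    root = Path(tempfile.mkdtemp())
    desired = {"etc/motd": "Authorized use only.\n"}
    print(converge(desired, root))  # first run creates the file
    (root / "etc/motd").write_text("changed by hand\n")  # simulated drift
    print(converge(desired, root))  # next run reverts the change
```

Run on a schedule, a loop like this is what makes the desired state double as documentation: the `desired` mapping is the record of what every server should look like.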
Fast forward again, to the year 2004. Along comes Puppet with a whole new way to think about configuration management. Instead of manually coding exceptions for each operating system you ran, let the tool handle it. Instead of pushing out configuration files, let Puppet manage the resource based on your OS’s supported method. For example, say you want to ensure that a user exists on all your servers. Instead of telling Cfengine about the format of the passwd, shadow, possibly passwd db, and group files, just tell Puppet to create the user—it knows how those standard things work in the various operating systems.
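To make the difference concrete, here is a rough Python sketch of the "declare the resource, let the tool pick the method" idea. It is not Puppet's code or its language. The `User` class, the `plan` function, and the provider table are assumptions made up for this illustration, and only two platforms are covered.

```python
from dataclasses import dataclass
from typing import List, Set

# Illustrative provider table. The platform keys are hypothetical labels;
# the commands are the usual user-creation tools on those systems.
CREATE_USER_COMMANDS = {
    "linux": ["useradd", "-m"],
    "freebsd": ["pw", "useradd", "-m", "-n"],
}


@dataclass(frozen=True)
class User:
    """A declared resource: this user should exist."""
    name: str


def plan(resource: User, platform: str, existing_users: Set[str]) -> List[str]:
    """Return the command needed to reach the desired state.

    An empty list means the system already matches the declaration.
    """
    if resource.name in existing_users:
        return []
    return CREATE_USER_COMMANDS[platform] + [resource.name]
```

The administrator only writes `User("alice")`. Which command runs, or whether anything runs at all, is decided per platform and per server, which is what removes the hand-coded exceptions.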
Regardless of the particular configuration management paradigm, configuration management is the only way to know what the state of your running servers is.
In Windows, Everyone Hears You Scream
Microsoft’s System Center Configuration Manager is about as good as it gets. Aspects of the operating system can be managed fairly well, especially with Active Directory. But in general, the Windows world operates by pushing out changes and then forgetting about them. Sure, MSI packages can be built that contain the desired configuration for a specific application along with that application, but nothing ensures those settings don’t get changed afterward. It’s worse than that, too, in that nothing documents changes to the applications after administrators change them. There is basically no way to know what the running configuration is (of the applications and services that run) on a set of Windows servers.
So what can you do? Nothing much; the Windows server world has gotten along with installing applications and manually logging into servers afterward to administer them. A future global change will affect every server differently, since every server is different. In Unix/Linux land you rarely need to log in and poke around, aside from troubleshooting or verifying. That hands-off approach is not possible in the Windows world, because each application or service generally does not lend itself to being configured via a text file or Active Directory. These stray applications, which may store their settings in the registry or in proprietary database files strewn throughout the file system, are why Windows server configuration management is impossible.
Virtualization and the Scourge of “Golden Images”
In the absence of well-established best practices in many organizations, the onset of virtualization brought with it yet another method to the madness. Companies sold the idea of a golden image, which is literally a disk image of a pre-configured operating system. On the surface, it sounds good. You can boot many servers and automatically change their IP and hostname. They all booted from a known “golden” image, so they are all the same. That is, until you boot them and they become different. People even go so far as to create many (many, many) golden images for each type of server they wish to run.
The main problems with golden images are twofold. First, you have to maintain many images and update them whenever a change is made. Updating images is laborious, and keeping track of how each is configured is impossible. Second, how do you get a change pushed out to a server if your only configuration management is the golden image? You reboot it.
There is nothing wrong with using a golden image to boot or install servers if they subsequently run Puppet and get the bulk of their configuration from it. In this use case, you have one image per hardware platform, and the images are essentially a replacement for a kickstart or jumpstart server.
The network world has also struggled with this issue. It’s not as bad, because network gear isn’t nearly as complex as a server. Its entire configuration is usually held within one text file. It should be quite easy to manage then, right? You’d think so, but nothing really exists. Cisco’s tools are cumbersome and horribly expensive, and most people just end up saving a copy of the configuration file on a TFTP server. Neat tools like RANCID (Really Awesome New Cisco confIg Differ) exist, which will show you when a change has occurred, but do not ensure the configuration state.
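The gap between showing a change and ensuring state is easy to see in code. The following Python sketch does the "differ" half only, using the standard `difflib` module. It is not how RANCID works internally. The `config_diff` name and the sample configuration lines used with it are assumptions for illustration.

```python
import difflib
from typing import List


def config_diff(saved: str, running: str) -> List[str]:
    """Return unified-diff lines between a saved and a running config.

    An empty list means the two configurations are identical.
    """
    return list(
        difflib.unified_diff(
            saved.splitlines(),
            running.splitlines(),
            fromfile="saved",
            tofile="running",
            lineterm="",
        )
    )
```

Given a saved copy from the TFTP server and a freshly fetched running configuration, this reports which lines changed. Notice what is missing: nothing here pushes the saved version back to the device, so the change is detected but not corrected.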
Interestingly, one company has taken on the problem of cross-vendor network gear configuration management. SolarWinds, with its Orion NCM product, promises to ease the burden. Next time, we will give an overview of the Orion Network Configuration Management system.
Charlie Schluting is the author of Network Ninja, a must-read for every network engineer.