For the last few years I have had an interest in configuration management of IT infrastructure. While by no means an expert, I have a considerable amount of experience with the problems associated with mass server configuration and have come to believe it is also one of the most under-served disciplines in systems management.
In a previous life I had an operations role maintaining primarily Linux servers and other open source infrastructure. In 2006 I worked on launching the open source NetDirector project, a graphical tool for configuring open source infrastructure like Apache, Samba, LDAP and NFS servers. During that time the challenges of maintaining server configurations started to really come to light for me.
The Challenges of Server Configuration
A large part of configuring infrastructure is repetitive and time-consuming. Many sysadmins rely on their own scripts to help manage the process, but each script is still a one-off for its administrator, and that individual knowledge seldom if ever gets institutionalized throughout the organization.
Having a framework for maintaining configuration data is important. Configuration scripts are often authored as shell, Perl or even Python scripts, but are seldom maintained or used by anyone beyond the original author. Some technologies offer plugins that abstract configuration variables, users and systems so they can be shared among users of that technology, but plugins are often specific to a given technology and non-transferable.
Despite the breadth of this problem, there are relatively few solutions that medium-sized enterprises can easily consume outside of the large management suites available from the Big Four (HP, IBM, CA and BMC). The independent software vendors have all been absorbed by bigger, less focused organizations, including Opsware (acquired by HP), BladeLogic (acquired by BMC) and Configuresoft (acquired by VMware). In my opinion there is no real leader in this space.
In a conversation, Opscode CEO Jesse Robbins shared his experience maintaining availability for web properties at Amazon.com. As a top ops guy and “master of disaster” at Amazon.com, he had no access to these tools; they were simply sold in a way that was inconsistent with the way he evaluated and consumed products and services. Opscode, a relatively new company, develops the open source Chef project, which automates IT management via a client-server platform.
Opscode’s approach to server configuration challenges is to use recipes written in the Chef domain-specific language (DSL), which is based on Ruby. These recipes are grouped into cookbooks that can be executed securely by the Chef client-server architecture, and Chef itself is available as open source software to download, use and redistribute. In a nutshell, what interested me about their technology is that it is relatively easy to use, makes configuration recipes easy to share and is easy to consume.
Chef, The Open Source Project
Chef is a systems integration framework released under the Apache License Version 2.0. With Chef, administrators manage servers by writing Ruby code, stored as configuration recipes that are collected into cookbooks. Chef can integrate with existing infrastructure like LDAP via libraries containing arbitrary Ruby code, either to extend Chef’s own language or to implement custom classes. Users can also configure applications that depend on other parts of the infrastructure, such as databases, and discover that information via the Chef server. Still, I like Robbins’ description of Chef best: a sysadmin robot that performs configuration tasks automatically and much more quickly than a single admin could ever hope to.
Though Chef was only released on January 15th, 2009, it has seen rapid adoption and gained a large number of contributors. According to the Opscode wiki there are 157 approved contributors to Opscode projects, along with well over 20 companies. Beyond that, the #chef IRC channel is typically attended by over 100 users and Opscode staff, signs of a healthy, growing open source community.
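To make the “sysadmin robot” idea a little more concrete, here is a minimal sketch in plain Ruby of the general pattern behind tools like this: declare the desired state, then apply only the changes needed to reach it. This is not Chef’s DSL or API. The names FakeServer, DESIRED and converge are hypothetical, and the “server” is just an in-memory record used for illustration.

```ruby
# Hypothetical stand-in for a server: an in-memory record of installed
# packages and running services. A real tool inspects the actual system.
FakeServer = Struct.new(:packages, :services)

# Desired state, declared as data (illustrative values only).
DESIRED = { packages: ["apache2"], services: ["apache2"] }.freeze

# Bring the server to the desired state and return the actions taken.
# Anything already in place is skipped, so a second run does nothing.
def converge(server, desired)
  actions = []
  desired[:packages].each do |pkg|
    next if server.packages.include?(pkg)
    server.packages << pkg
    actions << "install #{pkg}"
  end
  desired[:services].each do |svc|
    next if server.services.include?(svc)
    server.services << svc
    actions << "start #{svc}"
  end
  actions
end
```

Calling converge twice on the same server returns a list of actions the first time and an empty list the second time, which is the property that makes this kind of automation safe to run repeatedly across many machines.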
Opscode, The Platform
The Opscode Platform is the commercial offering from Opscode Inc., delivered as infrastructure-as-a-service (IaaS). It is a centrally managed data store, hosted by Opscode, into which servers publish data such as IP addresses, loaded kernel modules, OS versions and more. The data on the Opscode Platform can be accessed and becomes useful in the following ways:
- Search-based Automation: All the data collected by the Opscode Platform is indexed and searchable. Users can dynamically query this data from within Chef recipes to configure services that require complex configuration.
- Role-based Access Control: The data index has an access control system enabling administrators to centrally manage the level of infrastructure access.
- Portability: The data stored on the Opscode Platform serves as a virtual blueprint of a given infrastructure, making it much easier to create perfect clones of a production environment.
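The search-based automation described in the list above can be pictured with a small plain-Ruby sketch. It does not use the Opscode Platform or Chef’s real search interface. The node records, the search_nodes helper and the db_hosts setting are all hypothetical, chosen only to show how a configuration value might be derived from indexed node data instead of being hard-coded.

```ruby
# Hypothetical node data, shaped like the attributes servers publish
# (IP address, OS and so on). All values are made up for illustration.
NODES = [
  { name: "db1",  ip: "10.0.0.11", role: "database", os: "ubuntu" },
  { name: "web1", ip: "10.0.0.21", role: "web",      os: "ubuntu" },
  { name: "db2",  ip: "10.0.0.12", role: "database", os: "centos" }
].freeze

# Return every node whose attributes match all of the given criteria.
def search_nodes(nodes, criteria)
  nodes.select { |node| criteria.all? { |key, value| node[key] == value } }
end

# Build a config line for a web application from whichever database
# nodes are currently present in the data.
def database_hosts_line(nodes)
  ips = search_nodes(nodes, { role: "database" }).map { |node| node[:ip] }
  "db_hosts = #{ips.join(',')}"
end
```

The point of the design is that adding or removing a database node changes the generated configuration the next time it is built, without anyone editing the web server’s settings by hand.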
The Opscode Platform is in a free beta release for the next 60 days. After the trial period, participants can manage up to 20 nodes on the Platform for $50 per month and $5 per month for each additional node. Pricing and availability information is available on their website.
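As a quick worked example of the pricing above, the sketch below computes a monthly bill in Ruby. It assumes the $50 is a flat fee that covers the first 20 nodes (including accounts with fewer than 20) and that each node beyond 20 adds $5. The constant and method names are mine, not Opscode’s.

```ruby
# Figures taken from the pricing described above.
INCLUDED_NODES = 20  # nodes covered by the base price
BASE_PRICE     = 50  # dollars per month
PER_EXTRA_NODE = 5   # dollars per month for each node beyond the included ones

# Monthly cost in dollars for a given node count, assuming the base
# price is a flat fee even when fewer than 20 nodes are managed.
def monthly_cost(nodes)
  raise ArgumentError, "node count must not be negative" if nodes.negative?
  extra = [nodes - INCLUDED_NODES, 0].max
  BASE_PRICE + extra * PER_EXTRA_NODE
end
```

Under these assumptions, 35 nodes would cost 50 + 15 × 5 = $125 per month.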
Opscode, The Company
Opscode was founded in 2008 by Jesse Robbins and Adam Jacob, both experienced web operations leaders. Since then they have recruited a top-notch team including cloud and systems management expert John Willis; Adrian Cole, leader of the jclouds project; and Christopher Brown, a founding member, architect and lead developer for Amazon.com’s Elastic Compute Cloud (EC2).
Opscode today announced that it has closed an $11 million Series B round of funding. The round was led by Battery Ventures (which was also an investor in BladeLogic before it was acquired by BMC) and includes a follow-on investment from Draper Fisher Jurvetson (DFJ), whose other open source investments include SugarCRM and Fonality. DFJ led Opscode’s Series A round of $2.5 million, bringing the total amount raised for the company to $13.5 million, a sizable amount of capital to bring this technology to market.
Opscode is also seeing good adoption of Chef. Not only do they have a few thousand active users on their wiki, but Chef is currently in production at numerous top websites, including 37Signals, Etsy, IGN Entertainment, Scribd and Wikia. Web jockeys are not the only ones using Chef; other large infrastructure providers are contributing to the project. Engine Yard, Rackspace, RightScale and the SpringSource division of VMware have signed on to contribute. They are even being very public about it, as seen in this endorsement:
“We are excited about the open source contributions the SpringSource Division of VMware has made to Opscode Chef,” said Javier Soltero, CTO of SpringSource Management Products at VMware. “Chef is an important tool for automating infrastructure management and we look forward to its continued growth and success.”
Making Sysadmins into Superheroes
Opscode Chef is a hugely powerful tool that can greatly amplify the knowledge and effectiveness of systems administrators by automating a significant number of their maintenance tasks, improving their productivity and allowing them to focus on higher-value tasks. Chef provides a framework not only for building systems but also for repairing them, keeping availability high and time to resolution low. This gives IT professionals a lot of leverage in getting their tasks done, allowing them to solve a problem once and then automate the process going forward. In other words, Chef can turn systems administrators into superheroes by vastly improving their productivity and overall quality of service.
The need that Opscode addresses can be filled to some degree through other software. Cfengine, Puppet and bcfg2 are all open source software solutions that address server configuration needs and have been around for some time. As mentioned above, there are also large management suites that handle the same problems, though they are expensive and have their own limitations. What is unique about Opscode’s approach is that they offer a robust, fully featured software platform as open source and a commercial offering that has full compatibility with the open source project.
This is somewhat unusual, as many commercial open source projects have a specific feature set that is only available to their enterprise customers. In a conversation with Opscode VP of Service John M. Willis, we discussed those users who are not interested in the Opscode Platform but still want commercial support. He said that Opscode is building a select, high-quality partner network that can handle these requests. Most recently Opscode announced a partnership with DTO Solutions, which employs members of the ControlTier project and is a big proponent of the DevOps approach to infrastructure management. Other partners will soon be on board as well.
I am very much a fan of Opscode and their approach, though my description probably doesn’t do it justice. In theory, a systems administrator who successfully implements automation tools such as Chef can improve not only their own productivity but also the uptime of servers, by shortening the time to resolution for outages. Opscode has a great opportunity thanks to a talented team, a novel go-to-market plan and a real need for these types of tools among IT professionals.