Table of contents

What is Chef
Basic Chef Architectural Patterns
Basic Chef Concepts
Chef Two-Pass Model
- Load
- Compile
- Converge
- Cleanup

What is Chef

An enterprise-level configuration management system.
It can be described as a desired-state (sometimes called high-state) configuration management system.
We define the desired state of the system in a declarative way.
Chef makes sure that the system converges to the desired state.
Other systems that follow the same pattern are:
- SaltStack
- Puppet
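
The declarative idea can be illustrated with a minimal plain-Ruby sketch. This is not Chef code: the converge helper and the state hashes are hypothetical names used only for illustration. It shows that applying the same desired state twice changes something only the first time.

```ruby
# Hypothetical sketch of desired-state convergence (not the Chef API).
# `desired` declares what the system should look like; `current` is what it
# looks like now. Only the keys that differ are changed.
def converge(current, desired)
  changes = []
  desired.each do |key, wanted|
    next if current[key] == wanted   # already in the desired state: do nothing
    changes << key
    current[key] = wanted            # bring this item to the desired state
  end
  changes
end

desired = { 'yum-cron' => :running, 'ntpd' => :running }
current = { 'yum-cron' => :stopped, 'ntpd' => :running }

first_run  = converge(current, desired)
second_run = converge(current, desired)

puts "first run changed:  #{first_run.inspect}"
puts "second run changed: #{second_run.inspect}"
```

The second run reports no changes, which is the property that makes regular converges safe to repeat.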

Basic Chef Architectural Patterns

Chef can be used in two ways:

Standalone mode (Chef Solo, Chef Zero)
Server/client mode

Chef Zero Mode

Uses an in-memory Chef server.
Is started by calling the chef-zero utility, or implicitly by running chef-client in local mode (chef-client --local-mode).
Reads all configuration-related files from disk and loads them into memory.
After that, we can treat chef-zero as an in-memory Chef server.
Is used when networking or capacity constraints rule out a proper Chef server.
Is also used when we want a standalone, easily redistributable package that can auto-configure a system.

This method can be used to create installers or standalone deployments (e.g., used by GitLab CE).
It is a lot simpler than using a fully fledged Chef Server setup.
Chef-zero runs do not enforce the desired state continuously: there are no scheduled converges, and the system converges only when chef-zero itself is executed.

Chef Solo/Zero Execution

A chef-solo/zero run can be started with a script such as the following:

#!/usr/bin/env bash
# The first argument ($1) is the JSON file that holds the run list.
chef-solo -j "$1" -c solo.rb --recipe-url cookbooks.tgz

Nowadays, chef-solo behaves effectively the same as chef-zero, which is its successor.

The script expects a JSON file as its argument that defines a run list:

{
  "run_list": ["role[tp_redeploy_server]"]
}
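
To make the structure of that file concrete, here is a small plain-Ruby sketch that reads such a JSON document and splits each run-list entry into its type and name. The parse_run_list helper is a hypothetical name, not part of Chef; treating a bare entry as a recipe is an assumption of this sketch.

```ruby
require 'json'

# Split run-list entries such as "role[web]" or "recipe[nginx::default]"
# into [type, name] pairs. Assumption: a bare entry is treated as a recipe.
def parse_run_list(json_text)
  JSON.parse(json_text).fetch('run_list').map do |entry|
    match = entry.match(/\A(role|recipe)\[(.+)\]\z/)
    match ? [match[1], match[2]] : ['recipe', entry]
  end
end

node_json = '{ "run_list": ["role[tp_redeploy_server]"] }'
p parse_run_list(node_json)
```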

Chef Server/Client Mode

The main use case of Chef.
An enterprise-level architecture consisting of one or more Chef servers and multiple Chef nodes. Chef supports a high-availability master architecture.
All of our infrastructure systems can and should be registered as Chef nodes.
The chef-client utility is installed on the nodes during Chef bootstrapping.

Chef Client Converge Cycle

Chef-client is executed regularly (every 30 minutes by default), ensuring that the system is always in the desired state.
Each node is associated with a run list defining the state of the node.
The Chef server holds all artifacts, cookbooks, roles, environments, and policies, as well as the state of all executions and all nodes.
During a converge, chef-client queries the Chef server for the desired state and modifies any resources that do not conform to that definition.
At the end of the converge cycle, chef-client submits the full current state of the node to the Chef server.

Basic Chef Concepts

One of the most basic Chef artifacts is a cookbook. It combines multiple other concepts such as recipes, resources, attributes, templates, etc.

Another basic Chef concept is the Chef workstation; we will provide examples for both.

Chef Cookbook

A cookbook is a collection of recipes and the files that support them (attributes, templates, custom resources, and so on). It has a specific filesystem layout, and a new one can be created by using the command:

knife cookbook create cookbook-name

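The command creates a directory tree with a number of files and folders, not all of which are necessary. We will explain some of them shortly, and the knife command in more detail in the coming sections. In newer Chef releases, knife cookbook create has been superseded by chef generate cookbook cookbook-name.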

Chef Resource

A resource is an abstract concept. It is a statement of configuration policy that:

Is written in a Ruby-based DSL.
Describes the desired state for a configuration item.
Declares the steps needed to bring that item to the desired state.
Specifies a resource type, such as package, template, or service.
Lists additional details (also known as resource properties), as necessary.
Is grouped into recipes, which describe working configurations.

Chef LWRP - Lightweight Resources and Providers

A custom resource:

Is a simple extension of Chef that adds your own resources.
Is implemented and shipped as part of a cookbook.
Follows easy, repeatable syntax patterns.
Effectively leverages resources that are built into Chef and/or custom Ruby code.
Is reusable in the same way as resources that are built into Chef.
Has its code saved in the resources and providers folders of a cookbook.
A lightweight custom resource does not implement both a resource and a provider. More often than not, it implements only a resource that wraps another resource.

Chef Recipe

A recipe consists of a collection of resource blocks. The recipes are saved in the recipes directory of the cookbook. They can use either built-in Chef resources or custom resources:

log 'create yum-cron config file'
cookbook_file '/etc/sysconfig/yum-cron' do
  mode '0644'
  owner 'root'
  source 'yum-cron-sysconfig-config'
end

log 'enable and start yum-cron service'
service 'yum-cron' do
  action [:enable, :start]
end

This recipe creates the yum-cron configuration file on a CentOS system and then enables and starts the service.

Chef Data Bag

A data bag is a global variable that is stored as JSON data and is accessible from a Chef server. A data bag is indexed for searching and can be loaded by a recipe or accessed during a search.

Data bags provide runtime information that recipes and integration scripts can access (e.g., passwords, keys, etc.). Although data bags are global, they can also be used to store environment-specific information.

Data bag searches are not supported when Chef is executed in isolation using chef-solo. Chef-solo can load a data bag from a disk file, though. This is one of the differences between chef-zero and chef-solo.
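
The difference between loading an item and searching can be sketched in plain Ruby. The DataBagSketch class, the item contents, and the field names below are all hypothetical and only stand in for JSON items; this is not the Chef data bag API.

```ruby
require 'json'

# Hypothetical in-memory stand-in for a data bag: a named collection of
# JSON items, each identified by its "id" field.
class DataBagSketch
  def initialize(json_items)
    @items = json_items.map { |text| JSON.parse(text) }
  end

  # Load one item by id, similar in spirit to loading an item in a recipe.
  def item(id)
    @items.find { |item| item['id'] == id }
  end

  # Very small "search": return the items whose field equals the value.
  def search(field, value)
    @items.select { |item| item[field] == value }
  end
end

USERS = DataBagSketch.new([
  '{ "id": "deploy",  "environment": "prod", "shell": "/bin/bash" }',
  '{ "id": "jenkins", "environment": "dev",  "shell": "/bin/sh" }'
])

puts USERS.item('deploy')['shell']
puts USERS.search('environment', 'dev').map { |i| i['id'] }.inspect
```

Loading by id needs only the item itself, which is why it can work from a file on disk, while a search needs an index over all items.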

Chef Attributes

Cookbook attributes are defined in files inside the attributes directory, most commonly default.rb. They provide the default values that recipes can use, and they can be overridden at runtime from a number of different places.

default[:jenkins][:mirror] = 'http://mirrors.jenkins-ci.org'
default[:jenkins][:package_url] = 'http://pkg.jenkins-ci.org'
default[:jenkins][:java_home] = ENV['JAVA_HOME']

default[:jenkins][:server][:home] = '/var/lib/jenkins'
default[:jenkins][:server][:user] = 'jenkins'

case node[:platform]
when 'debian', 'ubuntu'
  default[:jenkins][:server][:group] = 'nogroup'
else
  default[:jenkins][:server][:group] = node[:jenkins][:server][:user]
end
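
How an override from another place wins over a cookbook default can be sketched as a deep merge. This is a simplified plain-Ruby illustration: deep_merge and resolve_attributes are hypothetical helpers, and only three precedence levels are modelled, which is fewer than Chef actually has.

```ruby
# Recursively merge `higher` on top of `lower`; nested hashes are merged,
# any other value from `higher` replaces the one in `lower`.
def deep_merge(lower, higher)
  lower.merge(higher) do |_key, low, high|
    low.is_a?(Hash) && high.is_a?(Hash) ? deep_merge(low, high) : high
  end
end

# Simplified precedence: default < normal < override.
# Assumption: the real precedence list has more levels than shown here.
def resolve_attributes(default:, normal: {}, override: {})
  [default, normal, override].reduce { |merged, level| deep_merge(merged, level) }
end

cookbook_defaults = { jenkins: { server: { home: '/var/lib/jenkins', user: 'jenkins' } } }
role_override     = { jenkins: { server: { home: '/data/jenkins' } } }

p resolve_attributes(default: cookbook_defaults, override: role_override)
```

The override replaces only the home value; the user value still comes from the cookbook default.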

Chef Templates

A cookbook template is an Embedded Ruby (ERB) template that is used to dynamically generate static text files. Templates may contain Ruby expressions and statements and are a great way to manage configuration files. We can use the template resource to add cookbook templates to recipes. Template files are placed in a cookbook’s /templates directory.

template '/etc/security/limits.conf' do
  source 'limits.erb'
  mode '0644'
  owner 'root'
  group 'root'
end

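The template file itself is plain text with embedded Ruby. A separate example, an nginx virtual host for Jenkins, looks like this:

# nginx jenkins application vhost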
#
<% if @www_redirect -%>
server {
<% @listen_ports.each do |port| -%>
  listen <%= port %>;
<% end -%>
  server_name www.<%= @host_name %>;
  rewrite ^/(.*) http://<%= @host_name %>/$1 permanent;
}
<% end -%>
server {
<% @listen_ports.each do |port| -%>
  listen <%= port %>;
<% end -%>
  server_name <%= @host_name %><% @host_aliases.each do |a| %> <%= a %><% end %>;
  client_max_body_size <%= @max_upload_size %>;
  location / {
    proxy_pass http://127.0.0.1:<%= node[:jenkins][:server][:port] %>;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
  }
  error_log <%= node[:nginx][:log_dir] %>/jenkins-error.log;
  access_log <%= node[:nginx][:log_dir] %>/jenkins-access.log;
}
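
The way such a template turns into a static file can be tried outside Chef with Ruby's standard ERB library. The sketch below is an approximation of what the template resource does, not its implementation; TemplateContext, render_template, and the shortened vhost text are hypothetical.

```ruby
require 'erb'

# Holds template variables as instance variables so the template can use
# @name, mirroring how the vhost template above refers to @listen_ports.
class TemplateContext
  def initialize(variables)
    variables.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  def template_binding
    binding
  end
end

# trim_mode '-' makes a closing "-%>" swallow the newline that follows it.
def render_template(template, variables)
  context = TemplateContext.new(variables)
  ERB.new(template, trim_mode: '-').result(context.template_binding)
end

VHOST = <<~'TEMPLATE'
  server {
  <% @listen_ports.each do |port| -%>
    listen <%= port %>;
  <% end -%>
    server_name <%= @host_name %>;
  }
TEMPLATE

puts render_template(VHOST, listen_ports: [80, 8080], host_name: 'ci.example.com')
```

Each port produces one listen line, and the loop lines themselves leave no blank lines behind because of the -%> endings.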

Chef Cookbook Dependencies Management

We can use resources defined in different cookbooks. This approach creates a web of dependencies between cookbooks, as we would expect in any Java or C# project. Manual management of these dependencies is quite difficult, so the community developed a tool that resolves and downloads the dependencies in an automated way. That tool is called Berkshelf.

A cookbook's dependencies must be declared in its metadata file (metadata.rb); otherwise, they cannot be referenced in recipes:

name             'di_jenkins'
maintainer       'Nick Apostolakis'
maintainer_email '[email protected]'
license          'Apache 2.0'
description      'Installs and configures Jenkins CI server & slaves'
long_description IO.read(File.join(File.dirname(__FILE__), 'README.md'))
version          '0.1.39'

recipe 'install_di_non_prod_repos', 'install the di non prod repos in the jenkins slave'
recipe 'secondary_jenkins_users', 'create the jenkins users'
recipe 'activate_yum_cron', 'activate yum_cron'
recipe 'create_sbt_setup', 'install sbt on the slave'

%w(java perl).each do |cb|
  depends cb
end
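
The ordering part of dependency resolution can be sketched as a depth-first walk over the depends declarations. This is not how Berkshelf is implemented; it ignores version constraints and downloading entirely. Only the di_jenkins entry comes from the metadata above; the other graph entries and the resolve_order helper are made up for illustration.

```ruby
# Hypothetical dependency graph: cookbook name => cookbooks it depends on.
DEPENDENCIES = {
  'di_jenkins'      => %w(java perl),
  'java'            => %w(),
  'perl'            => %w(build-essential),
  'build-essential' => %w()
}.freeze

# Depth-first walk that lists every dependency before the cookbook needing it.
def resolve_order(cookbook, graph, resolved = [], visiting = [])
  return resolved if resolved.include?(cookbook)
  raise "dependency cycle at #{cookbook}" if visiting.include?(cookbook)
  raise "unknown cookbook #{cookbook}" unless graph.key?(cookbook)

  visiting.push(cookbook)
  graph[cookbook].each { |dep| resolve_order(dep, graph, resolved, visiting) }
  visiting.pop
  resolved << cookbook
end

puts resolve_order('di_jenkins', DEPENDENCIES).inspect
```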

Chef Run List

A run list defines all of the information necessary for Chef to configure a node into the desired state. A run list is:

An ordered list of roles and/or recipes that are run in the exact order defined in the run list; if a recipe appears more than once in the run list, the chef-client will not run it twice.
Always specific to the node on which it runs; nodes may have a run list that is identical to the run list used by other nodes.
Stored as part of the node object on the Chef server.
Maintained using knife, and then uploaded from the workstation to the Chef server, or maintained using the Chef management console.
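
The ordering and the run-once rule can be sketched in plain Ruby. The ROLES table and the expand_run_list helper are hypothetical; the sketch replaces each role by the entries of its own run list and then drops repeated recipes, keeping the first position.

```ruby
# Hypothetical role definitions: role name => its own run list.
ROLES = {
  'base'      => ['recipe[ntp]', 'recipe[users]'],
  'ci_server' => ['role[base]', 'recipe[java]', 'recipe[di_jenkins]']
}.freeze

# Replace roles by their recipes, recursively, keeping the original order.
def expand_run_list(run_list, roles)
  expanded = run_list.flat_map do |entry|
    if (role = entry[/\Arole\[(.+)\]\z/, 1])
      expand_run_list(roles.fetch(role), roles)
    else
      [entry]
    end
  end
  expanded.uniq   # a recipe listed twice only runs once (first position wins)
end

p expand_run_list(['role[ci_server]', 'recipe[ntp]', 'recipe[nginx]'], ROLES)
```

Here recipe[ntp] appears both through the base role and directly, but it ends up in the expanded list only once.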

Run lists are used:

In node definitions.
In role definitions.
In environment-specific run lists, which are defined inside roles.
In policy definitions.

Chef Workstation

A workstation is a computer that is used to author cookbooks. In order to interact with the Chef server and nodes, it needs the Chef Development Kit (ChefDK) installed and a set of directories with a specific structure.

The workstation is the location from which most users do most of their work, including:

Developing and testing cookbooks and recipes.
Testing Chef code.
Keeping the chef-repo synchronized with version control.
Configuring organizational policy, including defining roles and environments, and ensuring that critical data is stored in data bags.
Interacting with nodes as (or when) required, such as performing a bootstrap operation.

Chef Roles

A role is a way to define certain patterns and processes that exist across nodes in an organization as belonging to a single job function.

Each role consists of zero (or more) attributes and a run list. Each node can have zero (or more) roles assigned to it.

When a role is run against a node, the configuration details of that node are compared against the attributes of the role, and then the contents of that role’s run list are applied to the node’s configuration details.

When chef-client runs, it merges the node's own attributes and run list with those contained within each assigned role.

Chef Environment

An environment is a way to map an organization’s real-life workflow to what can be configured and managed when using Chef server.

Every organization begins with a single environment called the _default environment, which cannot be modified (or deleted). Additional environments can be created to reflect each organization’s patterns and workflow.

Chef Policies

Policies combine the best parts of roles, environments, and client-side dependency resolvers such as Berkshelf into a single, easy-to-use workflow.
By using policies, we can apply a specific set of cookbooks to a node or nodes with a single document.
Policies are versioned and can be applied per environment.
An alternative to policies that is used in large deployments is the environment cookbook pattern. We will discuss it in the future.
In a larger setup, using policies is the way forward.

Chef Two-Pass Model

A Chef run does not take place in one step. It is separated into a number of phases:

Load
Compile
Converge
Cleanup

Load

During the load phase, chef-client syncs all the needed cookbooks with the Chef server, if one is being used. The five types of support files are then loaded in the following order:

libraries/
attributes/
resources/
providers/
definitions/

Compile

The compile phase is the first of the execution phases. The goal of the compile phase is to go from recipe source code to in-memory representations of resource objects.

At this point, chef-client takes the node’s run list and fully expands it, so any roles are replaced by their constituent recipes until all we have is an ordered list of recipes to run.

Chef-client will run each of those recipes in order, and if no errors are raised, it will move on to the next phase.

Depending on the syntax used in a recipe, some pieces of code are evaluated during the compile phase and others during the converge phase. Usually, pure Ruby code is executed during the compile phase, while the actions of resources declared with the Chef DSL are executed during the converge phase. This is one of the most common sources of confusion when coding with Chef.
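
The timing difference can be reproduced with a small plain-Ruby stand-in. TwoPassSketch and its methods are hypothetical and much simpler than chef-client; the point is only that declaring a resource records it, while plain Ruby in the recipe runs immediately.

```ruby
# Hypothetical stand-in for the two-pass model (not the Chef API).
class TwoPassSketch
  attr_reader :log

  def initialize
    @log = []
    @resource_collection = []
  end

  # Declaring a resource only records it; the block is kept for later.
  def resource(name, &action)
    @log << "compile: declared #{name}"
    @resource_collection << [name, action]
  end

  # Pass 1: evaluate the recipe. Plain Ruby in the recipe runs right now.
  def compile(&recipe)
    instance_eval(&recipe)
  end

  # Pass 2: run the recorded actions in declaration order.
  def converge
    @resource_collection.each do |name, action|
      @log << "converge: running #{name}"
      action.call
    end
  end
end

RUN = TwoPassSketch.new
RUN.compile do
  resource('write config') { log << 'config written' }
  log << 'plain Ruby executed'   # runs during compile, before any action
  resource('start service') { log << 'service started' }
end
RUN.converge
puts RUN.log
```

Although the plain Ruby line sits between the two resources in the recipe, it is logged before either action runs.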

Converge

Once the compile phase completes, we have a resource collection that is fully loaded and ready to go. This is an array of resource objects that represent the data from our recipes. The majority of the converge phase can be seen as:

resource_collection.each do |resource|
  resource.run_action(resource.action)
end

There is a little more complexity to deal with things like notifications, but that is the heart of it: loop over each resource and run the requested action. This is where provider classes get used; run_action creates a provider instance internally and runs the action code from the provider. Those methods are what actually do all the interesting things like writing files, installing packages, etc.
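
The extra complexity that notifications add can be sketched as a queue that is processed after the main loop. This is a simplified illustration with hypothetical names (SketchResource, converge_with_notifications) and it models only delayed notifications; whether a resource changes anything is given as a flag instead of being detected.

```ruby
# Hypothetical resource: a name, whether its action changes anything, and
# the name of a resource to notify when it does.
SketchResource = Struct.new(:name, :will_change, :notifies)

# Run every resource in order; queue a delayed notification when a resource
# that changed something asks for another one to be triggered afterwards.
def converge_with_notifications(resources)
  log = []
  delayed = []
  resources.each do |resource|
    log << "run #{resource.name}"
    next unless resource.will_change && resource.notifies
    delayed << resource.notifies unless delayed.include?(resource.notifies)
  end
  delayed.each { |target| log << "notified #{target}" }
  log
end

RESOURCES = [
  SketchResource.new('template[nginx.conf]', true,  'service[nginx]'),
  SketchResource.new('template[vhost.conf]', true,  'service[nginx]'),
  SketchResource.new('package[nginx]',       false, 'service[nginx]')
].freeze

puts converge_with_notifications(RESOURCES)
```

Two changed templates notify the same service, yet the service is triggered once, after all resources have run.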

Cleanup

With the converge phase finished, we just have a few cleanup steps left to process. This includes things like running handler plugins, saving the node state back up to the Chef server, and sending data to the Chef Analytics server if one is being used.