
What are Data Bags?

Data bags provide arbitrary stores of globally available JSON data.

Data Bags are not directly associated with Node or Role attributes. When using Data Bags with a chef server, they are stored on the server and indexed for searching. When using Data Bags with chef-solo, data bags are stored in a directory hierarchy on the machine running chef-solo.

Recipes can load data bags directly, or search a data bag for specific values in much the same way as they search for attributes in the node index.

Store Data Bags In Source Control

If you clone the Chef Repository from Opscode, there is a directory called data_bags. You can create a directory for each bag, and put the JSON file for each item in that bag's directory. Keeping data bags in this layout makes them easier to manage in source control and easier to understand and use. You can then load the files onto the server with the knife data bag from file sub-command.

Data Bag From File

An individual data bag can refer to either a sub-directory of the chef-repo/data_bags directory or a .json file in that directory or one of the sub-directories. From the earlier example of admins:

Either create the new data bag item as a JSON file in the admins directory or, if the item already exists on the Chef Server, write it out to a file.

Or:

Here is an example of what a complete data bag directory structure could look like:

The directory structure is used by knife as a convenience so you don't have to type the full path when using the data bag from file sub-command.

The data bag itself provides a container of related items. For example, admins and db_users in your output would be separate bags. The "standard_packages" and "global_shell_settings" are things that might be under a "common" bag or whatever makes sense to you for grouping.

Each item in the bag gets its own JSON file. The only structural requirement of items is that they have an id.
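
As a rough illustration of that requirement, the following Ruby sketch builds one item for the admins bag and renders it as JSON. Only the id key is required by Chef; the other keys (uid, gid, shell) and all of the values are made up for this example.

```ruby
require 'json'

# Hypothetical item for the "admins" bag. Only "id" is required;
# every other key and value here is illustrative.
item = {
  'id'    => 'charlie',
  'uid'   => 1005,
  'gid'   => 'ops',
  'shell' => '/bin/zsh'
}

# Guard against the one structural mistake an item can make.
unless item['id'].is_a?(String) && !item['id'].empty?
  raise ArgumentError, 'data bag item needs an id'
end

# This string is what would be saved as data_bags/admins/charlie.json.
item_json = JSON.pretty_generate(item)
puts item_json
```

Naming the file after the item's id (charlie.json for the item charlie) keeps the directory easy to scan, although the sketch above does not depend on it.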

Then when using the knife command to load the data bag item from the JSON file, you don't have to specify the path, just the name of the bag and the filename of the item. Knife's data bag from file automatically knows how to find the directory and the file under data_bags.

The command:

Loads the file:

(Note: the bag "BAG_NAME" must exist already.)

If you are not in the root directory for chef, you can also specify the path with this command:



Deploying from your private repo using Data Bags?

Don't overlook the deploy_key option. It lets you add the private deployment key to the data bag.

The key is the private key used to deploy from your private git repository. It needs to be stored as a single string, with each newline converted to "\n".
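
One way to produce such a string is to let a JSON library do the escaping. The sketch below is only an illustration: the key text is a placeholder rather than a real key, and the item id and the deploy_key field name are assumptions chosen for the example.

```ruby
require 'json'

# Placeholder key text, NOT a real key. In practice you would read the
# deploy key from disk, for example File.read('/path/to/deploy_key').
private_key = "-----BEGIN RSA PRIVATE KEY-----\n" \
              "MIIEexample\n" \
              "-----END RSA PRIVATE KEY-----\n"

# Hypothetical item layout: the id and the "deploy_key" field name are
# assumptions made for this sketch.
item = { 'id' => 'my_app', 'deploy_key' => private_key }

# JSON.generate escapes each newline as the two characters \n, so the
# whole key ends up on a single line inside the JSON string.
item_json = JSON.generate(item)
puts item_json

# Parsing the JSON turns the escapes back into real newlines.
restored = JSON.parse(item_json)['deploy_key']
```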


Command Output Change in 0.10

Prior to version 0.10, knife data bag output was a JSON representation of the relevant information. JSON output can still be obtained in 0.10 and above by passing the -Fj command line option to knife.

Managing Data Bags With Knife

Data bags can be managed with the knife command line client. See Managing Data Bags With Knife to accomplish this.

Using Data Bags in Recipes

Data bags can be loaded by name using the DSL in a recipe, or can be accessed via the search indexes. When using a single data bag item, use the direct method whenever possible, as it avoids the search index entirely and so has lower overhead. On the other hand, if you plan to loop over all of the items in a data bag, using the search interface will cause Chef to bulk load the items, resulting in lower overhead.

Loading a Data Bag Directly

The recipe DSL provides access to data bags via the data_bag(BAG) method and access to data bag items via the data_bag_item(BAG, ITEM) method.

The data_bag method returns an array of the keys of the items that belong to the data bag. For the admins data bag shown in the above examples, we'd use the data_bag method like this:

To get the contents of the 'charlie' data bag item, we use the data_bag_item method. Note that each time you use data_bag_item in a recipe, Chef makes an API call to the server to load the item.
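
The two calls can be sketched as follows. So that the snippet runs outside chef-client, it defines in-memory stand-ins for data_bag and data_bag_item; in a real recipe the DSL already provides both methods and the stand-ins must be left out. The sample data (the item bob and the uid values) is invented for the illustration.

```ruby
# Sample data for the stand-ins below; "bob" and the uid values are made up.
def sample_bags
  {
    'admins' => {
      'charlie' => { 'id' => 'charlie', 'uid' => 1005, 'gid' => 'ops' },
      'bob'     => { 'id' => 'bob',     'uid' => 1006, 'gid' => 'ops' }
    }
  }
end

# Stand-in for the DSL method: returns the ids of the items in a bag.
def data_bag(bag)
  sample_bags.fetch(bag.to_s).keys
end

# Stand-in for the DSL method: returns the contents of one item.
def data_bag_item(bag, item)
  sample_bags.fetch(bag.to_s).fetch(item.to_s)
end

# --- what the recipe itself would contain ---
admin_ids = data_bag('admins')                 # array of item ids
charlie   = data_bag_item('admins', 'charlie') # contents of one item

puts admin_ids.inspect
puts charlie['gid']
```

Because each data_bag_item call is a separate request to the server, a recipe that needs the same item several times may prefer to store the result in a local variable, as done with charlie here.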

A More Complete Example

If we want to create a user on our systems for each admin in our admins data bag, we might do something like this:

Example Admins Recipe default.rb

Loading Data Bags through the Search Interface

In some situations you may not know which data bag items you want information from in advance, or you may want to load all of the items in the data bag. In these cases you can use the search index to find the data you want. When searching for data bag items, you provide the name of the data bag to search and the query string as arguments to the search(BAG, QUERY_STRING) method. Using the admins data bag, you can search for items based on any value, for example:

find every admin in the admins data bag
search for an admin with the id "charlie"
search for admins with a gid of "ops"
search for admins with an id beginning with the letter 'c'

Note that even though the search is returning the data bag items as Chef::DataBagItem objects, you can use these objects just like they were a Hash:

Use DataBagItem Objects Like Hashes

Here's the same "create a user for each admin" recipe as shown above, modified to load the admins using the search interface:

Example Admins Recipe default.rb

Creating and Editing Data Bags within a Recipe

Danger!
If two clients are simultaneously attempting to update a data bag, the data written last wins. This can lead to data loss if the user is not careful about multiple clients writing to data bags concurrently.

Altering data bags from the node when using the Open Source chef-server requires giving the node's API client admin privileges. In most cases, this is not advisable.

Please use the methods in this section with extreme care.

Data bags can be created within recipes using a variety of methods. The following examples describe how to create and edit data bags and data bag items within recipes by directly using the methods provided by the Chef::DataBag and Chef::DataBagItem objects.

Creating a Data Bag within a Recipe
Creating a Data Bag Item within a Recipe
Editing a Data Bag Item within a Recipe

Data Bags and Environments

Data Bags are global and can be accessed by nodes in any environment. For details on strategies that you can use to provide per-environment values in data bags, see Data Bags and Environments on the Environments page.

Using Data Bags with Chef Solo

Chef 0.10.4+ Only
Since 0.10.4, data bags are available for use in Chef Solo as well as with the Chef Server.

Data bags can also be used by chef-solo. Chef Solo loads data bags from a directory structure on the local file system. The location of the data bag directory is configurable with the data_bag_path configuration option in solo.rb. Each subdirectory within data_bag_path corresponds to a data bag. Each JSON file within a data bag directory corresponds to a data bag item. This is the same convention used by knife and described in the Data Bag From File section.
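
The layout convention can be sketched in plain Ruby. The load_data_bags helper below is hypothetical and only mirrors the directory convention described above; it is not chef-solo's own loader. The temporary directory plays the role of whatever path data_bag_path is set to in solo.rb.

```ruby
require 'json'
require 'tmpdir'
require 'fileutils'

# Read a data_bag_path-style directory into
# { bag_name => { item_id => item_hash } }.
# Hypothetical helper: it mirrors the layout convention only.
def load_data_bags(data_bag_path)
  bags = {}
  Dir.children(data_bag_path).sort.each do |bag_name|
    bag_dir = File.join(data_bag_path, bag_name)
    next unless File.directory?(bag_dir) # each subdirectory is one bag

    bags[bag_name] = {}
    Dir.glob(File.join(bag_dir, '*.json')).sort.each do |path|
      item = JSON.parse(File.read(path)) # each JSON file is one item
      bags[bag_name][item.fetch('id')] = item
    end
  end
  bags
end

# Build a throwaway tree so the sketch is self-contained.
loaded = Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'admins'))
  File.write(File.join(root, 'admins', 'charlie.json'),
             JSON.generate({ 'id' => 'charlie', 'gid' => 'ops' }))
  load_data_bags(root)
end

puts loaded.keys.inspect
puts loaded['admins']['charlie']['gid']
```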

Since search is not available in recipes run with chef-solo, you must use the data_bag() and data_bag_item() functions to access data bags and items within a recipe.

Search using Chef Solo
A community member, Edelight, created a chef-solo search implementation for data bags. This functionality is provided via a cookbook that can be found on GitHub:

https://github.com/edelight/chef-solo-search







  1. Mar 04, 2010

    Right now (0.8.4), only clients with admin access can write to data bags.

    1. Mar 08, 2010

      This is different to the platform behavior

  2. Mar 31, 2010

    Very cool idea. I think this may help solve some of the issues I've had trying to use a finite set of node attributes to set up N resources on a node (e.g. multiple vhosts).

    Question: given a cookbook that uses databags – how do I know what to put into my bag? Right now it seems that performing a by-hand code inspection of the cookbook might be the only way?

  3. Apr 16, 2010

    I added a bit of code to the chef-repo Rakefile to dump and load data bags to chef-repo, with an eye toward editing data bags remotely. Unfortunately knife does not have data bag dump or data bag load commands, so I had to improvise.

    If someone else finds this useful, here is the link: http://gist.github.com/368740

  4. Jul 16, 2010

    Should changing a data bag used in a recipe (via search) cause the recipe to run on the next client execution?

    For example, nagios::server uses a data bag "users". Will changing the contents of users cause the nagios::server recipe to regenerate files?

    1. Sep 30, 2010

      Data Bags don't have any direct impact on the configuration on the machine, they're just, er... bags of data. So the important part for determining if changing a databag (item) will change the configuration on a host is whether that will change the data being fed to a resource in a recipe. For example, if your nagios recipe has a template that loops over all of the users to set up the contact groups, then adding or removing a user will cause Chef to update the template on the next run.

  5. May 11, 2011

    Any best practices for writing and read-modify-write databags from a recipe?

    There is the example of creating a new databag from the ebs_volume example:

    But how would I do an update to that?

  6. Oct 24, 2011

    I see that data bags are now supported in Chef Solo. What about encrypted data bags? Are these supported, and if so, are there any examples of how they would be used?