
Overview

Search is a feature of the Chef Server that allows you to use a full-text search engine (based on Apache Solr) to query information about your infrastructure and applications.

Search indexes are built by the Chef Server and let you query arbitrary data about your infrastructure. You can use this service through search calls in a recipe or through the knife search command.

Most data that Chef stores is automatically indexed in Solr, including Data Bags, API Clients, Nodes, Roles, and Environments.

Search Index Names

Chef's built-in types are indexed in the following search indexes (don't worry about the search syntax for now):

Data Type      Index Name     Example Knife Search
Roles          role           knife search role "name:production*"
Nodes          node           knife search node "name:app*"
API Clients    client         knife search client "name:c*"
Environments   environment    knife search environment "*:*"

Data bags are indexed by the data bag's name. For example, to search for items in a data bag named 'bag_o_data' in knife, use a query like:

    knife search bag_o_data "FIELD:PATTERN"

The 'client' index is currently affected by a bug and will return incorrect results.

Query Syntax

Queries have the form "field:search_pattern" where "field" is a key in the JSON description of the objects being searched. The search pattern supports exact, wildcard, range, or fuzzy matching. The field name has limited support for wildcard matching.

Note: Both the field and the search pattern are case sensitive.

Field Name Syntax

Field names are keys in the JSON description of the object being searched.

To search in a nested key, for example dmi / system / product_name, insert an underscore "_" between key names, which gives the field name dmi_system_product_name.

Note that wildcard use with nested fields changed between versions 0.9 and 0.10. See Nested Fields below for details.

Discovering Key Names

The definitive list of search keys is the set of keys in the JSON description of the object being searched. To see the keys available when searching nodes, display a node as JSON, for example:

    knife node show staging -F json | less

This command will open a full JSON description of the node "staging" in the pager less. Similarly, knife data bag show, knife client show, and knife role show can be used to find the keys available for their respective objects. The JSON keys are the definitive search keys to be used in any search context (knife, recipe, Hosted Chef, etc.).

Wildcard Matching for Field Names

The field name also has limited support for wildcard matching. Both the "*" and "?" wildcards (see below) can be used within a field name; however, they cannot be the first character of the field name. For example, the following are valid queries:





Tutorials from the Community


Community member Christian Paredes put together a blog post on using Search in Recipes to gather information on the database nodes and then take action based on the defined criteria.

Exact Matches

To search for an exact match in a field, use a query of the form "field:TEXT_TO_MATCH". Be sure to quote any search patterns with spaces using double quotes. You'll still need to quote the entire query to prevent your shell or ruby from trying to interpret it. The best way to do this is to quote your query with single quotes and the search pattern with double quotes, as shown in the second example below.

Examples:
Search for a data bag item in the 'admin' data bag with the exact value of 'charlie' for the key 'id':

    knife search admin 'id:charlie'

Search for a data bag item with the exact text 'Charlie the Unicorn' for the key 'comment' (shown here against the same 'admin' data bag):

    knife search admin 'comment:"Charlie the Unicorn"'

Wildcard Matches

Change to Query Syntax in Chef 0.10
Prior to Chef 0.10, search queries could not include the wildcards "*" or "?" as the first character of the search pattern.

In Chef 0.10 and later, the "*" wildcard can be used as the first character of a search pattern (but not of a field/key name). "?" still cannot be used as the first character of the search pattern.

In Chef 0.10+, the following query will return any node in which the key foo exists:

    knife search node "foo:*"

This same behavior can be achieved in 0.9 with the following query:

    knife search node "foo:[* TO *]"

Chef searches can use the familiar * (asterisk) and ? wildcard symbols to query for substring matches. These work much as they do in the shell: the * symbol matches zero or more characters, while the ? matches exactly one character. If we have a node named 'app1.example.com', the following queries will find it (and possibly other nodes as well):
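To make the wildcard rules concrete, here is a minimal Ruby sketch that mimics how * and ? behave by translating a pattern into a regular expression. The helper wildcard_match? is a made-up name for illustration, and this is not how Chef or Solr evaluates queries internally. The patterns are sample ones that would match the node name above.

```ruby
# Illustration only: emulates the * and ? wildcard semantics with a Regexp.
# wildcard_match? is a hypothetical helper, not part of Chef or Solr.
def wildcard_match?(pattern, value)
  body = pattern.each_char.map do |ch|
    case ch
    when '*' then '.*'            # zero or more characters
    when '?' then '.'             # exactly one character
    else Regexp.escape(ch)        # everything else matches literally
    end
  end.join
  Regexp.new("\\A#{body}\\z").match?(value)
end

node_name = 'app1.example.com'
%w[app* app1*.example.com app?.example.com].each do |pattern|
  puts "name:#{pattern} matches #{node_name}? #{wildcard_match?(pattern, node_name)}"
end
```

Each of the three sample patterns matches 'app1.example.com'. A pattern such as app? does not, because ? stands for exactly one character and the pattern must cover the whole value.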

Range Search

Chef also supports range searches. To see how these work, suppose we have a data bag named 'sample' with items 'abc', 'bar', 'baz', and 'qux'. We can search for all of the items between 'bar' and 'foo', inclusive, using the search pattern [bar TO foo] ("TO" must be capitalized):

    knife search sample "id:[bar TO foo]"

To search an exclusive range, use curly braces, like this:

    knife search sample "id:{bar TO foo}"
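The difference between the two bracket styles can be sketched in plain Ruby. This is only an emulation, assuming the range is compared lexicographically on the item id; it does not call Chef or Solr. The comments show the search pattern each line corresponds to.

```ruby
# Illustration only: emulates inclusive and exclusive range matching on
# the item ids from the 'sample' data bag described above.
items = %w[abc bar baz qux]

inclusive = items.select { |id| id >= 'bar' && id <= 'foo' }  # id:[bar TO foo]
exclusive = items.select { |id| id > 'bar' && id < 'foo' }    # id:{bar TO foo}

puts inclusive.inspect  # prints ["bar", "baz"]
puts exclusive.inspect  # prints ["baz"]
```

The inclusive range keeps 'bar' because the lower bound itself is allowed, while the exclusive range drops it.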

Fuzzy Search

This search feature is currently affected by a bug and does not work.

Solr supports fuzzy matching of terms using an edit distance measure. To specify a fuzzy match, add a ~ (tilde) symbol to the end of your query term. For example, if you have an API client named 'FOO', you can find it by searching for 'BOO~', like this:

    knife search client "name:BOO~"
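For readers unfamiliar with edit distance, the sketch below computes the Levenshtein distance, one common edit distance measure, between two terms. It is an illustration of the idea only. The function name is made up, and the way Solr scores or thresholds fuzzy matches may differ.

```ruby
# Illustration only: Levenshtein edit distance, i.e. the minimum number of
# single-character insertions, deletions, or substitutions needed to turn
# one string into another. edit_distance is a hypothetical helper.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    curr = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

puts edit_distance('BOO', 'FOO')  # prints 1
```

'BOO' and 'FOO' differ by a single substitution, which is why a fuzzy search for 'BOO~' is expected to find 'FOO'.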

Joining Multiple Query Criteria with Boolean Operators

For more exacting searches you can join multiple criteria with 'AND' or 'OR' or you can negate a query with 'NOT' (all three keywords must be capitalized).

The following examples assume you have a data bag named 'sample' with items 'abc', 'foo', 'bar', 'baz', and 'qux'.

Negating with NOT

Result:

Joining Queries with OR

Results:

Joining Queries with AND

Results:
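As a rough illustration of how the three operators combine result sets, the Ruby sketch below emulates them with array operations on the item ids from the 'sample' data bag. The queries in the comments are illustrative choices, not the original examples from this page, and the code does not call Chef or Solr. File.fnmatch from the Ruby standard library stands in for wildcard matching.

```ruby
# Illustration only: emulates NOT, OR and AND with array operations.
items = %w[abc foo bar baz qux]

# Returns the ids matching a simple wildcard pattern.
def matching(items, pattern)
  items.select { |id| File.fnmatch(pattern, id) }
end

not_foo    = items - matching(items, 'foo')                    # NOT id:foo
foo_or_abc = matching(items, 'foo') | matching(items, 'abc')   # id:foo OR id:abc
b_and_z    = matching(items, 'b*') & matching(items, '*z')     # id:b* AND id:*z

puts not_foo.inspect     # prints ["abc", "bar", "baz", "qux"]
puts foo_or_abc.inspect  # prints ["foo", "abc"]
puts b_and_z.inspect     # prints ["baz"]
```

NOT removes matches from the full set, OR takes the union of two result sets, and AND keeps only the items present in both.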

Finding All

To find all of the items in an index, use an asterisk for both the field (key) name and the search pattern. For example, the following query will find all nodes:

    knife search node "*:*"

Special Characters:

The following characters have special meaning to the query parser. If you need to use them in a search pattern, escape them with a \ (backslash):

    + - && || ! ( ) { } [ ] ^ " ~ * ? : \
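When a search pattern comes from a variable, escaping by hand is error-prone. The sketch below shows one possible Ruby helper that backslash-escapes the characters the Lucene query parser treats as special. The helper name escape_search_pattern is hypothetical and is not provided by Chef.

```ruby
# Hypothetical helper: backslash-escapes the characters the Lucene query
# parser treats as special: + - && || ! ( ) { } [ ] ^ " ~ * ? : \
def escape_search_pattern(text)
  text.gsub(/&&|\|\||[+\-!(){}\[\]^"~*?:\\]/) { |match| "\\#{match}" }
end

puts escape_search_pattern('recipe[apache2::mod_proxy_ajp]')
# prints recipe\[apache2\:\:mod_proxy_ajp\]
```

Note that this escapes * and ? as well, so it is only suitable for patterns that should be matched literally, not for patterns that contain intentional wildcards.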

As previously noted, Chef's search feature uses Apache Solr, which is built on Apache Lucene. Definitive documentation for the query syntax can be found on the Solr and Lucene project documentation sites.

Nested Fields

Nested Field Syntax Changes in 0.10.0
The syntax for searching within nested fields changed in Chef 0.10.0. Instead of using 'X' as the wildcard character, you must now use '*'. In addition, you now have full wildcard expansion on the search keys. A pre-0.10.0 query written as 'network_interfaces_X_addresses:192.168' is now written as 'network_interfaces_*_addresses:192.168' or any other wildcard expansion of the key, such as 'network*addresses:192.168'.

It is quite common in Chef for data you want to be a few levels deep in a data structure. For example, information about a network interface might be in a node's attributes at node[:network][:interfaces][:eth0]. Here's how Chef's search feature handles this. Consider this snippet of ohai data from my laptop:

Before adding this node data to the indexer, Chef extracts nested fields into the top level:

So the following searches will find this node:

Chef will also flatten nested items into compound keys, like this:

So you can also find the node with the following search:

Chef also creates "wildcard" compound keys, where 'X' is the wildcard character:
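The three kinds of keys described in this section (top-level, compound, and wildcard compound) can be illustrated with a small Ruby sketch. This is a simplified model written for this page, not Chef's indexer code. The sample data, the helper names, and the choice to index the keys of a nested hash as values of their parent key are all assumptions made so that the address searches in this section have something to match.

```ruby
# Simplified illustration of the flattening described above; this is NOT
# Chef's indexer code. The sample data and helper names are hypothetical.
def each_field(data, path = [], &block)
  data.each do |key, value|
    here = path + [key.to_s]
    if value.is_a?(Hash)
      # Assumption: keys of a nested hash count as values of the parent key.
      value.each_key { |child| block.call(here, child.to_s) }
      each_field(value, here, &block)
    else
      block.call(here, value.to_s)
    end
  end
end

def index_fields(data, wildcard = 'X')
  fields = Hash.new { |hash, key| hash[key] = [] }
  each_field(data) do |segments, value|
    # Top-level key and full compound key.
    names = [segments.last, segments.join('_')]
    # Wildcard compound keys: replace one inner segment at a time.
    (1...(segments.size - 1)).each do |i|
      masked = segments.dup
      masked[i] = wildcard
      names << masked.join('_')
    end
    names.uniq.each { |name| fields[name] << value unless fields[name].include?(value) }
  end
  fields
end

node_data = {
  'network' => {
    'interfaces' => {
      'en1' => {
        'mtu' => '1500',
        'addresses' => { '192.168.0.152' => { 'family' => 'inet' } }
      }
    }
  }
}

fields = index_fields(node_data)
puts fields['addresses'].inspect                         # prints ["192.168.0.152"]
puts fields['network_interfaces_en1_addresses'].inspect  # prints ["192.168.0.152"]
puts fields['network_interfaces_X_addresses'].inspect    # prints ["192.168.0.152"]
```

In this model the same value is reachable under the short key, the fully qualified key, and the wildcard key, which is what makes the interface-independent searches in this section possible.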

These compound keys can be extremely helpful when a key you don't care about sits between two keys that you do care about. For example, to find every node with an IP address starting with 192.168, regardless of the interface name, you can search for network_interfaces_X_addresses:192.168* (in Chef 0.10 and later, write '*' in place of 'X').

You could even use a range search to find every node with an address in a given subnet:

By combining these wildcard fields with range and wildcard queries, it is possible to perform very powerful searches, such as using the vendor part of the MAC address to find every node that has a network card made by a given vendor (Broadcom, Intel, etc.).

Using Search in Recipes

The Chef DSL provides the search(INDEX, QUERY) method for accessing search indexes in recipes. Basic use of search looks like this:

If you just want to use each result of the search and don't care about the aggregate result you can provide a code block to the search method. Each result will be passed to the block in turn:
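The two calling styles can be sketched as follows. So that the snippet runs outside chef-client, it defines a stand-in search method and some made-up sample data. In a real recipe the Chef DSL supplies search, the results are node objects rather than plain hashes, and the full query syntax is available. The stand-in only understands "field:exact_value" queries.

```ruby
# Hypothetical sample data standing in for the node index.
def sample_indexes
  {
    node: [
      { 'name' => 'app1.example.com', 'platform' => 'ubuntu' },
      { 'name' => 'db1.example.com',  'platform' => 'centos' }
    ]
  }
end

# Stand-in for the Chef DSL's search(INDEX, QUERY); NOT Chef's implementation.
# It supports only "field:exact_value" and exists so the example can run.
def search(index, query)
  field, value = query.split(':', 2)
  results = sample_indexes.fetch(index, []).select { |item| item[field] == value }
  return results unless block_given?
  results.each { |item| yield item }
  true
end

# Basic use: collect every matching result, then work with the list.
ubuntu_nodes = search(:node, 'platform:ubuntu')
puts ubuntu_nodes.map { |n| n['name'] }.inspect

# Block form: each result is passed to the block in turn.
search(:node, 'platform:centos') do |n|
  puts "found #{n['name']}"
end
```

The block form is convenient when each result is handled independently, while the basic form is the better fit when the recipe needs the whole list, for example to render it into a template.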

Search for Nodes Based on Run List Entries

Find Nodes With a Given Recipe in the Run List

To find nodes with a specified recipe in the run list, search within the run_list field. Be careful: Lucene treats : and [ as special characters, so they must be escaped (also note the single quoting):

Use the recipe (singular!) keyword for searching in the top-level run list.

If you also want to interpolate variables into the search string, Ruby's alternate quoting syntax makes this easier:
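The following sketch builds two query strings for the run list search. The recipe name is a hypothetical example. The second form assumes that wrapping the value in double quotes (a Lucene phrase) makes [ and : literal, so no backslashes are needed. Treat that as an assumption to verify against your server rather than a guarantee.

```ruby
recipe_name = 'apache2::mod_proxy_ajp' # hypothetical recipe name

# Single quotes keep the backslashes literal, so Lucene sees the escapes.
escaped_query = 'run_list:recipe\[apache2\:\:mod_proxy_ajp\]'

# %Q{...} allows interpolation and unescaped double quotes in one string.
interpolated_query = %Q{run_list:"recipe[#{recipe_name}]"}

puts escaped_query       # prints run_list:recipe\[apache2\:\:mod_proxy_ajp\]
puts interpolated_query  # prints run_list:"recipe[apache2::mod_proxy_ajp]"
```

In a recipe, either string would be passed as the QUERY argument, for example search(:node, interpolated_query).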

Find Nodes with a Recipe in the Expanded Run List

0.9.8 and up
This is a new feature as of Chef 0.9.8. The change is made on the client, so you'll only be able to find nodes running chef-client 0.9.8 and greater when using this feature.

Chef saves the expanded lists of roles and recipes to the roles and recipes attributes on the node. This makes it possible to find nodes that run a given recipe even if it's included by a role (or a role-within-a-role). For example, consider a "tomcat_server" role that includes apache2::mod_proxy_ajp in its run list. A node that is in this role would be found by the following search:

    recipes:apache2\:\:mod_proxy_ajp

Use the recipes (plural! Note the extra "s"!) keyword for searching in the expanded run list. This field is updated on each run of chef-client, thus changes to the run list will not affect "recipes" until chef-client has been run on the node.

Find Nodes with a Role in the Run List

To find a node that includes a role in the top level of its run list, use a search of the form role:ROLE_NAME.

Use the role (singular!) keyword for searching in the top-level run list.

Find Nodes with a Role in the Expanded Run List

0.9.8 and up
This is a new feature as of Chef 0.9.8. The change is made on the client, so you'll only be able to find nodes running chef-client 0.9.8 and greater when using this feature.

Since Chef 0.9.8, chef-client automatically saves the expanded list of roles (i.e., all roles applied to a node, including nested roles) to the node's automatic attributes. You can find every node that has a given role by a search of the form roles:ROLE_NAME.

Use the roles (plural! Note the extra "s"!) keyword for searching in the expanded run list. This field is updated on each run of chef-client, thus changes to a node's run list will not affect "roles" until chef-client has been run on the node.

Client/Server Settings for a Database

If you don't have roles fully defined and implemented, you may find it necessary on a webserver to place the hostname, IP, or private IP of another machine in a settings file so that it can connect to a database, Solr, or other persistence server. It is assumed here that a webserver named mysqlchef has a database server named mysqlchefutil.

Consider this simplified settings file for the webserver mysqlchef1:

You can see all the information about the node with the following command:

To access it as part of a recipe that is to run on mysqlchef (the webserver):

Note that the 0 (zero) index on the db_server identifier is needed because search returns a list of results, even though searching on the node's unique name yields a single document from Solr. The identifier "private_ip" now has the value "10.40.64.202" and can be used in templates as a variable, etc.

Searching for nodes having a given recipe applied (pre 0.9.8)

If you want to handle role inclusion gracefully, you can patch the Chef Language (see this gist) to allow that kind of search:

Searching within Environments

To search within a specific environment, see Searching within Environments.

Problems and Solutions

The following are solutions to problems/questions that arise in use of the Search feature and in maintaining the Indexes.

Updates not propagating to search results

0.9.x and Lower
This discussion applies to Chef 0.8.x and 0.9.x. Read the guide for Chef Expander for tips on diagnosing and troubleshooting queue backlogs in Chef 0.10.x

Sometimes you will notice that your search results get stale. For instance, you may have a node visible from "knife node show", but "knife search node" does not show the node. The first thing to check is the size of your message queue:

A low number is expected here. If it gets past 15 or 20, and especially if it is constantly increasing, you need to add more chef-solr-indexer instances.

Adding indexers using Runit

Instructions on adding services are here: http://smarden.org/runit/faq.html. You can reuse the scripts from /etc/sv/chef-solr-indexer.

Fields missing from certain nodes

0.10.x and above
This discussion applies to Chef 0.10.x and above

If you are getting strange results from your search queries, such as certain nodes missing from searches for attributes you know exist (for example, attributes confirmed via the webui), you may have run into Solr's maxFieldLength issue described in CHEF-2346. So far this has been noted on Linux servers with complex configurations and on Windows machines, that is, anywhere with a lot of attributes and Ohai data. This is unresolved as of 0.10.8.

Search only returns IP Address of the Node, not of a specific interface

The issue is complicated because the JSON data for interfaces is structured as hierarchical attributes of the parent: the interface has an addresses attribute, and each address in turn has a family attribute. However, we tend to think of the interface as having an address of a particular family, usually IPv4 (inet). There is a tendency to want something like node[:ipaddress_eth1] or node[:ipaddress][:eth1], but these diverge from the data hierarchy. OHAI-88 is based on this issue.

Pending a decision on how best to address this circumstance, the network_addr.rb Ohai plugin, from Opscode Team Member Joshua Timberman, extends the Ohai network attribute with additional ipaddrtype_iface attributes to make it semantically easier to retrieve the addresses.

Additionally, the following code should get the ip address of a specific interface, as it is constructed consistently with how the JSON data for interfaces is structured.
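One way to write such a lookup is sketched below. A plain hash with made-up addresses stands in for the node's network attribute so the snippet runs on its own, and interface_address is a hypothetical helper name. In a recipe, the node's own network data would presumably take the place of the sample hash.

```ruby
# Plain hash standing in for the node's network attributes; the interface
# name and addresses are made-up sample values.
network = {
  'interfaces' => {
    'eth1' => {
      'addresses' => {
        '00:16:3E:00:00:01'     => { 'family' => 'lladdr' },
        '10.0.0.5'              => { 'family' => 'inet', 'netmask' => '255.255.255.0' },
        'fe80::216:3eff:fe00:1' => { 'family' => 'inet6' }
      }
    }
  }
}

# Hypothetical helper: follows interface -> addresses -> family, mirroring
# the hierarchy of the JSON data, and returns the first matching address.
def interface_address(network, iface, family = 'inet')
  addresses = network.fetch('interfaces', {}).fetch(iface, {}).fetch('addresses', {})
  address, _data = addresses.find { |_addr, data| data['family'] == family }
  address
end

puts interface_address(network, 'eth1')  # prints 10.0.0.5
```

The helper returns nil when the interface does not exist or has no address of the requested family, so callers should check the result before using it.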







