
Search Indexes

Search Indexes are a feature of the Chef Server that lets you use a full-text search engine (based on Apache SOLR) to query information about your infrastructure and applications. Most data that Chef stores in CouchDB is automatically indexed in SOLR: Data Bags, API Clients, Nodes, and Roles are all indexed.

Search Index Names

Chef's built-in types are indexed in the following search indexes (don't worry about the search syntax for now):

Data Type      Index Name   Example Knife Search
Roles          role         knife search role "name:production*"
Nodes          node         knife search node "name:app*"
API Clients    client       knife search client "name:c*"

Data bags are indexed by the data bag's name. For example, to search for items in a data bag named 'bag_o_data' with knife, use the bag's name as the index: knife search bag_o_data followed by your query.

Query Syntax

Queries have a basic form of "field:search_pattern"; in Chef the fields will be keys in the JSON data. Search patterns can be for an exact match, wildcard match, range match, or fuzzy match.

Finding All

To find all of the items in an index, use an asterisk for both the field (key) name and the search pattern. For example, knife search node "*:*" will find all nodes.

Exact Matches

To search for an exact match in a field, use a query of the form "field:TEXT_TO_MATCH". Be sure to wrap any search pattern that contains spaces in double quotes. You'll still need to quote the entire query to prevent your shell or Ruby from trying to interpret it. The best way to do this is to quote your query with single quotes and the search pattern with double quotes, as shown in the second example below.

Examples:
Search for a data bag item in the 'bag_o_data' data bag with the exact value of 'some_value' for the key 'some_field':

RESULT:

Search for a data bag item with the exact text 'quote strings with spaces':

RESULT
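The quoting rule above can be sketched in Ruby. This is only an illustration: build_exact_query is a hypothetical helper, not part of Chef, and the field name used for the second query is assumed to be 'some_field'.

```ruby
# Build a "field:pattern" query string, wrapping the pattern in double
# quotes when it contains spaces. Hypothetical helper, not a Chef API.
def build_exact_query(field, text)
  pattern = text.include?(" ") ? "\"#{text}\"" : text
  "#{field}:#{pattern}"
end

puts build_exact_query("some_field", "some_value")
puts build_exact_query("some_field", "quote strings with spaces")
```

Because the pattern carries its own double quotes, the whole query can be wrapped in single quotes when it is passed on a command line.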

Wildcard Matches

Chef searches can use the familiar * (asterisk) and ? wildcard symbols to query for substring matches. These work much as they do in the shell: the * symbol matches zero or more characters, while the ? matches exactly one character. If we have a node named 'app1.example.com', the following queries will find it (and possibly other nodes as well):
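To make the wildcard semantics concrete, here is a minimal Ruby sketch that translates a pattern into a regular expression and tests it against the node name. It models only the behaviour described above and is not how SOLR evaluates the query; wildcard_match? and the three patterns are illustrative assumptions.

```ruby
# Translate "*" (zero or more characters) and "?" (exactly one character)
# into a regular expression anchored at both ends of the value.
# Hypothetical helper for illustration only.
def wildcard_match?(pattern, value)
  body = pattern.each_char.map do |ch|
    case ch
    when "*" then ".*"
    when "?" then "."
    else Regexp.escape(ch)
    end
  end.join
  Regexp.new("\\A#{body}\\z").match?(value)
end

node_name = "app1.example.com"
%w[app* app1.example.* app?.example.com].each do |pattern|
  puts "name:#{pattern} -> #{wildcard_match?(pattern, node_name)}"
end
```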

'*' or '?' not allowed as first character in WildcardQuery

SOLR doesn't allow a wildcard character as the first character of a search pattern. This means you cannot search for 'name:*example.com' to find all the hosts in the example.com domain. In the future, Chef may provide some mechanism to work around this restriction or make it less onerous.
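If you build queries programmatically, you may want to catch this case before sending the query. The following Ruby sketch shows one way to check; leading_wildcard? is a hypothetical helper, and treating "*:*" as the one permitted exception is an assumption based on the find-all form described earlier.

```ruby
# Return true when the search pattern of a "field:pattern" query starts
# with a wildcard. Hypothetical helper, not part of Chef or SOLR.
def leading_wildcard?(query)
  return false if query == "*:*" # the find-all form is a special case
  _field, pattern = query.split(":", 2)
  return false if pattern.nil? || pattern.empty?
  ["*", "?"].include?(pattern[0])
end

puts leading_wildcard?("name:*example.com")
puts leading_wildcard?("name:app*")
```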

Range Search

Chef also supports range searches. To see how these work, suppose we have a data bag named 'sample' with items 'abc', 'bar', 'baz', and 'qux'. We can search for all of the items between 'bar' and 'foo', inclusive, using the search pattern [bar TO foo]:

The result will be something like this:

To search an exclusive range, use curly braces, like this:
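The difference between the two forms can be sketched in Ruby with plain string comparison. This assumes the range is evaluated in lexicographic order over the item names; range_match is a hypothetical helper, not Chef or SOLR code.

```ruby
# Select the ids that fall between low and high.
# inclusive: true models [low TO high]; inclusive: false models {low TO high}.
def range_match(ids, low, high, inclusive: true)
  ids.select do |id|
    if inclusive
      id >= low && id <= high
    else
      id > low && id < high
    end
  end
end

items = %w[abc bar baz qux]
puts range_match(items, "bar", "foo").inspect
puts range_match(items, "bar", "foo", inclusive: false).inspect
```

With the sample items, the inclusive range keeps 'bar' and 'baz', while the exclusive range drops the boundary value 'bar' and keeps only 'baz'.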

Fuzzy Search

It's not immediately clear how this feature is useful, but SOLR supports fuzzy matching of terms using an edit distance measure. To specify a fuzzy match, add a ~ (tilde) symbol to the end of your query term. For example, if you have an API client named 'FOO', you can find it by searching for 'BOO~', like this:
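For a sense of what "edit distance" means here, the sketch below computes the Levenshtein distance between two strings in Ruby. It only illustrates the measure: the similarity threshold SOLR applies when deciding whether a term matches is not reproduced, and edit_distance is a hypothetical helper.

```ruby
# Levenshtein distance: the minimum number of single-character insertions,
# deletions, and substitutions needed to turn a into b.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

puts edit_distance("FOO", "BOO")
```

'FOO' and 'BOO' differ by a single substitution, which is why a fuzzy search for one can find the other.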

Joining Multiple Query Criteria with Boolean Operators

For more exacting searches, you can join multiple criteria with 'AND' or 'OR', or negate a query with 'NOT'.

The following examples assume you have a data bag named 'sample' with items 'abc', 'foo', 'bar', 'baz', and 'qux'.

Negating with NOT

Result:

Joining Queries with OR

Results:

Joining Queries with AND

Results:
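The effect of the three operators on the sample data bag can be modelled in Ruby as filters over the item ids. The queries named in the comments are hypothetical examples chosen for illustration, and the helpers are not Chef APIs.

```ruby
# Model each criterion as a predicate and each operator as a filter.
def match_all(ids, *criteria)  # AND: every criterion must hold
  ids.select { |id| criteria.all? { |c| c.call(id) } }
end

def match_any(ids, *criteria)  # OR: at least one criterion must hold
  ids.select { |id| criteria.any? { |c| c.call(id) } }
end

def match_not(ids, criterion)  # NOT: the criterion must not hold
  ids.reject { |id| criterion.call(id) }
end

sample = %w[abc foo bar baz qux]
is_foo      = ->(id) { id == "foo" }
is_abc      = ->(id) { id == "abc" }
starts_b    = ->(id) { id.start_with?("b") }
baz_to_qux  = ->(id) { id >= "baz" && id <= "qux" }

puts match_not(sample, is_foo).inspect               # models NOT id:foo
puts match_any(sample, is_foo, is_abc).inspect       # models id:foo OR id:abc
puts match_all(sample, starts_b, baz_to_qux).inspect # models id:b* AND id:[baz TO qux]
```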

Special Characters:

The following characters have special meaning to the query parser. If you need to use them in a search pattern, escape them with a \ (backslash):
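For reference, the Lucene query parser documentation lists these special characters: + - && || ! ( ) { } [ ] ^ " ~ * ? : \. A minimal Ruby sketch of escaping them follows. escape_pattern is a hypothetical helper, and escaping each & and | individually is a simplifying assumption.

```ruby
# Characters with special meaning to the Lucene query parser.
SPECIAL_CHARS = /[+\-!(){}\[\]^"~*?:\\|&]/

# Prefix every special character in a search pattern with a backslash.
def escape_pattern(text)
  text.gsub(SPECIAL_CHARS) { |ch| "\\#{ch}" }
end

puts escape_pattern("1+1:2")
puts escape_pattern("(draft)")
```

Apply the escaping to the search pattern only, not to the whole query, so that the colon separating field and pattern keeps its meaning.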

As previously noted, Chef's search feature uses Apache SOLR, which is built on Apache Lucene. Definitive documentation for the query syntax can be found on the Apache SOLR and Apache Lucene project documentation sites.

Nested Fields

It is quite common in Chef for the data you want to be nested a few levels deep in a data structure. For example, information about a network interface might be in a node's attributes at node[:network][:interfaces][:eth0]. Here's how Chef's search feature handles this. Consider this snippet of ohai data from my laptop:

Before adding this node data to the indexer, Chef extracts nested fields into the top level:

So the following searches will find this node:

Chef will also flatten nested items into compound keys, like this:

So you can also find the node with the following search:

Chef also creates "wildcard" compound keys, where 'X' is the wildcard character:

These compound keys can be extremely helpful when a key you don't care about sits between two keys that you do care about. For example, to find every node with an IP address starting with 192.168, regardless of the interface name, you can search for:

You could even use a range search to find every node with an address in a given subnet:

By combining these wildcard fields with range and wildcard queries, you can perform very powerful searches. For example, you could use the vendor part of the MAC address to find every node that has a network card made by a given vendor (Broadcom, Intel, etc.).
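The flattening described in this section can be approximated in Ruby as follows. This is a rough sketch, not Chef's indexer: it assumes compound keys are joined with an underscore and that a wildcard key replaces exactly one path segment with 'X'. The node data and the flatten_for_index helper are hypothetical.

```ruby
# Flatten nested data into three kinds of keys for every leaf value:
# the bare leaf key, the full compound key, and wildcard compound keys
# in which one path segment is replaced by "X".
def flatten_for_index(data, path = [], out = Hash.new { |h, k| h[k] = [] })
  data.each do |key, value|
    keys = path + [key.to_s]
    if value.is_a?(Hash)
      flatten_for_index(value, keys, out)
    else
      Array(value).each do |leaf|
        out[keys.last] << leaf.to_s            # nested field at the top level
        next if keys.size == 1
        out[keys.join("_")] << leaf.to_s       # compound key
        keys.each_index do |i|                 # wildcard compound keys
          wild = keys.dup
          wild[i] = "X"
          out[wild.join("_")] << leaf.to_s
        end
      end
    end
  end
  out
end

node_data = { "network" => { "interfaces" => { "eth0" => { "mtu" => "1500" } } } }
flatten_for_index(node_data).each { |key, values| puts "#{key}: #{values.inspect}" }
```

In this sketch the key network_interfaces_X_mtu holds the MTU no matter what the interface is called, which is the property the wildcard searches above rely on.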

Using Search in Recipes

The Chef DSL provides the search(INDEX, QUERY) method for accessing search indexes in recipes. Basic use of search looks like this:

If you just want to use each result of the search and don't care about the aggregate result, you can provide a code block to the search method. Each result will be passed to the block in turn:
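Both forms are sketched below. So that the snippet runs outside Chef, it defines a stand-in search method over hypothetical in-memory data; inside a real recipe the Chef DSL supplies search itself, and the stub's prefix-only matching is a simplification, not Chef's behaviour.

```ruby
# Stand-in data and a stub search method so the snippet runs outside Chef.
STUB_INDEX = {
  node: [
    { "name" => "app1.example.com" },
    { "name" => "app2.example.com" },
    { "name" => "db1.example.com" }
  ]
}.freeze

def search(index, query)
  field, pattern = query.split(":", 2)
  prefix = pattern.chomp("*") # the stub only understands "field:prefix*"
  results = STUB_INDEX.fetch(index).select { |item| item[field].to_s.start_with?(prefix) }
  results.each { |item| yield item } if block_given?
  results
end

# Basic use: keep the aggregate result.
app_nodes = search(:node, "name:app*")
puts app_nodes.length

# Block form: each result is passed to the block in turn.
search(:node, "name:app*") do |node|
  puts node["name"]
end
```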


Copyright © 2009 Opscode, Inc. All Rights Reserved.