Overview

Search is a feature of the Chef Server that allows you to use a full-text search engine (based on Apache Solr) to query information about your infrastructure and applications.

Searches are built by the Chef Server, and allow you to query arbitrary data about your infrastructure. You can utilize this service via search calls in a recipe or the knife search command.

Most data that Chef stores is automatically indexed in Solr, including Data Bags, API Clients, Nodes, and Roles.

Search Index Names

Chef's built-in types are indexed in the following search indexes (don't worry about the search syntax for now):
Data bags are indexed by the data bag's name. For example, to search for items in a data bag named 'bag_o_data' in knife, use a query like:
Query Syntax

Queries have the form "field:search_pattern", where "field" is a key in the JSON description of the objects being searched. The search pattern supports exact, wildcard, range, or fuzzy matching. The field name has limited support for wildcard matching.

Note: Both the field and the search pattern are case sensitive.

Field Name Syntax

Field names are keys in the JSON description of the object being searched. To search in a nested key, for example dmi / system / product_name, insert an underscore "_" between key names:

Note that wildcard use with nested fields changed between versions 0.9 and 0.10. See Nested Fields below for details.

Discovering Key Names

The definitive list of search keys is the set of keys in the JSON description of the object being searched. To see the keys available when searching nodes:

This command will open a full JSON description of the node "staging" in the pager less. Similarly, knife data bag show, knife client show, and knife role show can be used to find the keys available for their respective objects. The JSON keys are the definitive search keys to be used in any search context (knife, recipe, Hosted Chef, etc.).

Wildcard Matching for Field Names

The field name also has limited support for wildcard matching. Both the "*" and "?" wildcards (see below) can be used within a field name; however, they cannot be the first character of the field name. For example, the following are valid queries:
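As a sketch of the field-name rules in this section, the plain Ruby below joins nested keys with underscores and checks the wildcard restriction. The helper names search_field and valid_field_pattern? are hypothetical and are not part of Chef. The "*:*" query described under Finding All uses a bare asterisk as the field and is a separate case that this check does not cover.

```ruby
# Hypothetical helper: join nested JSON keys into one search field name.
def search_field(*keys)
  keys.map(&:to_s).join("_")
end

# Hypothetical check: "*" and "?" may appear inside a field name,
# but not as its first character.
def valid_field_pattern?(field)
  !field.empty? && !field.start_with?("*", "?")
end

field = search_field(:dmi, :system, :product_name)
puts field                              # dmi_system_product_name
puts valid_field_pattern?("dmi_sys*")   # true
puts valid_field_pattern?("*_system")   # false
```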
Community member Christian Paredes put together a blog post on using Search in Recipes to gather information on the database nodes and then take action based on the defined criteria.
Exact Matches

To search for an exact match in a field, use a query of the form "field:TEXT_TO_MATCH". Be sure to quote any search patterns containing spaces using double quotes. You'll still need to quote the entire query to prevent your shell or Ruby from trying to interpret it. The best way to do this is to quote your query with single quotes and the search pattern with double quotes, as shown in the second example below.

Examples:

EXAMPLE RESULT:

Search for a data bag item with the exact text 'Charlie the Unicorn' for the key 'comment':

EXAMPLE RESULT:

Wildcard Matches
Chef searches can use the familiar * (asterisk) and ? (question mark) wildcard symbols to query for substring matches. These work much as they do in the shell: the * symbol matches zero or more characters, while the ? matches exactly one character. If we have a node named 'app1.example.com', the following queries will find it (and possibly other nodes as well):

Range Search

Chef also supports range searches. To see how these work, suppose we have a data bag named 'sample' with items 'abc', 'bar', 'baz', and 'qux'. We can search for all of the items between 'bar' and 'foo', inclusive, using the search pattern [bar TO foo] ("TO" must be capitalized):

To search an exclusive range, use curly braces, like this:

Fuzzy Search
Solr supports fuzzy matching of terms using an edit distance measure. To specify a fuzzy match, add a ~ (tilde) symbol to the end of your query term. For example, if you have an API client named 'FOO', you can find it by searching for 'BOO~', like this:

Joining Multiple Query Criteria with Boolean Operators

For more exacting searches you can join multiple criteria with 'AND' or 'OR', or you can negate a query with 'NOT' (all three keywords must be capitalized). The following examples assume you have a data bag named 'sample' with items 'abc', 'foo', 'bar', 'baz', and 'qux'.

Negating with NOT

Result:

Joining Queries with OR

Results:

Joining Queries with AND

Results:

Finding All

To find all of the items in an index, use an asterisk for both the field (key) name and the search pattern, for example:

This will find all nodes.

Special Characters

The following characters have special meaning to the query parser. If you need to use them in a search pattern, escape them with a \ (backslash):

As previously noted, Chef's search feature uses Apache Solr, which is built on Apache Lucene. Definitive documentation for the query syntax can be found on the documentation sites of those projects.
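The escaping rule for special characters can be sketched as a small Ruby helper. The character set used here is the one listed in the Lucene query parser syntax documentation (+ - && || ! ( ) { } [ ] ^ " ~ * ? : \); whether the Solr version bundled with a given Chef Server treats exactly this set as special is an assumption. The name escape_term is hypothetical and is not a Chef API.

```ruby
# Characters escaped here follow the Lucene query parser syntax:
#   + - && || ! ( ) { } [ ] ^ " ~ * ? : \
# "&&" and "||" are handled by escaping each "&" and "|" individually.
SPECIAL_CHARS = /[+\-!(){}\[\]^"~*?:\\&|]/

# Hypothetical helper: prefix every special character with a backslash.
def escape_term(term)
  term.gsub(SPECIAL_CHARS) { |char| "\\#{char}" }
end

puts escape_term("app1.example.com")   # unchanged, no special characters
puts escape_term("1+1=2?")             # 1\+1=2\?
```

Only escape text that should be matched literally. Wildcards, ranges, and the ~ of a fuzzy search must stay unescaped to keep their meaning.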
Nested Fields
It is quite common in Chef for the data you want to be a few levels deep in a data structure. For example, information about a network interface might be in a node's attributes at node[:network][:interfaces][:eth0]. Here's how Chef's search feature handles this. Consider this snippet of Ohai data from my laptop:

Before adding this node data to the indexer, Chef extracts nested fields into the top level:

So the following searches will find this node:

Chef will also flatten nested items into compound keys, like this:

So you can also find the node with the following search:

Chef also creates "wildcard" compound keys, where 'X' is the wildcard character:

These compound keys can be extremely helpful when a key you don't care about is between two keys that you do care about. For example, to find every node with an IP address starting with 192.168, regardless of the interface name, you can search for:

You could even use a range search to find every node with an address in a given subnet:

By combining these wildcard fields with range and wildcard queries, it is possible to perform very powerful searches, such as using the vendor part of the MAC address to find every node that has a network card made by a given vendor (Broadcom, Intel, etc.).

Using Search in Recipes

The Chef DSL provides the search(INDEX, QUERY) method for accessing search indexes in recipes. Basic use of search looks like this:

If you just want to use each result of the search and don't care about the aggregate result, you can provide a code block to the search method. Each result will be passed to the block in turn:

Search for Nodes Based on Run List Entries

Find Nodes With a Given Recipe in the Run List

To find nodes with a specified recipe in the run list, search within the run_list field. Be careful: Lucene treats : and [ as special characters, so escaping is needed (also note the single quoting):
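To illustrate why the single quoting matters, the plain Ruby sketch below compares three ways of writing the same escaped query. The recipe name apache2::mod_proxy_ajp is only an example value, and the search call is shown in a comment because it is only available inside a recipe.

```ruby
# Single quotes keep the backslashes that Lucene needs.
single = 'run_list:recipe\[apache2\:\:mod_proxy_ajp\]'

# Double quotes treat "\[" and "\:" as Ruby escapes and drop the backslash,
# so each backslash has to be doubled to survive.
naive   = "run_list:recipe\[apache2\:\:mod_proxy_ajp\]"
doubled = "run_list:recipe\\[apache2\\:\\:mod_proxy_ajp\\]"

puts single == doubled   # true
puts single == naive     # false
puts naive               # run_list:recipe[apache2::mod_proxy_ajp]

# Inside a recipe the string would be passed to the DSL method, e.g.
#   search(:node, single)
```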
If you also want to interpolate variables into the search string, using Ruby's alternate quoting syntax makes this easier:

Find Nodes with a Recipe in the Expanded Run List
Chef saves the expanded list of roles and recipes to the roles and recipes attributes on the node. This makes it possible to find nodes that run a given recipe even if it's included by a role (or a role-within-a-role). For example, consider a "tomcat_server" role that includes apache2::mod_proxy_ajp in its run list. A node that is in this role would be found by the following search:
Find Nodes with a Role in the Run List

To find a node that includes a role in the top level of its run list, use a search of the form role:ROLE_NAME.
Find Nodes with a Role in the Expanded Run List
Since Chef 0.9.8, chef-client automatically saves the expanded list of roles (i.e., all roles applied to a node, including nested roles) to the node's automatic attributes. You can find every node that has a given role by a search of the form roles:ROLE_NAME.
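The four run list query forms described above (run_list, recipes, role, and roles) can be summarized in a short Ruby sketch. The module RunListQueries and its method names are hypothetical and are not part of Chef. Only the field names and the need to escape ":" and "[" come from the text above; escaping "]" as well is an assumption made for symmetry.

```ruby
# Hypothetical helpers that build the four run list query forms.
module RunListQueries
  # Escape the characters that occur in run list entries and that
  # Lucene treats as special: ":", "[" and "]".
  def self.escape(value)
    value.gsub(/[:\[\]]/) { |char| "\\#{char}" }
  end

  # Recipe listed directly in the node's run list.
  def self.recipe_in_run_list(recipe)
    entry = "recipe[#{recipe}]"
    "run_list:#{escape(entry)}"
  end

  # Recipe anywhere in the expanded run list (including via roles).
  def self.recipe_in_expanded_run_list(recipe)
    "recipes:#{escape(recipe)}"
  end

  # Role at the top level of the run list.
  def self.role_in_run_list(role)
    "role:#{escape(role)}"
  end

  # Role anywhere in the expanded run list (including nested roles).
  def self.role_in_expanded_run_list(role)
    "roles:#{escape(role)}"
  end
end

puts RunListQueries.recipe_in_run_list("apache2::mod_proxy_ajp")
puts RunListQueries.recipe_in_expanded_run_list("apache2::mod_proxy_ajp")
puts RunListQueries.role_in_run_list("tomcat_server")
puts RunListQueries.role_in_expanded_run_list("tomcat_server")
```

In a recipe, each returned string would be passed as the QUERY argument of search(:node, QUERY).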
Client/Server Settings for a Database

If you don't have roles fully defined and implemented, you may find it necessary on a webserver to place the hostname, IP, or private IP of another machine in a settings file so that it can connect to a database, Solr, or other persistence server. It is assumed here that if a webserver is named mysqlchef, its database server is named mysqlchefutil. Consider this simplified settings file for the webserver mysqlchef1:

You can see all the information about the node with the following command:

To access it as part of a recipe that is to run on mysqlchef (the webserver):

Note that the 0 (zero) index is used for the db_server identifier because Solr returns a single document: the node is being searched for by its unique name. The identifier "private_ip" now has the value "10.40.64.202" and can be used in templates as a variable, etc.

Searching for nodes having a given recipe applied (pre 0.9.8)

If you want to handle role inclusion gracefully, you can patch the Chef Language (see this gist) to allow that kind of search:

Searching within Environments

To search within a specific environment, see Searching within Environments.

Problems and Solutions

The following are solutions to problems and questions that arise when using the Search feature and maintaining the indexes.

Updates not propagating to search results
Sometimes you will notice that your search results get stale. For instance, you may have a node visible from "knife node show", but "knife search node" does not show the node. The first thing to check is the size of your message queue:

A low number is expected here. If it gets past 15 or 20, and especially if it is constantly increasing, you need to add more chef-solr-indexer instances.

Adding indexers using Runit

Instructions on adding services are here: http://smarden.org/runit/faq.html. You can reuse the scripts from /etc/sv/chef-solr-indexer.

Fields missing from certain nodes
If you are getting strange results from your search queries, such as certain nodes missing from searches for attributes you know exist (for example, attributes confirmed to exist via the webui), you may have run into Solr's maxFieldLength issue described in CHEF-2346. So far this has been noted on Linux servers with complex configurations and on Windows machines, that is, anywhere with a lot of attributes and Ohai data. This is unresolved as of 0.10.8.

Search only returns IP Address of the Node, not of a specific interface

The issue is complicated, as the JSON data for interfaces is structured as hierarchical attributes of the parent: the interface has an attribute, which is an address, which in turn has an attribute, which is a family. However, the way our minds tend to work is that the interface has an address of a particular family, usually IPv4 (inet). There is a tendency to want something like node[:ipaddress_eth1] or node[:ipaddress][:eth1], but these diverge from the data hierarchy. OHAI-88 is based on this issue. Pending a determination on how best to address this circumstance, the network_addr.rb Ohai plugin, from Opscode team member Joshua Timberman, extends the Ohai network attribute with additional ipaddrtype_iface attributes to make it semantically easier to retrieve the addresses.

Additionally, the following code should get the IP address of a specific interface, as it is constructed consistently with how the JSON data for interfaces is structured.
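A minimal sketch of such a lookup, assuming the hierarchy described above (interface, then addresses keyed by address, then a family attribute on each address). The sample hash, its addresses, and the helper name address_for are hypothetical illustration data. In a recipe the first argument would be the node's network attribute rather than a literal hash.

```ruby
# Hypothetical sample shaped like the Ohai network data described above.
network = {
  "interfaces" => {
    "eth1" => {
      "addresses" => {
        "00:16:3E:AA:BB:CC"        => { "family" => "lladdr" },
        "192.168.0.10"             => { "family" => "inet", "netmask" => "255.255.255.0" },
        "fe80::216:3eff:feaa:bbcc" => { "family" => "inet6" }
      }
    }
  }
}

# Hypothetical helper: return the first address of the given family on
# the given interface, or nil if the interface or family is absent.
def address_for(network, iface, family = "inet")
  addresses = network.dig("interfaces", iface, "addresses") || {}
  pair = addresses.find { |_address, info| info["family"] == family }
  pair && pair.first
end

puts address_for(network, "eth1")            # 192.168.0.10
puts address_for(network, "eth1", "inet6")   # fe80::216:3eff:feaa:bbcc
p    address_for(network, "eth9")            # nil
```

Walking the hash in the same order as the data hierarchy avoids relying on attributes such as node[:ipaddress][:eth1] that do not exist in the Ohai output.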

