Query Functions

Please refer to Demo Index Setup to follow the examples given on the page.

How does a query work?

To understand how the different query types work, it helps to understand the basic processing that happens before a query is executed. The following flow chart shows the steps that a query string goes through before being submitted to the underlying Lucene engine:

Parse Query String & Generate AST
-> Process conditions from AST
-> Validate Condition, including: Field Name, Field Type, Field Values, Query Type
-> Query type supports search analyzer?
     no: Convert value to lower case
     yes: Does field type support analyzer?
            yes: Parse value using the search analyzer
            no: Convert value to internal representation
-> Query Type Processor
-> Generate Lucene Query
-> Execute Query

  1. The query string is processed by a query parser, which generates an abstract syntax tree (AST) from the input. Think of the abstract syntax tree as a machine-understandable representation of the original query string. The AST contains information about the query types and the clauses.

  2. Process each individual condition from the AST.

  3. Each condition goes through multiple validation steps, which include checking that the field name is valid, the query type is valid, and valid field values are provided. If no valid value is specified, the query processor looks for the defined missing field behaviour.

  4. Each query type has different behaviour associated with it. For example, some query types are designed to work with multiple tokens, phrases, positional matching, etc. Once the correct query type for the condition has been identified, the processor checks whether the query type supports a search analyzer. Certain query types skip search-time analysis to avoid tampering with the input. For example, the like query type skips the search analyzer because the analyzer may remove special characters such as ? and *.

  5. If the query type supports an analyzer, then either the provided analyzer is used or, for field types that do not support analysis, the input text is converted to the internal representation for the given field type. For example, Boolean fields are internally saved as T and F to represent true and false values. This is done to save disk space. These fields don’t support sophisticated analysis, so they are configured not to use an analyzer. If the query type does not support a search-time analyzer, the input value is simply converted to lowercase to force case-insensitive matching.

  6. The newly parsed tokens are passed into the query type processor, which generates an equivalent Lucene query from them. A query type processor is also responsible for processing query-type-specific switches.

  7. Once all the query types are processed then a master Lucene query is built which is executed to produce search results.
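Steps 4 and 5 above can be sketched in Python as follows. This is a minimal illustration and not the engine's actual code: the names QueryType, FieldType, standard_analyzer, to_internal and prepare_values are hypothetical, and the stand-in analyzer only lowercases and splits on whitespace.

```python
from dataclasses import dataclass


@dataclass
class QueryType:
    name: str
    supports_search_analyzer: bool  # hypothetical flag, per step 4


@dataclass
class FieldType:
    name: str
    supports_analyzer: bool  # hypothetical flag, per step 5


def standard_analyzer(text: str) -> list[str]:
    """Stand-in analyzer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def to_internal(field: FieldType, value: str) -> str:
    """Convert a value to the field's stored form, e.g. Boolean -> T / F."""
    if field.name == "bool":
        return "T" if value.strip().lower() == "true" else "F"
    return value


def prepare_values(query: QueryType, field: FieldType, values: list[str]) -> list[str]:
    tokens: list[str] = []
    for value in values:
        if not query.supports_search_analyzer:
            # e.g. like: keep ? and * intact, only force case-insensitive matching
            tokens.append(value.lower())
        elif field.supports_analyzer:
            tokens.extend(standard_analyzer(value))
        else:
            tokens.append(to_internal(field, value))
    return tokens
```

With these assumptions, a like query on a text field keeps 'Uni*' as the single token uni*, while an allof query on the same field splits 'Rice wheat' into rice and wheat.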

General Query properties

These properties are specified for each query type in the information box provided at the beginning of each query documentation.

Search Analyzer support & Boolean generation behaviour

This property signifies whether a query type supports search-time analysis. As explained earlier, some query types bypass search-time analysis to avoid removal of special characters from the input text.

To illustrate search analyzer support, consider what happens when the query below is executed:

allof(agriproducts, 'Rice wheat', 'BARLEY')

It will be converted into three queries as shown below:

allof(agriproducts, 'rice') AND allof(agriproducts, 'wheat') AND allof(agriproducts, 'barley')

Two things have happened here:

  1. The original query has two tokens: Rice wheat and BARLEY. After going through the search analyzer, the two tokens are converted into three tokens (assuming that a standard analyzer is used at search time). The three new tokens are rice, wheat and barley.

  2. The second thing that happens is the Boolean behaviour. In this case the three tokens are joined using an AND operator. Each query type has its own configured behaviour for generating Boolean queries. Boolean queries are generated automatically when possible.
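The expansion described above can be reproduced with a few lines of Python. This is only a sketch: analyze is a hypothetical stand-in for a standard analyzer (lowercase and split on whitespace), and expand_clause is an illustrative helper, not part of the query engine.

```python
def analyze(value: str) -> list[str]:
    """Stand-in for a standard search analyzer (assumption)."""
    return value.lower().split()


def expand_clause(query_type: str, field: str, values: list[str], operator: str) -> str:
    """Analyze every field value, then join one clause per token with the operator."""
    tokens = [token for value in values for token in analyze(value)]
    clauses = [f"{query_type}({field}, '{token}')" for token in tokens]
    return f" {operator} ".join(clauses)


print(expand_clause("allof", "agriproducts", ["Rice wheat", "BARLEY"], "AND"))
# allof(agriproducts, 'rice') AND allof(agriproducts, 'wheat') AND allof(agriproducts, 'barley')
```

Passing "OR" as the operator gives the form a query type with Or behaviour would produce.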

Positional match support

This property signifies whether the query type supports positional matching of the tokens in the field value. Query types that do not support positional matching will match a given token or tokens anywhere in the field value. Query types that support positional matching usually attach special meaning to the order in which the field values are specified.

Field Values Order

This property signifies if the query type gives any special emphasis to the order in which multiple field values are specified. This is usually associated with the positional match support.

For example, the AllOf query supports multiple field values per clause and the order in which the field values are specified does not matter.

allof(agriproducts, 'rice', 'wheat')

is the same as

allof(agriproducts, 'wheat', 'rice')

Multiple field values per clause

This property signifies whether a user can pass multiple search values per clause. This helps simplify the query when the same query type is to be used for multiple values.

allof(agriproducts, 'rice', 'wheat')

In the above example the query type has support for multiple field values per clause. Here we have passed rice and wheat as two separate field values in the same query clause.

AllOf

AllOf is the simplest of the term-related queries. It requires all the specified terms to match in a given input. This query does not take position into consideration and will match terms out of order.

Information Box
Search Analyzer: Supported
Boolean behaviour: And
Positional match: Unsupported
Field Value order: Does not matter
Multiple field values per clause: Yes

Query Examples

The following search query returns all documents containing both wheat and rice in the agriproducts field.

AllOf single clause with a single token
allOf(agriproducts, 'rice wheat')
countryname | agriproducts | governmenttype | population
Brazil | coffee soybeans wheat rice corn sugarcane cocoa citrus; beef | federal republic | 198739269
Iraq | wheat barley rice vegetables dates cotton; cattle sheep poultry | parliamentary democracy | 28945657
Pakistan | cotton wheat rice sugarcane fruits vegetables; milk beef mutton eggs | federal republic | 176242949
Uruguay | rice wheat soybeans barley; livestock beef; fish; forestry | constitutional republic | 3494382
Bangladesh | rice jute tea wheat sugarcane potatoes tobacco pulses oilseeds spices fruit; beef milk poultry | parliamentary democracy | 156050883
* For brevity only up to 5 results are displayed here. Total available: 12

The above query is semantically similar to the below queries:

allof(agriproducts, 'rice') and allof(agriproducts, 'wheat')
allof(agriproducts, 'rice', 'wheat')
allof(agriproducts, 'wheat rice')
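The equivalence of these forms follows from treating AllOf as a set containment test. The Python sketch below models that idea only; all_of and tokens_of are hypothetical helpers, and the tokenizer (lowercase, split on whitespace, strip semicolons) is an assumed stand-in for the real analyzer.

```python
def tokens_of(text: str) -> set[str]:
    """Assumed tokenizer: lowercase, split on whitespace, drop ';' separators."""
    return {word.strip(";") for word in text.lower().split()}


def all_of(field_value: str, *values: str) -> bool:
    """True when every token from every supplied value occurs in the field."""
    wanted = {token for value in values for token in value.lower().split()}
    return wanted <= tokens_of(field_value)


uruguay = "rice wheat soybeans barley; livestock beef; fish; forestry"

# All three spellings of the query ask for the same set of terms.
assert all_of(uruguay, "rice wheat")
assert all_of(uruguay, "rice", "wheat")
assert all_of(uruguay, "wheat rice")
```

Because the required terms form a set, neither the order of the values nor the way they are grouped into tokens changes the result.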

AnyOf

AnyOf is the simplest of the term-related queries. It requires at least one of the specified terms to match in a given input. This query does not take position into consideration and will match terms out of order.

Information Box
Search Analyzer: Supported
Boolean behaviour: Or
Positional match: Unsupported
Field Value order: Does not matter
Multiple field values per clause: Yes

Query Examples

The following search query returns all documents containing Wheat or Rice or both, in the agriproducts field.

AnyOf single clause with a single token
anyOf(agriproducts, 'rice wheat')
countryname | agriproducts | governmenttype | population
Brazil | coffee soybeans wheat rice corn sugarcane cocoa citrus; beef | federal republic | 198739269
Iraq | wheat barley rice vegetables dates cotton; cattle sheep poultry | parliamentary democracy | 28945657
Pakistan | cotton wheat rice sugarcane fruits vegetables; milk beef mutton eggs | federal republic | 176242949
Uruguay | rice wheat soybeans barley; livestock beef; fish; forestry | constitutional republic | 3494382
Bangladesh | rice jute tea wheat sugarcane potatoes tobacco pulses oilseeds spices fruit; beef milk poultry | parliamentary democracy | 156050883
* For brevity only up to 5 results are displayed here. Total available: 113

The above query is semantically similar to the below queries:

anyOf(agriproducts, 'rice') or anyOf(agriproducts, 'wheat')
anyOf(agriproducts, 'rice', 'wheat')
anyOf(agriproducts, 'wheat rice')
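As a rough model, AnyOf can be read as a "shares at least one term" test applied to each document. The sketch below filters a handful of agriproducts values taken from the demo data; any_of is a hypothetical helper and the whitespace tokenizer is an assumption, not the engine's analyzer.

```python
def any_of(field_value: str, *values: str) -> bool:
    """True when at least one supplied token occurs in the field."""
    wanted = {token for value in values for token in value.lower().split()}
    terms = {word.strip(";") for word in field_value.lower().split()}
    return not wanted.isdisjoint(terms)


documents = {
    "Uruguay": "rice wheat soybeans barley; livestock beef; fish; forestry",
    "Aruba": "aloes; livestock; fish",
    "Jordan": "citrus tomatoes cucumbers olives; sheep poultry stone fruits strawberries dairy",
    "Bhutan": "rice corn root crops citrus foodgrains; dairy products eggs",
}

matches = [name for name, agri in documents.items() if any_of(agri, "rice wheat")]
print(matches)  # ['Uruguay', 'Bhutan']
```

Bhutan matches even though it lists only rice, which is why this query type returns far more documents than the corresponding AllOf query.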

Phrase match

A query that matches documents containing a particular sequence of terms.

Information Box
Search Analyzer: Supported
Boolean behaviour: Or
Positional match: Supported
Field Value order: Matters
Multiple field values per clause: Yes

Query Examples

The following search query returns all documents containing the three words federal parliamentary democracy in exactly the same order.

Phrase search passing multiple words as single token
phraseMatch(governmenttype, 'federal parliamentary democracy')
countryname | agriproducts | governmenttype | population
Australia | wheat barley sugarcane fruits cattle sheep poultry | federal parliamentary democracy and a Commonwealth realm | 21262641
Belgium | sugar beets fresh vegetables fruits grain tobacco; beef veal pork milk | federal parliamentary democracy under a constitutional monarchy | 10414336
* For brevity only up to 5 results are displayed here. Total available: 2

Unlike the previous query types, phrase match input has positional relevance. If, instead of passing federal parliamentary democracy as a single token, we pass the words as three tokens, the overall result will be different: the query will be treated as three phrase match queries joined using an OR operator.

Be careful with phrase matches, as the order of the tokens and the number of tokens can affect the search results drastically.

Phrase search passing multiple words as multiple tokens
phraseMatch(governmenttype, 'federal', 'parliamentary', 'democracy')
countryname | agriproducts | governmenttype | population
Australia | wheat barley sugarcane fruits cattle sheep poultry | federal parliamentary democracy and a Commonwealth realm | 21262641
Belgium | sugar beets fresh vegetables fruits grain tobacco; beef veal pork milk | federal parliamentary democracy under a constitutional monarchy | 10414336
Bangladesh | rice jute tea wheat sugarcane potatoes tobacco pulses oilseeds spices fruit; beef milk poultry | parliamentary democracy | 156050883
Aruba | aloes; livestock; fish | parliamentary democracy | 103065
Bulgaria | vegetables fruits tobacco wine wheat barley sunflowers sugar beets; livestock | parliamentary democracy | 7204687
* For brevity only up to 5 results are displayed here. Total available: 2
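The difference between the two calls can be modelled in Python. In this sketch each field value is one phrase that must appear as a consecutive run of tokens, and multiple values are joined with OR. The helpers contains_phrase and phrase_match are hypothetical, and whitespace tokenization is an assumption.

```python
def contains_phrase(tokens: list[str], phrase: list[str]) -> bool:
    """True when the phrase occurs as a consecutive run inside tokens."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def phrase_match(field_value: str, *values: str) -> bool:
    """Each value is a separate phrase; any matching phrase is enough (OR)."""
    tokens = field_value.lower().split()
    return any(contains_phrase(tokens, value.lower().split()) for value in values)


australia = "federal parliamentary democracy and a Commonwealth realm"
aruba = "parliamentary democracy"

# One three-word phrase: only documents with the exact sequence match.
assert phrase_match(australia, "federal parliamentary democracy")
assert not phrase_match(aruba, "federal parliamentary democracy")

# Three one-word phrases joined by OR: a single shared word is enough.
assert phrase_match(aruba, "federal", "parliamentary", "democracy")
```

This is why the second form returns many more documents than the first.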

Phrase query also supports a slop parameter. By default the slop is set to 0, which means the terms must match in exact order. A minimum slop of 2 is required to change the order of the terms.

Specifying a slop in a phrase query does not maintain the order of the terms. The query is reduced to a term query that requires the terms to be within the specified range of each other.

Phrase search with slop of 4
phraseMatch(governmenttype, 'parliamentary monarchy', -slop '4')
countryname | agriproducts | governmenttype | population
Spain | grain vegetables olives wine grapes sugar beets citrus; beef pork poultry dairy products; fish | parliamentary monarchy | 40525002
Lesotho | corn wheat pulses sorghum barley; livestock | parliamentary constitutional monarchy | 2130819
Japan | rice sugar beets vegetables fruit; pork poultry dairy products eggs; fish | a parliamentary government with a constitutional monarchy | 127078679
Greenland | forage crops garden and greenhouse vegetables; sheep reindeer; fish | parliamentary democracy within a constitutional monarchy | 57600
Belgium | sugar beets fresh vegetables fruits grain tobacco; beef veal pork milk | federal parliamentary democracy under a constitutional monarchy | 10414336
* For brevity only up to 5 results are displayed here. Total available: 6

The query below demonstrates the behaviour when slop is used to match the same words from the above query, but in reverse order.

Phrase search with slop of 4 and terms in reverse order
phraseMatch(governmenttype, 'monarchy parliamentary', -slop '4')
countryname | agriproducts | governmenttype | population
Spain | grain vegetables olives wine grapes sugar beets citrus; beef pork poultry dairy products; fish | parliamentary monarchy | 40525002
Lesotho | corn wheat pulses sorghum barley; livestock | parliamentary constitutional monarchy | 2130819
Antigua and Barbuda | cotton fruits vegetables bananas coconuts cucumbers mangoes sugarcane; livestock | constitutional monarchy with a parliamentary system of government and a Commonwealth realm | 85632
* For brevity only up to 5 results are displayed here. Total available: 3
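A simplified Python model helps explain why the reversed query returns fewer documents. It covers two-term phrases only and assumes that slop counts how far the second term is from its expected position (one place after the first term), with every word occupying one position. The function sloppy_match is a hypothetical helper and this is an approximation, not Lucene's implementation.

```python
def sloppy_match(field_value: str, phrase: str, slop: int = 0) -> bool:
    """Two-term sloppy phrase check (simplified model, see assumptions above)."""
    tokens = field_value.lower().split()
    first, second = phrase.lower().split()  # exactly two terms expected
    firsts = [i for i, token in enumerate(tokens) if token == first]
    seconds = [j for j, token in enumerate(tokens) if token == second]
    # In the query the second term sits at offset +1 from the first,
    # so the number of moves needed is |actual offset - 1|.
    return any(abs((j - i) - 1) <= slop for i in firsts for j in seconds)


spain = "parliamentary monarchy"
japan = "a parliamentary government with a constitutional monarchy"

assert sloppy_match(spain, "monarchy parliamentary", slop=2)      # swap costs 2
assert not sloppy_match(spain, "monarchy parliamentary", slop=1)
assert sloppy_match(japan, "parliamentary monarchy", slop=4)      # 4 words between
assert not sloppy_match(japan, "monarchy parliamentary", slop=4)  # would need 6
```

Under this model a swap of two adjacent terms costs 2, which agrees with the minimum slop stated above, and Japan drops out of the reversed query because it would need a slop of 6.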

Phrase match query also supports an additional switch: multiphrase. This switch can be used to allow alternative terms to match at the same position. For example, let’s say we want to match the word parliamentary followed by either democracy or system. This can be easily accomplished by using the multiphrase switch.

Match both phrases containing ‘parliamentary democracy’ and ‘parliamentary system’
phraseMatch(governmenttype, 'parliamentary', 'democracy system', -multiphrase)
countryname | agriproducts | governmenttype | population
Bangladesh | rice jute tea wheat sugarcane potatoes tobacco pulses oilseeds spices fruit; beef milk poultry | parliamentary democracy | 156050883
Aruba | aloes; livestock; fish | parliamentary democracy | 103065
Bulgaria | vegetables fruits tobacco wine wheat barley sunflowers sugar beets; livestock | parliamentary democracy | 7204687
Czech Republic | wheat potatoes sugar beets hops fruit; pigs poultry | parliamentary democracy | 10211904
Dominica | bananas citrus mangoes root crops coconuts cocoa; forest and fishery potential not exploited | parliamentary democracy | 72660
* For brevity only up to 5 results are displayed here. Total available: 43

Match phrases containing ‘parliamentary democracy’, ‘parliamentary system’ and ‘parliamentary constitutional’
phraseMatch(governmenttype, 'parliamentary', 'democracy system constitutional', -multiphrase)
countryname | agriproducts | governmenttype | population
Bangladesh | rice jute tea wheat sugarcane potatoes tobacco pulses oilseeds spices fruit; beef milk poultry | parliamentary democracy | 156050883
Aruba | aloes; livestock; fish | parliamentary democracy | 103065
Bulgaria | vegetables fruits tobacco wine wheat barley sunflowers sugar beets; livestock | parliamentary democracy | 7204687
Czech Republic | wheat potatoes sugar beets hops fruit; pigs poultry | parliamentary democracy | 10211904
Dominica | bananas citrus mangoes root crops coconuts cocoa; forest and fishery potential not exploited | parliamentary democracy | 72660
* For brevity only up to 5 results are displayed here. Total available: 44

Match phrases containing ‘parliamentary monarchy’ and ‘constitutional monarchy’
phraseMatch(governmenttype, 'constitutional parliamentary', 'monarchy', -multiphrase)
countryname | agriproducts | governmenttype | population
Bhutan | rice corn root crops citrus foodgrains; dairy products eggs | constitutional monarchy | 691141
Bahrain | fruit vegetables; poultry dairy products; shrimp fish | constitutional monarchy | 727785
Denmark | barley wheat potatoes sugar beets; pork dairy products; fish | constitutional monarchy | 5500510
Liechtenstein | wheat barley corn potatoes; livestock dairy products | constitutional monarchy | 34761
Jordan | citrus tomatoes cucumbers olives; sheep poultry stone fruits strawberries dairy | constitutional monarchy | 6342948
* For brevity only up to 5 results are displayed here. Total available: 22
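One way to picture the multiphrase switch is as a phrase in which every position holds a set of acceptable terms. The Python sketch below follows that reading. The function multi_phrase_match is a hypothetical helper, and treating each field value as the set of alternatives for one position is an interpretation of the examples above.

```python
def multi_phrase_match(field_value: str, *positions: str) -> bool:
    """Each argument lists the alternative terms allowed at one phrase position."""
    tokens = field_value.lower().split()
    alternatives = [set(position.lower().split()) for position in positions]
    n = len(alternatives)
    return any(
        all(tokens[start + k] in alternatives[k] for k in range(n))
        for start in range(len(tokens) - n + 1)
    )


antigua = ("constitutional monarchy with a parliamentary system "
           "of government and a Commonwealth realm")

# 'parliamentary' followed by either 'democracy' or 'system'
assert multi_phrase_match("parliamentary democracy", "parliamentary", "democracy system")
assert multi_phrase_match(antigua, "parliamentary", "democracy system")
assert not multi_phrase_match("parliamentary monarchy", "parliamentary", "democracy system")

# either 'constitutional' or 'parliamentary', followed by 'monarchy'
assert multi_phrase_match("constitutional monarchy", "constitutional parliamentary", "monarchy")
```

The alternatives can sit at any position of the phrase, which is what the last example relies on.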

Fuzzy

Implements the fuzzy search query. The similarity measurement is based on the Damerau-Levenshtein (optimal string alignment) algorithm. This query matches terms within at most 2 edits. Higher distances are generally not useful and would match a significant portion of the term dictionary.

Information Box
Search Analyzer: Supported
Boolean behaviour: Or
Positional match: Unsupported
Field Value order: Does not matter
Multiple field values per clause: Yes

Parameter | Default | Type | Description
prefixlength | 0 | int | Length of common (non-fuzzy) prefix.
slop | 1 | int | The number of allowed edits.

Query Examples

The following search query returns all documents containing Iran and all documents containing Iran with 1 character difference, in the countryname field.

Fuzzy with default slop of 1
fuzzy(countryname, 'Iran')
countryname | agriproducts | governmenttype | population
Iran | wheat rice other grains sugar beets sugar cane fruits nuts cotton; dairy products wool; caviar | theocratic republic | 66429284
Iraq | wheat barley rice vegetables dates cotton; cattle sheep poultry | parliamentary democracy | 28945657
* For brevity only up to 5 results are displayed here. Total available: 2

The following search query demonstrates the use of the slop parameter. It returns all countries whose names differ from China by up to two characters.

Fuzzy with slop of 2
fuzzy(countryname, 'China', -slop '2')
countryname | agriproducts | governmenttype | population
China | rice wheat potatoes corn peanuts tea millet barley apples cotton oilseed; pork; fish | Communist state | 1338612968
Chile | grapes apples pears onions wheat corn oats peaches garlic asparagus beans; beef poultry wool; fish; timber | republic | 16601707
Ghana | cocoa rice cassava (tapioca) peanuts corn shea nuts bananas; timber | constitutional democracy | 23832495
* For brevity only up to 5 results are displayed here. Total available: 3
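The edit counts behind these results can be checked with a textbook implementation of the optimal string alignment distance. The sketch below is illustrative only: Lucene uses an automaton rather than this dynamic-programming table, and fuzzy_match with its slop and prefix_length arguments is a hypothetical helper modelled on the parameter table above.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (insert, delete, substitute, adjacent swap)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def fuzzy_match(term: str, candidate: str, slop: int = 1, prefix_length: int = 0) -> bool:
    """Match when the shared prefix is identical and the distance is within slop."""
    term, candidate = term.lower(), candidate.lower()
    if term[:prefix_length] != candidate[:prefix_length]:
        return False
    return osa_distance(term, candidate) <= slop


assert osa_distance("iran", "iraq") == 1
assert osa_distance("china", "chile") == 2
assert osa_distance("china", "ghana") == 2
```

With a prefix_length of 1, Ghana would no longer match China, because the first character has to be identical before any fuzzy comparison takes place.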

Like

Implements the wildcard search query. Supported wildcards are *, which matches any character sequence (including the empty one), and ?, which matches any single character. Note this query can be slow, as it needs to iterate over many terms.

Information Box
Search Analyzer: Unsupported
Boolean behaviour: Or
Positional match: Unsupported
Field Value order: Does not matter
Multiple field values per clause: Yes

To prevent extremely slow wildcard queries, a wildcard term should not start with the wildcard *.

Query Examples

The following search query returns all documents containing a word that starts with uni in the countryname field.

Like using ‘*’ operator
like(countryname, 'uni*')
countryname | agriproducts | governmenttype | population
European Union | wheat barley oilseeds sugar beets wine grapes; dairy products cattle sheep pigs poultry; fish | | 491582852
United Arab Emirates | dates vegetables watermelons; poultry eggs dairy products; fish | federation with specified powers delegated to the UAE federal government and other powers reserved to member emirates | 4798491
United Kingdom | cereals oilseed potatoes vegetables; cattle sheep poultry; fish | constitutional monarchy and Commonwealth realm | 61113205
United States | wheat corn other grains fruits vegetables cotton; beef pork poultry dairy products; fish; forest products | Constitution-based federal republic; strong democratic tradition | 307212123
United States Pacific Island Wildlife Refuges | | | 0
* For brevity only up to 5 results are displayed here. Total available: 5

The following query matches any word that starts with unit, followed by any single character, and ends with d.

Like with single character operator
like(countryname, 'unit?d')
countryname | agriproducts | governmenttype | population
United Arab Emirates | dates vegetables watermelons; poultry eggs dairy products; fish | federation with specified powers delegated to the UAE federal government and other powers reserved to member emirates | 4798491
United Kingdom | cereals oilseed potatoes vegetables; cattle sheep poultry; fish | constitutional monarchy and Commonwealth realm | 61113205
United States | wheat corn other grains fruits vegetables cotton; beef pork poultry dairy products; fish; forest products | Constitution-based federal republic; strong democratic tradition | 307212123
United States Pacific Island Wildlife Refuges | | | 0
* For brevity only up to 5 results are displayed here. Total available: 4
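The two wildcards map directly onto regular expression constructs, which gives a quick way to experiment with patterns. The sketch below translates a pattern and tests it against each whole term of a field value. The functions wildcard_to_regex and like are hypothetical helpers, and lowercasing plus whitespace splitting is an assumption about how terms are stored.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate * and ? into regex equivalents; escape everything else."""
    parts = []
    for ch in pattern.lower():
        if ch == "*":
            parts.append(".*")   # any character sequence, including empty
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def like(field_value: str, pattern: str) -> bool:
    """True when any whole term of the field matches the wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return any(regex.fullmatch(term) for term in field_value.lower().split())


assert like("European Union", "uni*")      # 'union' starts with 'uni'
assert like("United Kingdom", "unit?d")    # 'united'
assert not like("United Kingdom", "unit?") # '?' is exactly one character
```

Since the pattern has to cover a whole term, 'uni*' matches union and united but would not match a term that merely contains uni in the middle.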

Regex

A fast regular expression query based on the org.apache.lucene.util.automaton package. Comparisons are fast.

Information Box
Search Analyzer: Unsupported
Boolean behaviour: Or
Positional match: Unsupported
Field Value order: Does not matter
Multiple field values per clause: Yes

The term dictionary is enumerated in an intelligent way, to avoid comparisons. The supported syntax is documented in the RegExp class of the Java Lucene automaton package.

This query can still be slow, as it may need to iterate over many terms. To prevent extremely slow regex queries, a regex term should not start with the expression .*.

Query Examples

The following search query matches all the documents containing silk or milk in the agriproducts field.

Simple regex match
regex(agriproducts, '[ms]ilk')
countryname | agriproducts | governmenttype | population
Bangladesh | rice jute tea wheat sugarcane potatoes tobacco pulses oilseeds spices fruit; beef milk poultry | parliamentary democracy | 156050883
Belarus | grain potatoes vegetables sugar beets flax; beef milk | republic in name although in fact a dictatorship | 9648533
Belgium | sugar beets fresh vegetables fruits grain tobacco; beef veal pork milk | federal parliamentary democracy under a constitutional monarchy | 10414336
Burundi | coffee cotton tea corn sorghum sweet potatoes bananas manioc (tapioca); beef milk hides | republic | 8988091
Cambodia | rice rubber corn vegetables cashews tapioca silk | multiparty democracy under a constitutional monarchy | 14494293
* For brevity only up to 5 results are displayed here. Total available: 28
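To try a pattern before running it against the index, the matching can be approximated in Python. Note the assumptions: Python's re module is used here in place of Lucene's regular expression syntax (the two differ in places), the expression is applied to each whole term, and regex_match is a hypothetical helper.

```python
import re


def regex_match(field_value: str, pattern: str) -> bool:
    """True when any whole term of the field matches the expression."""
    compiled = re.compile(pattern)
    terms = (word.strip(";") for word in field_value.lower().split())
    return any(compiled.fullmatch(term) for term in terms)


assert regex_match("rice rubber corn vegetables cashews tapioca silk", "[ms]ilk")
assert regex_match("grain potatoes vegetables sugar beets flax; beef milk", "[ms]ilk")
assert not regex_match("aloes; livestock; fish", "[ms]ilk")
```

The character class [ms] accepts either letter in the first position, so a single clause covers both silk and milk.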

Numeric range operator

A query that matches numeric values within a specified range. To use this, you must first index the numeric values using one of the Int, Long, DateTime, Date or Double field types.

Information Box
Search Analyzer: Unsupported
Boolean behaviour: Unsupported
Positional match: Unsupported
Field Value order: Should be the first token
Multiple field values per clause: No

Range supports gt (greater than), ge (greater or equal), lt (less than) and le (less or equal) functions.

Query Examples

gt(population, '1000000')
ge(population, '1000000')
lt(population, '1000000')
le(population, '1000000')
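The four functions correspond to the usual comparison operators. A small Python sketch makes the boundary behaviour explicit. RANGE_FUNCTIONS and range_match are hypothetical names, and the bound is parsed as an integer here only because population is an integer field in the demo data.

```python
import operator

# Map each range function name to its comparison operator.
RANGE_FUNCTIONS = {
    "gt": operator.gt,  # greater than
    "ge": operator.ge,  # greater or equal
    "lt": operator.lt,  # less than
    "le": operator.le,  # less or equal
}


def range_match(function: str, field_value: int, bound: str) -> bool:
    """Compare an indexed numeric value with the quoted bound from the query."""
    return RANGE_FUNCTIONS[function](field_value, int(bound))


assert range_match("gt", 3494382, "1000000")      # Uruguay
assert range_match("lt", 103065, "1000000")       # Aruba
assert range_match("ge", 1000000, "1000000")      # boundary is included
assert not range_match("gt", 1000000, "1000000")  # boundary is excluded
```

The only difference between gt and ge, and between lt and le, is whether a value exactly equal to the bound matches.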

Match all

A query that matches all documents. It is a useful query to iterate over all documents in an index.

Information Box
Search Analyzer: Unsupported
Boolean behaviour: Unsupported
Positional match: Unsupported
Field Value order: Unsupported
Multiple field values per clause: Unsupported

Query Examples

The following search query matches all the documents in the index.

matchall(countryname, '*')

Match None

A query that matches no documents. It is useful for ensuring that a clause never matches anything under specific conditions.

Information Box
Search Analyzer: Unsupported
Boolean behaviour: Unsupported
Positional match: Unsupported
Field Value order: Unsupported
Multiple field values per clause: Unsupported