Query Format

FlexSearch uses a custom query format that enables advanced customization with minimal effort. Score manipulation, short-circuiting of clauses and similar behaviour can be achieved with simple switches.

Let’s start with the basic building blocks of the query syntax.

Identifier

An identifier is any sequence of alphanumeric characters. It cannot contain (, ) or space characters.

Any ALPHANUMERIC character except ( or ) or space

Identifiers are used to represent field names and query names in the engine.

Examples

firstname, allOf, anyOf
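
As a rough illustration of the rule above, the following Python sketch checks whether a string qualifies as an identifier. It assumes that "alphanumeric" means ASCII letters and digits only; the engine's real character set may be wider. The helper name is_identifier is hypothetical and not part of FlexSearch.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits only.
IDENTIFIER = re.compile(r"[A-Za-z0-9]+")

def is_identifier(text: str) -> bool:
    """Return True if text consists solely of alphanumeric characters."""
    return IDENTIFIER.fullmatch(text) is not None

print(is_identifier("firstname"))   # True
print(is_identifier("first name"))  # False: contains a space
print(is_identifier("anyOf("))      # False: contains a parenthesis
```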

Constant

A constant is any sequence of Unicode characters enclosed in single quotes. A backslash can be used to escape a single quote in the input. Single quotes are used for constants so that queries can be embedded easily in JSON objects.

'Any UNICODE character except '\' single quote'

A constant is used to represent search values in a query.

Examples

'United Kingdom', 'Andrew\'s Car'
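
The quoting rule and the JSON remark above can be sketched in Python as follows. The helpers to_constant and from_constant, the field name country and the JSON key queryString are all hypothetical names chosen for this illustration. Only the \' escape described above is handled.

```python
import json

def to_constant(value: str) -> str:
    """Wrap a raw value in single quotes, escaping embedded single quotes."""
    return "'" + value.replace("'", "\\'") + "'"

def from_constant(constant: str) -> str:
    """Strip the surrounding quotes and undo the backslash escape."""
    if len(constant) < 2 or constant[0] != "'" or constant[-1] != "'":
        raise ValueError("constant must be wrapped in single quotes")
    return constant[1:-1].replace("\\'", "'")

print(to_constant("Andrew's Car"))  # 'Andrew\'s Car'

# Because constants use single quotes, the query needs no quote escaping
# when placed inside a double-quoted JSON string.
query = f"anyOf(country, {to_constant('United Kingdom')})"
payload = json.dumps({"queryString": query})  # key name is an assumption
print(payload)  # {"queryString": "anyOf(country, 'United Kingdom')"}
```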

Variable

A variable is an identifier preceded by an @ character.

Variables are used to represent dynamic values in a query. These values can be passed in by the user or calculated, for example by scripts.

Examples

@firstname, @exchangerate

Switch

A switch is a key-value pair in which the value is optional. It is written as a - followed by an identifier and an optional constant, and is used to configure query behaviour.

Examples

-filter, -matchall, -constantscore '2'

Global Switch

These switches are global in the sense that they can be applied to any query type.

MatchAll switch

This switch short-circuits the condition when no value is provided for the field to be searched. This is useful if you don’t want a condition to apply when there is no value to search for.

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, @firstname, -matchall)

In the above example if the user doesn’t provide a value for the variable @firstname then the condition will be ignored. The query will effectively be short circuited to:

anyOf(lastname, 'smith', 'doe') AND *

This construct is useful when performing duplicate detection over a set of uncleansed data, because the queries can easily handle missing values.

MatchNone switch

This switch is the reverse of the MatchAll switch: it forces no match when no value is provided for the field to be searched. This is useful if you don’t want a condition to match anything when there is no value to search for.

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, @firstname, -matchnone)

In the above example, if the user doesn’t provide a value for the variable @firstname, the clause is forced to match no documents. The query will effectively be short-circuited to:

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, 'non existing value')

MatchFieldDefault switch

This switch uses the field’s default value when no value is provided for the field to be searched. This is useful if you want a condition to match only the field’s default value when there is no value to search for.

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, @firstname, -matchFieldDefault)

In the above example, if the user doesn’t provide a value for the variable @firstname, the clause is forced to use null as the search value. The query will effectively be transformed to:

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, 'null')

If the field is a numeric type then the corresponding default numeric value will be used.

UseDefault switch

This switch uses the default value provided in the switch when no value is provided for the field to be searched. This is useful if you want a condition to match only the given default value when there is no value to search for.

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, @firstname, -useDefault 'jimmy')

In the above example, if the user doesn’t provide a value for the variable @firstname, the clause is forced to use jimmy as the search value. The query will effectively be transformed to:

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, 'jimmy')
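
The four missing-value switches described above (MatchAll, MatchNone, MatchFieldDefault and UseDefault) can be summarised with a small Python sketch. This is only a model of the documented behaviour, not the engine's implementation. The function resolve_value, the placeholder strings, the order in which switches are checked and the error raised when no switch is present are all assumptions made for illustration.

```python
# Assumed placeholders; the engine's internal representation is not documented here.
MATCH_ALL = "*"
MATCH_NONE = "<matches nothing>"

def resolve_value(value, switches, field_default="null"):
    """Pick the effective search value for one condition.

    value: the variable's value, or None when the user supplied nothing.
    switches: mapping of lower-cased switch name to its optional constant.
    field_default: the field's default value ('null' for text fields above).
    """
    if value is not None:
        return value                   # a supplied value is always used
    if "matchall" in switches:
        return MATCH_ALL               # condition is effectively ignored
    if "matchnone" in switches:
        return MATCH_NONE              # condition matches no documents
    if "matchfielddefault" in switches:
        return field_default
    if "usedefault" in switches:
        return switches["usedefault"]
    # Assumption: behaviour without any of these switches is not described above.
    raise ValueError("no value supplied and no missing-value switch set")

print(resolve_value(None, {"usedefault": "jimmy"}))   # jimmy
print(resolve_value(None, {"matchfielddefault": None}))  # null
print(resolve_value("roger", {"matchall": None}))     # roger
```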

Boost switch

This switch boosts the score of a matching condition by a factor provided as part of the switch.

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, 'roger', -boost '2')

In the above example the score of the 2nd anyOf condition will be increased by a factor of 2 if the firstname matches roger. This is useful to improve the relative priority of certain conditions compared to other conditions.

ConstantScore switch

This switch provides a constant score to a matching condition.

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, 'roger', -constantScore '2')

In the above example the 2nd anyOf condition will have a score of 2 if the firstname matches roger.

NoScore switch

This switch removes any score associated with a matching condition. This is equivalent to forcing the condition to act as a filter.

anyOf(lastname, 'smith', 'doe') AND anyOf(firstname, 'roger', -noScore)

In the above example the 2nd anyOf condition will not contribute to the overall score.

This is useful to stop filtering clauses from contributing to the overall score. For example, if you have a field called state which stores the state of the record, you would not want it to contribute to the overall score, as most of the records will have it set to ‘active’.
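
The effect of the three scoring switches (Boost, ConstantScore and NoScore) on a single matching condition can be modelled roughly as below. The function clause_score and the precedence between switches when several are present are assumptions. How the engine combines clause scores into the overall score is not covered here.

```python
def clause_score(raw_score: float, switches: dict) -> float:
    """Apply score-related switches to the raw score of a matching condition.

    switches maps lower-cased switch names to their optional constant,
    e.g. {"boost": "2"} for -boost '2'.
    """
    if "noscore" in switches:
        return 0.0                                   # acts as a pure filter
    if "constantscore" in switches:
        return float(switches["constantscore"])      # fixed score on match
    if "boost" in switches:
        return raw_score * float(switches["boost"])  # scale by the factor
    return raw_score

print(clause_score(1.5, {"boost": "2"}))          # 3.0
print(clause_score(1.5, {"constantscore": "2"}))  # 2.0
print(clause_score(1.5, {"noscore": None}))       # 0.0
```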

Query specific switch

These are specific to a particular query type and are used to fine-tune the query behaviour. Applying these to unsupported query types will not result in an error but will produce unexpected behaviour.

Condition

A condition is the smallest unit of a query which specifies the search criteria to be applied for a single field. A single condition is a valid query.

A condition at minimum requires:

  • Name of the query
  • The field on which the query is to be applied
  • At least one source of value for the query. This can come from a variable or a constant.

QueryOperator and FieldName are identifiers.

Examples

anyOf(firstname, 'roger', @firstname)

FlexSearch supports a number of query operators. These are explained in more detail in the Query Types section.

Query

A query is a group of conditions combined with AND, OR and parentheses.

Purely negative queries (i.e. queries with a top-level NOT operation) are not supported.

The parser implements operator precedence as NOT >> AND >> OR.
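
To make the precedence rule concrete, the Python sketch below groups a flat list of tokens the way a NOT >> AND >> OR parser would, and returns a fully parenthesised string. The letters A, B and C stand in for whole conditions. The function group is a hypothetical illustration and is not how the FlexSearch parser is written.

```python
def group(tokens):
    """Return a fully parenthesised string showing how the tokens bind."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():                 # lowest precedence
        left = parse_and()
        while peek() == "OR":
            take()
            left = f"({left} OR {parse_and()})"
        return left

    def parse_and():                # binds tighter than OR
        left = parse_not()
        while peek() == "AND":
            take()
            left = f"({left} AND {parse_not()})"
        return left

    def parse_not():                # binds tightest
        if peek() == "NOT":
            take()
            return f"(NOT {parse_not()})"
        if peek() == "(":
            take()
            inner = parse_or()
            if take() != ")":
                raise ValueError("expected )")
            return inner
        return take()               # a placeholder condition such as A

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result

print(group("A OR B AND NOT C".split()))   # (A OR (B AND (NOT C)))
print(group("( A OR B ) AND C".split()))   # ((A OR B) AND C)
```

Parentheses in the input override the default grouping, as the second call shows.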

EBNF Format

Below is the query syntax in EBNF format. Copy and paste it into http://www.bottlecaps.de/rr/ui to generate the railroad diagrams above.

Query ::= Condition ('OR' Query | 'AND' Query)? | '(' Query ')'
Condition ::= 'NOT'? QueryOperator '(' FieldName (',' (Variable | Constant | Switch))+ ')'
Identifier ::= ("Any ALPHANUMERIC character except ( or ) or space")+
Constant ::= "'" ("Any UNICODE character except '" | "\" "' single quote")* "'"
Variable ::= '@' Identifier
Switch ::= '-' Identifier Constant?
QueryOperator ::= Identifier
FieldName ::= Identifier
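
As a companion to the grammar above, here is a minimal tokenizer sketch in Python that splits a query string into the terminal kinds the EBNF names. It is an illustration under assumptions, not the FlexSearch parser: identifiers are taken to be ASCII letters and digits, only \' is treated as an escape inside constants, and the token kind names are invented for this example.

```python
import re

KEYWORDS = {"AND", "OR", "NOT"}

TOKEN = re.compile(r"""
    \s*(?:
        (?P<constant>'(?:\\'|[^'])*')    # quoted constant, \' escapes a quote
      | (?P<variable>@[A-Za-z0-9]+)      # '@' Identifier
      | (?P<switch>-[A-Za-z0-9]+)        # '-' Identifier
      | (?P<identifier>[A-Za-z0-9]+)     # query operator or field name
      | (?P<punct>[(),])
    )""", re.VERBOSE)

def tokenize(query: str):
    """Split a query string into (kind, text) pairs."""
    tokens, pos = [], 0
    query = query.rstrip()
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        kind, text = match.lastgroup, match.group(match.lastgroup)
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"             # AND, OR and NOT are reserved words
        tokens.append((kind, text))
        pos = match.end()
    return tokens

for token in tokenize("anyOf(firstname, @firstname, -useDefault 'jimmy')"):
    print(token)
```

A switch's optional constant appears as a separate constant token directly after the switch token; pairing the two is left to a later parsing step.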