Deep Search Query Language

Deep Search has its own query language, Deep Search Query Language (DSQL), which lets you query your indexes using a simple, intuitive format. You can construct and send queries using the Deep Search API or the Searchbar widget.

Query format

Deep Search queries let you search semantically instead of compiling an exhaustive list of keywords. You can search by entity type, by metadata about those entities, and by the theme, sentiment and even language of the document. The Deep Search JavaScript Searchbar provides hints on how to form your queries.

The structure of raw queries, expressed in EBNF, is:

Query = Expr , { ("AND" | "OR") , Expr } ;
Expr = Classification-Expr | Theme-Expr | Sentiment-Expr | Lang-Expr ;
Theme-Expr = "theme:" , ("politics" | "healthcare" | ...) ;
Sentiment-Expr = "sentiment:" , ("negative" | "positive" | "neutral") ;
Lang-Expr = "lang:" , ("english" | "french" | "arabic" | ...) ;
Classification-Expr = Classification-Metadata-Expr | Classification-Any-Expr ;
Classification-Any-Expr = Classification , ":*" ;
Classification-Metadata-Expr = Classification , "." , Metadata-Key , Operator , Metadata-Value ;
Classification = "Person.politician" | "Location.city" | ... ;
Operator = ":" | ">" | "<" ;
Metadata-Key = "age" | "gender" | "birth_place" | "gdp" | ... ;
Metadata-Value = [0-9|a-z|A-Z]+ ;

Putting the above together, examples of valid queries are:

  • Product.food:*
  • Person.politician.age>58
  • Org.business.revenue>100B
  • theme:sports lang:ar sentiment:negative
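
As a rough illustration of how the classification rules in the grammar above split a term into its parts, the Python sketch below matches a single classification term with a regular expression. The function name parse_classification_term and the ClassificationTerm type are hypothetical, and the pattern assumes a classification is always two dot-separated names (such as Person.politician), as in the examples on this page. It is not Deep Search's own parser.

```python
import re
from typing import NamedTuple, Optional

# Mirrors Classification-Any-Expr and Classification-Metadata-Expr.
# Assumption: a classification is exactly two dot-separated names.
_TERM = re.compile(
    r"^(?P<classification>[A-Za-z]+\.[A-Za-z_]+)"
    r"(?:(?P<any>:\*)"
    r"|\.(?P<key>[A-Za-z_]+)(?P<op>[:<>])(?P<value>[0-9A-Za-z]+))$"
)


class ClassificationTerm(NamedTuple):
    classification: str
    key: Optional[str]       # None for the ":*" (any) form
    operator: Optional[str]  # ":", ">" or "<"
    value: Optional[str]     # kept as text, e.g. "58" or "100B"


def parse_classification_term(term: str) -> Optional[ClassificationTerm]:
    """Split one classification term into parts; return None if it does not match."""
    match = _TERM.match(term)
    if match is None:
        return None
    return ClassificationTerm(
        match["classification"], match["key"], match["op"], match["value"]
    )


for example in ("Product.food:*", "Person.politician.age>58", "theme:sports"):
    print(example, "->", parse_classification_term(example))
```

The last example, theme:sports, yields None because theme, sentiment and language terms follow their own rules in the grammar and are not classification terms.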

For a list of all classifications and metadata clauses available, consult the API.

Language, sentiment & theme filters

As noted above, queries can include clauses that filter by one or more of language, sentiment and theme. In effect, Deep Search allows you to state your query in English but obtain results in any other language. For example, the query:

(lang:ar OR lang:ru) sentiment:negative theme:sports

would return documents that have sports as their theme, carry a negative sentiment and are written in either Arabic or Russian.
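
Filter clauses like the one above can also be assembled programmatically. The sketch below is one possible way to do that in Python; the helper names any_of and build_filter_query are hypothetical, and joining several values of the same field with OR inside parentheses is an assumption based on the example query rather than a documented rule.

```python
from typing import Sequence


def any_of(field: str, values: Sequence[str]) -> str:
    """Join values for one field with OR; parenthesise when there are several."""
    terms = [f"{field}:{value}" for value in values]
    if len(terms) == 1:
        return terms[0]
    return "(" + " OR ".join(terms) + ")"


def build_filter_query(
    langs: Sequence[str] = (),
    sentiments: Sequence[str] = (),
    themes: Sequence[str] = (),
) -> str:
    """Combine language, sentiment and theme filters into one query string."""
    fields = (("lang", langs), ("sentiment", sentiments), ("theme", themes))
    # Skip fields with no values, then separate the groups with spaces.
    groups = [any_of(field, values) for field, values in fields if values]
    return " ".join(groups)


print(build_filter_query(langs=["ar", "ru"], sentiments=["negative"], themes=["sports"]))
# prints: (lang:ar OR lang:ru) sentiment:negative theme:sports
```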

Geolocation

Deep Search extracts any geographic information that can be inferred from a document. Any entity of type Location.* will have its latitude and longitude coordinates extracted and used to augment the document for geofencing searches.

To restrict a search to a certain geographic radius around a point, you specify a center point and then a radius. For example:

near:"Toronto" within:50

which translates to finding any document mentioning a location within 50 km of Toronto (distances are always in kilometres). The location can be a landmark, building, park, neighbourhood or town square; anything works as long as it has previously been classified by Repustate.

The within parameter is optional and defaults to 100 km if omitted.
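
To make the radius idea concrete, here is a minimal Python sketch that builds a near/within clause and checks whether a point falls inside a given radius using the haversine (great-circle) formula. How Deep Search measures distance internally is not stated here, so the haversine check is an assumption for illustration only. The helper names (geo_clause, haversine_km, is_within) are hypothetical and the coordinates are approximate.

```python
import math
from typing import Optional, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
DEFAULT_WITHIN_KM = 100   # documented default when "within" is omitted

LatLon = Tuple[float, float]


def geo_clause(place: str, within_km: Optional[int] = None) -> str:
    """Build a near/within clause; omit "within" to rely on the default radius."""
    clause = f'near:"{place}"'
    if within_km is not None:
        clause += f" within:{within_km}"
    return clause


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_within(center: LatLon, point: LatLon, within_km: float = DEFAULT_WITHIN_KM) -> bool:
    """True if point lies no more than within_km from center."""
    return haversine_km(*center, *point) <= within_km


# Approximate coordinates, for illustration only.
toronto = (43.6532, -79.3832)
mississauga = (43.5890, -79.6441)
ottawa = (45.4215, -75.6972)

print(geo_clause("Toronto", 50))
print(is_within(toronto, mississauga, 50), is_within(toronto, ottawa, 50))
```

With these coordinates, Mississauga falls inside the 50 km radius around Toronto while Ottawa does not.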

Summary

The table below summarizes each term that is allowed in a raw query:

Term | Example | Notes
Classification (any) | Person.politician:* |
Classification (with metadata) | Person.politician.age>50 | When specifying large numbers, you can use K, M or B as shortcuts for thousand, million and billion respectively (e.g. 100B).
Theme | theme:sports | You can have multiple theme terms.
Sentiment | sentiment:negative | You can have multiple sentiment terms.
Language | lang:ru | You can have multiple language terms.
Geolocation | near:Toronto within:200 | You can have only one near term in a query. The within qualifier is optional and defaults to 100 km if omitted.
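
The K, M and B shortcuts for large numbers can be read as simple multipliers. The Python sketch below expands them; the function name expand_shorthand is hypothetical, and it assumes whole-number values with an uppercase suffix, as in the 100B example.

```python
_SUFFIXES = {"K": 10**3, "M": 10**6, "B": 10**9}


def expand_shorthand(value: str) -> int:
    """Expand a value such as "100B" into the full integer it stands for."""
    suffix = value[-1:]
    if suffix in _SUFFIXES:
        # Strip the suffix and scale the remaining digits.
        return int(value[:-1]) * _SUFFIXES[suffix]
    return int(value)


print(expand_shorthand("100B"))  # prints: 100000000000
print(expand_shorthand("58"))    # prints: 58
```

Under this reading, Org.business.revenue>100B matches businesses whose revenue exceeds 100,000,000,000.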

You can see a complete listing of all classifications, metadata and themes here.