Overview

Deep Search lets you search any text by the semantic classification of the entities it mentions. This means you can find any mention of a US politician or an Italian soccer player without supplying keyword lists or training your own language models.

This walkthrough explains how Repustate classifies your data semantically and the ways you can query it.

Type classifications

Repustate knows about millions of entities and their properties. All entities are classified into one (or more) of the following types:

Type | Entities | Examples
Event | war, natural disaster, sporting event | World War II, 9/11, Hurricane Katrina, Super Bowl 50
Health | medical_conditions, drugs, muscle | Polio, Lipitor, quadricep
Location | country, city, state, tourist attraction | Toronto, California, Eiffel Tower, Grand Canyon, Sahara Desert
Org | business, government, charity, sports team | Apple Inc., Tesla Motors, United Nations, Manchester United, CIA
Person | politician, professional athlete, scientist, actor | Barack Obama, Lionel Messi, Meryl Streep, Albert Einstein
Product | food, vehicles, games | pizza, Tesla Model 3, chess
Science | animal, planet, chemical | elephant, Jupiter, sodium chloride
Technology | software, mobile phone, programming language | Microsoft Excel, iPhone X, Java
Time | day of the week, month, specific date | Monday, March 15th, yesterday

As you add text documents to your search index, Repustate will automatically identify all entities, classify them accordingly and store this information in your search index, allowing you to query it instantly.

Entities

Each classification type has multiple "entities". For example, the Location classification contains, among others, the following entities:

  • city
  • state
  • country
  • continent

Query terms take the form Classification.type:value, where value is what you're looking for. (Use "*" as the value to match anything of that type.) For example, to search for any mention of a location in Canada (or Canada itself), you would use this as your query term:

Location.country:Canada

You can also combine query terms using boolean operators to construct more complex queries:

Location.country:Canada or Location.city:Boston
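
If you assemble queries in code, the term format above is easy to generate. The following is a minimal Python sketch; the helper names term and any_of are hypothetical and not part of any Repustate library. It only builds the query strings shown in this section.

```python
def term(classification: str, entity: str, value: str = "*") -> str:
    """Format one query term as Classification.type:value.

    The default value "*" matches anything of that type.
    (Hypothetical helper, for illustration only.)
    """
    return f"{classification}.{entity}:{value}"


def any_of(*terms: str) -> str:
    """Combine terms with the "or" keyword."""
    return " or ".join(terms)


canada = term("Location", "country", "Canada")
boston = term("Location", "city", "Boston")
query = any_of(canada, boston)
print(query)
```

Calling term("Location", "city") with no value produces the wildcard form Location.city:*.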

Entity metadata

Not all entities are alike, so Deep Search lets you search different entities by different criteria. Each entity has its own set of properties, or metadata, that you can search against. For example, businesses have market cap as a metadata property, while politicians have affiliated political party and nationality.

In total, Repustate tracks over 100 metadata properties across all classified entities. Let's search for mentions of any politicians who are both female and from the US:

Person.politician.nationality:US Person.politician.gender:F

Searching by metadata lets you narrow your results to exactly the data you want. Let's find all documents mentioning US companies with a market cap above $50 billion:

Org.business.market_cap>50B Org.business.headquarters:US
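
The two metadata queries above share one shape: a dotted property path, an operator, and a value, with terms separated by spaces. Here is a small Python sketch that reproduces them. The function names are hypothetical, and only the two operators that appear in this walkthrough (":" and ">") are accepted, since no others are documented here.

```python
def metadata_term(path: str, value: str, op: str = ":") -> str:
    """Format one metadata predicate, e.g. Org.business.market_cap>50B.

    Hypothetical helper; restricted to the operators shown in the text.
    """
    if op not in (":", ">"):
        raise ValueError(f"operator not shown in this walkthrough: {op!r}")
    return f"{path}{op}{value}"


def all_of(*terms: str) -> str:
    """Join terms with spaces so that every term must match."""
    return " ".join(terms)


us_female_politicians = all_of(
    metadata_term("Person.politician.nationality", "US"),
    metadata_term("Person.politician.gender", "F"),
)
large_us_companies = all_of(
    metadata_term("Org.business.market_cap", "50B", op=">"),
    metadata_term("Org.business.headquarters", "US"),
)
print(us_female_politicians)
print(large_us_companies)
```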

Repustate provides API calls to help you discover all possible entities and their associated metadata.

Deep Search query structure

Deep Search's intuitive query language allows you to combine multiple entity searches into one query to uncover exactly the documents you want. Convenient shortcuts keep metadata queries concise yet expressive.

By default, all query terms are "AND'd" together, meaning a document must satisfy every query predicate to be returned in a search. You can use the "or" keyword to indicate which predicates are alternatives, and parentheses to group terms into more complex queries:

Org.business.market_cap>50B (Org.business.headquarters:US or Org.business.headquarters:UK)
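
One way to picture the matching rule for this query is as a list of groups: every group must be satisfied (the implicit AND), and a group is satisfied when any one of its alternatives holds (the "or"). The Python sketch below is a toy model, not a description of how Deep Search evaluates queries internally. It assumes each document is represented as the set of predicate strings it already satisfies.

```python
def matches(satisfied, query):
    """Return True if every group has at least one satisfied predicate.

    satisfied: set of predicate strings a document satisfies (toy model).
    query: list of groups; each group is a list of "or" alternatives.
    """
    return all(any(p in satisfied for p in group) for group in query)


# The query above, written as AND-ed groups of OR-ed alternatives.
query = [
    ["Org.business.market_cap>50B"],
    ["Org.business.headquarters:US", "Org.business.headquarters:UK"],
]

uk_giant = {"Org.business.market_cap>50B", "Org.business.headquarters:UK"}
small_us = {"Org.business.headquarters:US"}

print(matches(uk_giant, query))  # True: both groups are satisfied
print(matches(small_us, query))  # False: the market cap group fails
```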

To search against numerical metadata properties, Deep Search offers some shortcuts to express large numbers:

  • K = thousand
  • M = million
  • B = billion
  • T = trillion

In the query above, "50B" means 50 billion. When querying financial figures, all figures are assumed to be in USD.
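
To make the shortcuts concrete, here is a small Python sketch that expands a suffixed value into a plain integer. The function name expand_number is hypothetical, and support for fractional values such as "1.5M" is an assumption; the text above only shows whole numbers like "50B".

```python
from decimal import Decimal

# Multipliers for the four documented suffixes.
SUFFIXES = {"K": 10**3, "M": 10**6, "B": 10**9, "T": 10**12}


def expand_number(text: str) -> int:
    """Expand a value such as "50B" into an integer.

    Hypothetical helper; fractional input is an assumption.
    """
    text = text.strip()
    if text and text[-1] in SUFFIXES:
        return int(Decimal(text[:-1]) * SUFFIXES[text[-1]])
    return int(Decimal(text))


print(expand_number("50B"))   # 50000000000
print(expand_number("1.5M"))  # 1500000
```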

Other search filters

To further refine your search results, Repustate provides three additional filters: theme, sentiment and language. Let's see them in action by creating a query to search for any negative news related to sports in either English or French.

sentiment:neg (lang:en or lang:fr) theme:sports
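
The same filter string can be generated from structured options. This Python sketch is illustrative only: filter_query and its parameters are hypothetical names, and the only filter keys it emits are the three named above (sentiment, lang, theme).

```python
def filter_query(sentiment=None, languages=(), theme=None) -> str:
    """Build a filter query string from optional settings.

    Hypothetical helper; multiple languages are grouped with "or".
    """
    parts = []
    if sentiment:
        parts.append(f"sentiment:{sentiment}")
    if len(languages) == 1:
        parts.append(f"lang:{languages[0]}")
    elif languages:
        alternatives = " or ".join(f"lang:{code}" for code in languages)
        parts.append(f"({alternatives})")
    if theme:
        parts.append(f"theme:{theme}")
    return " ".join(parts)


print(filter_query("neg", ["en", "fr"], "sports"))
```

Parentheses are added only when there is more than one language, so a single language yields a bare term such as lang:zh.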

Because Repustate is multilingual, you can actually specify a query in one language, and get results in another:

Org.business:Walmart lang:zh

That's just a sneak peek at what you can do with Deep Search.
Sign up below to be notified once Deep Search launches.