Deep Search lets you search any text by the semantic classification of the entities it mentions. This means you can find any mention of a US politician or an Italian soccer player without having to supply keyword lists or train your own language models.
This walkthrough explains how Repustate classifies your semantic data and the ways you can query it.
Repustate knows about millions of entities and their properties. All entities are classified into one (or more) of the following types:
|Classification||Example types||Example entities|
|Event||war, natural disaster, sporting event||World War II, 9/11, Hurricane Katrina, Super Bowl 50|
|Health||medical_conditions, drugs, muscle||Polio, Lipitor, quadricep|
|Location||country, city, state, tourist attraction||Toronto, California, Eiffel Tower, Grand Canyon, Sahara Desert|
|Org||business, government, charity, sports team||Apple Inc., Tesla Motors, United Nations, Manchester United, CIA|
|Person||politician, professional athlete, scientist, actor||Barack Obama, Lionel Messi, Meryl Streep, Albert Einstein|
|Product||food, vehicles, games||pizza, Tesla Model 3, chess|
|Science||animal, planet, chemical||elephant, Jupiter, sodium chloride|
|Technology||software, mobile phone, programming language||Microsoft Excel, iPhone X, Java|
|Time||day of the week, month, specific date||Monday, March 15th, yesterday|
As you add text documents to your search index, Repustate automatically identifies all entities, classifies them and stores this information in your search index, so you can query it instantly.
Each classification type has multiple "entities". For example, the Location classification contains, among others, the entities country, city, state and tourist attraction.
Query terms take the form Classification.type:value, where value is what you're looking for. (You can also use "*" as the value to match anything of this type.) For example, to search for any mention of a location in Canada (or Canada itself), you would use this as your query term:
Location.country:Canada
You can also combine query terms using boolean operators to construct more complex queries:
Location.country:Canada or Location.city:Boston
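As a minimal sketch of how such query strings could be assembled in code, the Python helpers below build a single term and join several terms with "or". The function names `term` and `any_of` are hypothetical and not part of any Repustate client; the code only produces query text in the form described above.

```python
def term(classification: str, entity_type: str, value: str = "*") -> str:
    """Build one query term such as 'Location.country:Canada'.

    The default value '*' follows the wildcard described in the text.
    """
    return f"{classification}.{entity_type}:{value}"


def any_of(*terms: str) -> str:
    """Join terms with the 'or' keyword so that any one of them may match."""
    return " or ".join(terms)


canada = term("Location", "country", "Canada")
boston = term("Location", "city", "Boston")
any_city = term("Location", "city")  # wildcard: any city at all

query = any_of(canada, boston)
print(query)
print(any_city)
```

Keeping term construction in one place makes it harder to misplace the "." and ":" separators when queries are generated from user input.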
Not all entities are alike, so Deep Search lets you search different entities by different criteria. Each entity has its own set of properties, or metadata, that you can search against. For example, businesses have market cap as a metadata property, while politicians have affiliated political party and nationality.
In total, Repustate tracks over 100 metadata properties across all classified entities. Let's search for mentions of any politicians who are both female and from the US:
Person.politician.nationality:US Person.politician.gender:F
Searching by metadata lets you narrow results to exactly the data you want. Let's find all documents mentioning US companies with a market cap of more than 50 billion:
Org.business.market_cap>50B Org.business.headquarters:US
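The metadata predicates above all share one shape: a dotted property path, an operator and a value. The sketch below builds them in Python. The helper names `meta` and `all_of` are hypothetical, and only the two operators that appear in this walkthrough (":" and ">") are accepted, since no others are shown here.

```python
# Operators that appear in this walkthrough; others may exist but are not assumed.
KNOWN_OPERATORS = (":", ">")


def meta(path: str, op: str, value: str) -> str:
    """Build a metadata predicate such as 'Org.business.market_cap>50B'."""
    if op not in KNOWN_OPERATORS:
        raise ValueError(f"operator {op!r} is not shown in the walkthrough")
    return f"{path}{op}{value}"


def all_of(*predicates: str) -> str:
    """Space-separated predicates must all match (the default behaviour)."""
    return " ".join(predicates)


us_female_politicians = all_of(
    meta("Person.politician.nationality", ":", "US"),
    meta("Person.politician.gender", ":", "F"),
)
big_us_companies = all_of(
    meta("Org.business.market_cap", ">", "50B"),
    meta("Org.business.headquarters", ":", "US"),
)

print(us_female_politicians)
print(big_us_companies)
```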
Repustate provides API calls to help you discover all possible entities and their associated metadata.
Deep Search's query language lets you combine multiple entity searches into one query to retrieve exactly the documents you want. Shortcuts let you query metadata concisely.
By default, all query terms are AND'd together, meaning a document must satisfy every query predicate to be returned by a search. You can use the "or" keyword to indicate which predicates may match as alternatives, and parentheses to group terms into more complex queries.
Org.business.market_cap>50B (Org.business.headquarters:US or Org.business.headquarters:UK)
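To make the matching rule concrete, the following Python sketch models a query as a list of groups: groups are AND'd together and the alternatives inside a group are OR'd. This is an illustration of the semantics described above, not Repustate's implementation. The `matches` and `render` functions and the idea of a document as a set of already-satisfied predicates are assumptions made for the example.

```python
from typing import List, Set

# Each inner list is one parenthesised group of 'or' alternatives;
# the outer list is the implicit AND between groups.
Query = List[List[str]]


def matches(query: Query, satisfied: Set[str]) -> bool:
    """True if every group has at least one predicate the document satisfies."""
    return all(any(p in satisfied for p in group) for group in query)


def render(query: Query) -> str:
    """Turn the model back into query text, adding parentheses where needed."""
    parts = []
    for group in query:
        if len(group) == 1:
            parts.append(group[0])
        else:
            parts.append("(" + " or ".join(group) + ")")
    return " ".join(parts)


query = [
    ["Org.business.market_cap>50B"],
    ["Org.business.headquarters:US", "Org.business.headquarters:UK"],
]

# Hypothetical documents, described by the predicates they satisfy.
large_uk_company = {"Org.business.market_cap>50B", "Org.business.headquarters:UK"}
small_us_company = {"Org.business.headquarters:US"}

print(render(query))
print(matches(query, large_uk_company))
print(matches(query, small_us_company))
```

The first document is returned because it satisfies the market cap predicate and one of the two headquarters alternatives; the second fails the market cap predicate, so the implicit AND rejects it.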
To search against numerical metadata properties, Deep Search offers some shortcuts to express large numbers:
In the query above, "50B" means 50 billion. When querying financial figures, all figures are assumed to be in USD.
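If you generate queries from raw numbers, or want to check what a shorthand value stands for, a small converter can help. The Python sketch below expands a shorthand string into an integer. Only the "B" suffix (billion) is confirmed by this walkthrough; the "M" and "K" entries are assumptions added by analogy, and the `expand` function is hypothetical.

```python
from decimal import Decimal

SUFFIXES = {
    "B": 1_000_000_000,  # confirmed above: "50B" means 50 billion
    "M": 1_000_000,      # assumption, not confirmed by the walkthrough
    "K": 1_000,          # assumption, not confirmed by the walkthrough
}


def expand(shorthand: str) -> int:
    """Convert a value such as '50B' into the integer it abbreviates."""
    text = shorthand.strip().upper()
    if text and text[-1] in SUFFIXES:
        # Decimal avoids floating-point rounding for inputs like '2.3M'.
        return int(Decimal(text[:-1]) * SUFFIXES[text[-1]])
    return int(text)


print(expand("50B"))
print(expand("750"))
```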
To further refine your search results, Repustate provides three additional filters: theme, sentiment and language. Let's see them in action by creating a query to search for any negative news related to sports in either English or French.
sentiment:neg (lang:en or lang:fr) theme:sports
Because Repustate is multilingual, you can actually specify a query in one language, and get results in another:
Org.business:Walmart lang:zh
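The filter queries above can be assembled programmatically in the same way as entity terms. The Python sketch below is one possible approach: the `build_query` function and its parameter names are hypothetical, and the only filter values used are the ones that appear in this walkthrough (neg, en, fr, zh, sports).

```python
from typing import Optional, Sequence


def build_query(
    terms: Sequence[str] = (),
    sentiment: Optional[str] = None,
    langs: Sequence[str] = (),
    theme: Optional[str] = None,
) -> str:
    """Combine entity terms with the sentiment, language and theme filters."""
    parts = list(terms)
    if sentiment:
        parts.append(f"sentiment:{sentiment}")
    if len(langs) == 1:
        parts.append(f"lang:{langs[0]}")
    elif langs:
        # Several languages are alternatives, so group them with 'or'.
        parts.append("(" + " or ".join(f"lang:{code}" for code in langs) + ")")
    if theme:
        parts.append(f"theme:{theme}")
    return " ".join(parts)


negative_sports_news = build_query(sentiment="neg", langs=["en", "fr"], theme="sports")
walmart_in_chinese = build_query(terms=["Org.business:Walmart"], langs=["zh"])

print(negative_sports_news)
print(walmart_in_chinese)
```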