Named entity recognition is now available in Russian via the Repustate API. Combined with Russian sentiment analysis, customers can now do full text analytics in Russian with Repustate.
Repustate is happy to launch Russian named entity recognition to solutions already available in English & Arabic. But like all languages Russian has its nuances that caused named entity recognition to be a bit tougher than say English.
Consider the following sentence: Путин дал Обаме новую ядерную соглашение
In English, this says: Putin gave Obama the new nuclear agreement
This is how Barack Obama is written in Russian: Барак Обама
Notice that in our sentence, “Obama” is spelled with a different suffix at the end. That’s because in Russian, nouns (even proper nouns), are conjugated based on the their use case. In this example, Obama is being used in what’s called the “dative” case, meaning the noun is the recipient of something. In English, there is no concept of conjugating nouns for this reason. English only requires changing the suffix in the case of pluralization.
So Repustate’s named entity recognition API has to know how to stem proper nouns as well in order to properly identify “Обаме” as Barack Obama during the contextual expansion phase.