An avalanche of market data comes out every minute in the form of Bloomberg updates, Dow Jones newswire, Barron's and MarketWatch. Even social media now plays a part. How does a hedge fund manager corral all this information into something useful? And what if it's not all in English?

Data mining financial news

The financial markets are more competitive and cut-throat than ever. High-frequency trading algorithms seek out any volatility in the market and exploit it in sub-second transactions. Having accurate information at your disposal is critical to making the right call on any given trade.

When the manager of a large hedge fund specializing in the Asia-Pacific region wanted to analyze market data in real-time, he turned to Repustate.

“I've got a mountain of data staring at me. My Bloomberg terminal is lighting up, data coming down the newswire and it's not all in English. News from Chinese news sources is critical to our trading strategies, too. I need to identify the entities that are important to us, determine the sentiment around them, and add that to our models.”

A rose by any other name

One of the main issues in dealing with multilingual financial market data is that company names aren't all written the same way. Johnson & Johnson, for example, is written as 强生公司 in Chinese news sources. So if you're trying to gather market sentiment for J&J, you need to be able to uniquely identify the company regardless of the language its name appears in. Once you can do this reliably, applying sentiment analysis to the data in the local language yields useful insights.
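To make the idea concrete, here is a minimal sketch of cross-language name resolution using a hand-built alias table. It is only an illustration of the concept, not Repustate's implementation: the `RAW_ALIASES` table, the `normalize` and `resolve` helpers, and the use of the ticker `JNJ` as the canonical identifier are all assumptions made for this example.

```python
import unicodedata
from typing import Optional


def normalize(name: str) -> str:
    """Fold case and Unicode width variants, then drop all whitespace."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return "".join(folded.split())


# Hypothetical alias table: canonical identifier -> known surface forms.
# A production system would hold far more entries and languages.
RAW_ALIASES = {
    "JNJ": ["Johnson & Johnson", "J&J", "强生公司"],
}

# Reverse index from normalized surface form to canonical identifier.
ALIAS_TO_ID = {
    normalize(alias): canonical_id
    for canonical_id, aliases in RAW_ALIASES.items()
    for alias in aliases
}


def resolve(name: str) -> Optional[str]:
    """Return the canonical identifier for a name, or None if unknown."""
    return ALIAS_TO_ID.get(normalize(name))
```

With this table, `resolve("J & J")` and `resolve("强生公司")` both map to the same identifier, so sentiment scores from English and Chinese sources can be attached to one company.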

Armed with Repustate's API and the ability to do multilingual entity extraction, the hedge fund manager had his team build a real-time dashboard based on Repustate's sentiment and semantic analysis of the various news sources. Among other things, the dashboard compared market sentiment and share price for various equities and debt instruments. A sample of the type of graph you'd see in this dashboard is shown below:

A graph like this shows share price compared to market sentiment, both positive and negative, for a particular company.
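One simple way to quantify the relationship such a graph shows is to average the sentiment scores per day and correlate them with closing prices. The sketch below assumes articles have already been scored and arrive as `(date, score)` pairs, with prices in a `{date: close}` dictionary; these input shapes and all function names are hypothetical, and Pearson correlation is just one possible comparison, not necessarily the one used in the dashboard.

```python
from collections import defaultdict
from math import sqrt


def daily_net_sentiment(scored_articles):
    """Average the sentiment scores of all articles published on each day."""
    buckets = defaultdict(list)
    for day, score in scored_articles:
        buckets[day].append(score)
    return {day: sum(scores) / len(scores) for day, scores in buckets.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def sentiment_price_correlation(scored_articles, closing_prices):
    """Correlate daily sentiment with closing price on the days both exist."""
    sentiment = daily_net_sentiment(scored_articles)
    days = sorted(set(sentiment) & set(closing_prices))
    return pearson(
        [sentiment[d] for d in days],
        [closing_prices[d] for d in days],
    )
```

A value near 1 would suggest sentiment and price moved together over the window, a value near -1 that they moved in opposite directions. Correlation alone says nothing about which one leads the other.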

Industry research

Repustate's semantic analysis also provides this hedge fund with insight into industries in general. Market data can be mined for mentions of any company in a given industry, such as forestry, semiconductors or aerospace, without having to specify each and every company in each and every industry, because Repustate already knows which industries a company belongs to. The sentiment for that news can also be piped into a financial model to help create a trading strategy.
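As a rough illustration of how industry-level sentiment could be rolled up once companies are tagged with their industries, consider the sketch below. The `COMPANY_INDUSTRIES` mapping, the placeholder company identifiers and the `industry_sentiment` function are invented for this example; they are not Repustate's data or API.

```python
from collections import defaultdict

# Hypothetical company -> industries mapping. In the scenario described
# above, this knowledge would come from the semantic analysis itself.
COMPANY_INDUSTRIES = {
    "COMPANY_A": {"forestry"},
    "COMPANY_B": {"semiconductors", "aerospace"},
    "COMPANY_C": {"aerospace"},
}


def industry_sentiment(mentions):
    """Average sentiment per industry from (company_id, score) mentions.

    Mentions of companies missing from the mapping are ignored.
    """
    totals = defaultdict(lambda: [0.0, 0])  # industry -> [sum, count]
    for company_id, score in mentions:
        for industry in COMPANY_INDUSTRIES.get(company_id, ()):
            totals[industry][0] += score
            totals[industry][1] += 1
    return {
        industry: total / count
        for industry, (total, count) in totals.items()
    }
```

A company that operates in several industries contributes its score to each of them, so a single news item can move more than one industry average.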

The entirety of the financial news produced each day, combined with the market sentiment expressed on social media or on forums like Seeking Alpha, can be mined and categorized instantly with Repustate's API. Throw in political news that can dramatically affect markets (e.g. political unrest in Ukraine leading to rising oil prices), and you begin to have a complex network of data that is easily navigable with Repustate's text analytics.

“Of course it helps to know Mandarin or Arabic or Russian to be able to read news in the native language, but when time is short, I need to know what the rest of the world is thinking now. Running our huge data streams through Repustate lets me visualize exactly what the word on the street is, no matter if that street is in Shanghai or New York.”