Accurate prediction of market movements is crucial for success in financial trading. Quantitative factors such as historical prices, economic indicators, and company performance metrics are extremely useful in analysing and predicting market movements. However, incorporating qualitative information into the decision-making process is essential for a more comprehensive understanding of the market. A critical challenge in financial analysis is finding ways to draw useful and impactful insights from qualitative information.
Market sentiment analysis is widely used to evaluate qualitative information and gauge the sentiment surrounding an asset. It can be applied to many kinds of qualitative data, including social media content, financial news reporting, and customer reviews. By combining sentiment analysis with cutting-edge AI technology, financial traders can derive insights from qualitative information at speed.
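To make the idea concrete, here is a toy lexicon-based sentiment scorer. It is purely illustrative: the word lists and scoring rule are hypothetical, and production systems (including the one described in this case study) use trained NLP models rather than fixed keyword lists.

```python
# Toy lexicon-based sentiment scorer -- illustrative only.
# The word lists below are hypothetical examples, not a real trading lexicon.
POSITIVE = {"bullish", "gain", "buy", "growth", "strong", "beat"}
NEGATIVE = {"bearish", "loss", "sell", "crash", "weak", "miss"}

def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1]; positive values indicate a bullish tone."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

print(sentiment_score("Strong earnings beat, looking bullish!"))  # 1.0
print(sentiment_score("Bearish outlook after the loss"))          # -1.0
```

A real pipeline would replace the lexicon with a trained classifier, but the input/output shape (raw text in, a signed score out) stays the same.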
A leading trading insights platform identified that Twitter and Reddit data are extremely useful for its stock market price prediction tasks. The company reached out to TurinTech to understand the value that automated machine learning model development and optimisation could add in drawing insights from social media data.
The company had access to relevant text data scraped from Twitter and Reddit.
State of the existing system
The team had worked on manually developing machine learning models for the sentiment analysis task, but had failed to make significant progress towards deploying a sustainable solution. The team had spent around three months trialling potential solutions.
Bottlenecks to implementation
There were four factors particularly hindering the company in generating reliable sentiment signals from their data.
- Noisy data: Social media data can be inundated with meaningless information, such as content generated by fake accounts and bots, spam, and other irrelevant posts. Developing a successful model entails finding ways to filter out this noise.
- High data volumes: Social media generates large volumes of data, which need to be constantly monitored and incorporated into models for more accurate insights. Lacking a proper mechanism for this prevented the company from proceeding with model development and deployment. For instance, the company's Twitter scraping tools picked up around 1,000 tweets per second; analysing this stream required a strategic and robust approach.
- Lack of labelled data: Supervised machine learning requires labelled data to train the models. The company was challenged by the lack of appropriate labelled data to train their models.
- The need for the right talent: Building, evaluating, and deploying the pipeline required NLP experts, and the company lacked this NLP expertise within its data science team.
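The first two bottlenecks above, noise and volume, are often tackled with a filtering pass before any model sees the data. The sketch below shows what such heuristics might look like; the field names and thresholds are hypothetical and are not the company's actual rules.

```python
# Illustrative noise-filtering heuristics for social media posts.
# The fields ("account_age_days", "posts_per_day") and thresholds are
# hypothetical assumptions, not the rules used in the case study.

def is_noise(post: dict) -> bool:
    """Flag posts likely to be spam or bot-generated."""
    text = post.get("text", "")
    # Very short posts carry little usable signal.
    if len(text.split()) < 3:
        return True
    # Posts that are mostly links are typically spam.
    if text.count("http") > 2:
        return True
    # Very new accounts with extremely high output look bot-like.
    if post.get("account_age_days", 0) < 7 and post.get("posts_per_day", 0) > 100:
        return True
    return False

posts = [
    {"text": "Buy now http://a http://b http://c", "account_age_days": 400},
    {"text": "Solid quarterly results from the company", "account_age_days": 400},
]
clean = [p for p in posts if not is_noise(p)]
print(len(clean))  # 1
```

At 1,000 tweets per second, cheap rule-based filters like these are valuable precisely because they discard junk before the more expensive model inference step.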
Leveraging evoML’s NLP capacity for improved sentiment analysis
The company was able to use evoML and its NLP capabilities to successfully develop and deploy machine learning models within one week.
✅ evoML functionalities improved the efficiency of making predictions by 5x
evoML brings the entire data science pipeline onto a single platform, and automates the machine learning model development and optimisation process, reducing the time it takes to go from conceptualisation to deployment. With cutting-edge NLP functionalities, evoML helped the company build a reliable model for sentiment analysis for stock market prediction tasks within a week.
In particular, evoML helped build a set of models that combined stock market data with Twitter data and metadata to better identify correlations and patterns in the market. This led to an accuracy of more than 75%, allowing the client to generate highly reliable outputs, significantly higher than the 62% accuracy the company's in-house models had reached previously. The models also ran 5x faster than any model the company had built before adopting evoML.
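Conceptually, combining the two data sources means joining an aggregated social-media signal onto the price series before training. The sketch below shows one simple way to do this; the data, field names, and daily-mean aggregation are assumptions for illustration, not the feature pipeline evoML generated.

```python
# Simplified sketch: join average daily tweet sentiment onto daily returns.
# All values and field names here are hypothetical.
from collections import defaultdict

tweets = [
    {"date": "2023-01-02", "score": 0.6},
    {"date": "2023-01-02", "score": 0.4},
    {"date": "2023-01-03", "score": -0.5},
]
returns = {"2023-01-02": 0.012, "2023-01-03": -0.007}

# Aggregate sentiment scores per day.
daily = defaultdict(list)
for t in tweets:
    daily[t["date"]].append(t["score"])

# Build one feature row per day: mean sentiment alongside the return.
features = [
    {"date": d, "sentiment": sum(s) / len(s), "return": returns[d]}
    for d, s in sorted(daily.items())
]
print(features[0])  # {'date': '2023-01-02', 'sentiment': 0.5, 'return': 0.012}
```

Rows like these can then be fed to a supervised model, which is how text-derived signals end up alongside conventional quantitative features.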
Trading signals derived from evoML-generated NLP models also increased trading profitability, helping the overall financial position of the company.
✅ Code ownership gave the company full control over the developed models
A key feature of evoML is that it allows users to download the full model code, giving them complete authority to modify it as needed and embed it in their own data science pipelines. This empowered the company to improve its in-house NLP capabilities, rather than relying solely on a third party. With evoML, the company's data science team had a set of easy-to-use tools that they could adapt to reach the required results.
“In one week, we managed to improve the accuracy of a model from 62% to 75%, and it runs 5x faster.”