Reading 25 Million Studies in Seconds: Implications for Fighting COVID-19 and Managing a Portfolio

When: Tuesday, August 25, 2020
12:00 - 1:00 PM
Contact: Rebecca Harrington

Online registration is available until: 8/25/2020




We will show how natural language processing (NLP) models, that is, neural network models of language, can be used to significantly improve analytics in various fields, including healthcare and finance. Machine learning models of language are arguably the most important big data tools for both the natural and social sciences in the 21st century. Textual data is by far the fastest-growing kind of data, outpacing even genome sequencing data. The WellAI team has trained neural language models on large collections of medical studies, such as the 25 million+ articles available on PubMed and the almost 200,000 articles on the novel coronavirus (COVID-19) available through the CORD-19 dataset.




The WellAI COVID-19 model [link] allows researchers to quickly elucidate relationships between thousands of concepts drawn from tens of thousands or even millions of studies, without having to fit any closed-form statistical models or distributions.
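To make "relationships between concepts" concrete, here is a minimal sketch in Python. It is not the WellAI model: it swaps the neural network for plain co-occurrence counts, and the three toy "studies" and their concept labels are invented for illustration. It only shows the general idea that a relationship score can be read off text without fitting a closed-form distribution.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence_vectors(docs):
    """Map each concept to a Counter of the concepts it shares a document with."""
    vectors = {}
    for concepts in docs:
        # Count each unordered pair of distinct concepts once per document.
        for a, b in combinations(sorted(set(concepts)), 2):
            vectors.setdefault(a, Counter())[b] += 1
            vectors.setdefault(b, Counter())[a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus: each "study" is reduced to the concepts it mentions.
docs = [
    ["fever", "cough", "covid-19"],
    ["fever", "cough", "influenza"],
    ["fracture", "cast"],
]

vectors = cooccurrence_vectors(docs)
related = cosine(vectors["covid-19"], vectors["influenza"])
unrelated = cosine(vectors["covid-19"], vectors["fracture"])
print(f"covid-19 vs influenza: {related:.2f}")
print(f"covid-19 vs fracture:  {unrelated:.2f}")
```

In this toy corpus the two illnesses appear alongside the same symptoms and score as closely related, while the unrelated pair scores zero. A neural language model plays the same role at far larger scale and with much richer representations.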




The notion of finding relationships from textual data can be applied to portfolio management. It is well known that the joint distribution of securities is more important for portfolio risk estimation than their marginal distributions. At the same time, financial market data spans only a few decades at best, which is not enough to fit a joint distribution of any complexity. Even for a Gaussian distribution, the number of data points available is frequently smaller than the number of parameters to be estimated, which creates an insurmountable problem. Moreover, the Gaussian copula is not even the most realistic model of joint behavior for financial securities. Estimating the joint behavior of power-law models, or models of similar complexity, requires exponentially more data than the Gaussian case and is thus hopeless. Fortunately, the wealth of textual data provides researchers with more than enough degrees of freedom to overcome this 'curse of dimensionality' and to make parameter estimation for portfolio management much more efficient.
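The Gaussian case above can be checked with simple counting. The sketch below is illustrative only: the universe size and history length are assumed figures, not numbers from the webinar. It compares the number of observed returns with the number of free parameters of a multivariate Gaussian (means plus the distinct entries of a symmetric covariance matrix), and reports the standard upper bound on the rank of a sample covariance matrix.

```python
def gaussian_param_count(n_assets):
    """Free parameters of an n-dimensional Gaussian: n means plus
    n * (n + 1) / 2 distinct entries of the symmetric covariance matrix."""
    means = n_assets
    covariance = n_assets * (n_assets + 1) // 2
    return means + covariance


def max_sample_cov_rank(n_assets, n_periods):
    """Upper bound on the rank of a sample covariance matrix computed from
    n_periods observations (one degree of freedom is used by the mean)."""
    return min(n_assets, n_periods - 1)


# Assumed figures for illustration: 1,000 securities, 10 years of monthly returns.
n_assets = 1000
n_periods = 120

data_points = n_assets * n_periods
params = gaussian_param_count(n_assets)
rank_bound = max_sample_cov_rank(n_assets, n_periods)

print(f"observed returns:       {data_points}")
print(f"parameters to estimate: {params}")
print(f"covariance rank bound:  {rank_bound} of {n_assets}")
```

Under these assumptions there are 120,000 observed returns against 501,500 parameters, and the sample covariance matrix has rank at most 119 out of 1,000, so it cannot be inverted without additional structure or data.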




While returns data is limited, textual data related to finance is growing exponentially. One important question is whether textual data actually contains useful information about the behavior of securities, and specifically about their interrelationships. That it does is an assumption of our approach. It is not a heroic assumption: outside of quantitative analytics, much of the analysis of financial securities is done precisely on textual data, e.g. earnings calls, analyst reports, industry analyses, and consumer reviews. Neural network models of language allow us to estimate mathematical relationships between concepts. Moreover, neural networks implicitly contain joint distributions of almost any complexity. Neural language models therefore give us two things that we are missing in finance: a virtually unlimited amount of data and a mathematical model that reflects the complexity of joint distributions. They contain everything we need in order to estimate relationships between concepts or variables. Can we make the results of such a model work for us in portfolio management? Join us on August 25 at 12pm EST to find out.



Sergei Polevikov, WellAI and SQA Board

Daniel Satchkov, WellAI and RiXtrema


Click here for speaker bios.

Click here for full flyer.

Click here to view official Press Release.


Some webinar content is based on the following paper.

Register here for FREE!
