The Handbook of News Analytics in Finance by Leela Mitra, Gautam Mitra

3.4 A FRAMEWORK FOR REAL-TIME NEWS ANALYTICS

The core of our real-time news analysis engine is a scoring method that assesses the relative volume and significance of news in a specific category. For instance, we wish to identify periods when the volume of news about foreign exchange markets is abnormally high, or when there is a flurry of macroeconomic news announcements.

For a given topic, say foreign exchange news, the scoring procedure has the following parameters:

  • A list of keywords/key phrases and real-valued weights: (W1, γ1), …, (Wk, γk).
  • A rolling window size, l (typically about 5–10 minutes).
  • A calibration rolling window size, L (typically about 90 days).

The keywords list and the last l minutes of news are used to create a raw score, and this score is normalized/calibrated using statistics about the news over the last L days (as described below).
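To make the parameterization concrete, the three inputs above can be gathered into a single configuration object. The following is a minimal sketch only: the class name `TopicScoringParams`, its field names, and the example keywords and weights are hypothetical and do not come from the text; the defaults simply reflect the "typically" values quoted above.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class TopicScoringParams:
    """Hypothetical container for the scoring parameters of one topic."""

    # Maps each keyword/key phrase W_i to its real-valued weight gamma_i.
    keyword_weights: dict[str, float]
    # Rolling window size l (the text suggests about 5-10 minutes).
    window: timedelta = timedelta(minutes=5)
    # Calibration rolling window size L (the text suggests about 90 days).
    calibration_window: timedelta = timedelta(days=90)

    def __post_init__(self) -> None:
        # Basic sanity checks; these constraints are our assumption.
        if not self.keyword_weights:
            raise ValueError("at least one keyword is required")
        if self.window <= timedelta(0):
            raise ValueError("window must be positive")
        if self.calibration_window <= self.window:
            raise ValueError("calibration window must exceed the rolling window")


# Illustrative (made-up) keywords and weights for a foreign exchange topic.
fx_params = TopicScoringParams(
    keyword_weights={"euro": 1.0, "exchange rate": 2.0},
    window=timedelta(minutes=10),
)
```

Keeping the two window sizes as `timedelta` values avoids mixing minutes and days in later arithmetic.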

3.4.1 Assigning scores to news

The score at a given point in time, t, is assigned as follows: Let (w1, …, wk) be the vector of keyword frequencies in the time interval [t − l, t) (i.e., wi is the number of times word/phrase Wi has appeared in the last l minutes). The raw score at time t is then defined to be:

[Equation: definition of the raw score at time t]

In this form, the raw score will tend to be high when news volume is high, and so we calibrate/normalize the score using the calibration rolling window: We maintain a record of the scores that have been assigned ...
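The raw-score step of Section 3.4.1 can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: we assume the raw score is the weighted sum of keyword frequencies, γ1·w1 + … + γk·wk, which is consistent with the remark that the score grows with news volume but is our reading rather than a formula taken from the text. The function names, the case-insensitive whole-word matching rule, and the sample headlines and weights are likewise hypothetical.

```python
import re
from datetime import datetime, timedelta


def keyword_counts(items, keywords, t, window):
    """Count keyword occurrences in news items with t - window <= timestamp < t.

    `items` is an iterable of (timestamp, text) pairs. Matching is
    case-insensitive on whole words/phrases (an assumption of this sketch).
    """
    start = t - window
    counts = {kw: 0 for kw in keywords}
    for ts, text in items:
        if start <= ts < t:  # half-open interval [t - l, t)
            lowered = text.lower()
            for kw in keywords:
                pattern = r"\b" + re.escape(kw.lower()) + r"\b"
                counts[kw] += len(re.findall(pattern, lowered))
    return counts


def raw_score(counts, weights):
    """ASSUMED form of the raw score: sum over i of gamma_i * w_i."""
    return sum(weights[kw] * counts[kw] for kw in weights)


# Made-up weights and headlines, for illustration only.
weights = {"euro": 1.0, "exchange rate": 2.0, "rally": -0.5}
t = datetime(2024, 1, 1, 12, 0)
news = [
    (t - timedelta(minutes=1), "Euro slips as exchange rate fears grow"),
    (t - timedelta(minutes=3), "Euro rally fades"),
    (t - timedelta(minutes=30), "Euro steady"),     # older than the window
    (t, "Euro jumps"),                              # at t itself, so excluded
]

counts = keyword_counts(news, weights, t, timedelta(minutes=5))
score = raw_score(counts, weights)
print(counts)  # {'euro': 2, 'exchange rate': 1, 'rally': 1}
print(score)   # 3.5
```

Only the two headlines inside the five-minute window contribute, so the score is 1.0·2 + 2.0·1 − 0.5·1 = 3.5. Because nothing in this sum is divided by the total number of news items, a busier news period raises the raw score even if the topic's share of the news is unchanged, which is what motivates the calibration step.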
