2.5 DISCUSSION

The various techniques and metrics fall into two broad categories: supervised and unsupervised learning methods. Supervised models use machine-learning algorithms based on well-specified input variables. One may think of this as a generalized regression model. In unsupervised learning, there are no explicit input variables, only latent ones (e.g., cluster analysis). Most of the news analytics we explored relate to supervised learning, such as the various classification algorithms. This is well-trodden research. It is the domain of unsupervised learning (for example, the community detection algorithms and centrality computation) that has been less explored and potentially offers the greatest promise going forward.
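To make the idea of centrality computation concrete, the following is a minimal sketch in Python, not the implementation used elsewhere in this book. It assumes a small, made-up undirected network of four firms (the names A to D and the links between them are purely illustrative) and computes two common measures: degree centrality, and eigenvector centrality obtained by simple power iteration. The function names are hypothetical helpers introduced for this example.

```python
from collections import defaultdict

# Hypothetical undirected links between firms (illustrative only, not data from the text).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]


def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def degree_centrality(adj):
    """Number of neighbors divided by the maximum possible (n - 1)."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def eigenvector_centrality(adj, iterations=100, tol=1e-10):
    """Power iteration: each node's score is the sum of its neighbors' scores.

    Scores are rescaled after every pass so that the largest equals 1.
    Assumes a connected, non-bipartite graph so that the iteration settles.
    """
    scores = {node: 1.0 for node in adj}
    for _ in range(iterations):
        new = {node: sum(scores[nbr] for nbr in adj[node]) for node in adj}
        norm = max(new.values())
        new = {node: s / norm for node, s in new.items()}
        if max(abs(new[node] - scores[node]) for node in adj) < tol:
            return new
        scores = new
    return scores


adj = build_adjacency(edges)
degree = degree_centrality(adj)
eigen = eigenvector_centrality(adj)

# Print nodes from most to least central by eigenvector score.
for node in sorted(adj, key=lambda k: -eigen[k]):
    print(node, round(degree[node], 3), round(eigen[node], 3))
```

In this toy network, firm A is linked to every other firm and so ranks highest on both measures, while D, connected only through A, ranks lowest. No labeled outcome variable is needed, which is what places such network measures on the unsupervised side of the distinction drawn above.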

Classifying news to generate sentiment indicators has been well worked out, as many of the chapters in this book show. It is the networks over which financial information is transmitted that have been much less studied, and that is where I expect most of the growth in news analytics to come from. For example, how quickly does good news about a tech company proliferate to other companies? We looked at issues like this in Das and Sisk (2005), discussed earlier, where we assessed whether knowledge of the network might be exploited profitably. Information also travels by word of mouth, and these information networks are likewise open to much further examination (see Godes et al., 2005). Inside (not insider) information is also transmitted in venture ...
