CHAPTER 2 Financial Data Structures

2.1 Motivation

In this chapter we will learn how to work with unstructured financial data and how to derive from it a structured dataset amenable to ML algorithms. In general, you do not want to consume someone else's processed dataset, as the likely outcome is that you will discover what someone else already knows or will figure out soon. Ideally, your starting point is a collection of unstructured, raw data that you are going to process in a way that leads to informative features.

2.2 Essential Types of Financial Data

Financial data comes in many shapes and forms. Table 2.1 shows the four essential types of financial data, ordered from left to right in terms of increasing diversity. Next, we will discuss their different natures and applications.

Table 2.1 The Four Essential Types of Financial Data

Fundamental Data    Market Data                       Analytics                   Alternative Data
• Assets            • Price/yield/implied volatility  • Analyst recommendations   • Satellite/CCTV images
• Liabilities       • Volume                          • Credit ratings            • Google searches
• Sales             • Dividend/coupons                • Earnings expectations     • Twitter/chats
• Costs/earnings    • Open interest                   • News sentiment            • Metadata
• Macro variables   • Quotes/cancellations            • . . .                     • . . .
• . . .             • Aggressor side
                    • . . .

2.2.1 Fundamental Data

Fundamental data encompasses information that can be found in regulatory filings and business analytics. It is mostly accounting data, reported quarterly. A particular aspect of this data is that it is reported with a lapse, that is, some time after the period it describes. ...
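
To make the reporting lapse concrete, the following is a minimal sketch rather than a definitive implementation. It assumes each report carries both the end of the period it covers and the date it was released, and it looks up the most recent report that was publicly available on a given day. The REPORTS values, the tuple layout, and the latest_known helper are all hypothetical and exist only for illustration.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical quarterly reports: (period_end, release_date, earnings_per_share).
# All dates and values are made up for illustration.
REPORTS = [
    (date(2023, 3, 31), date(2023, 5, 10), 1.10),
    (date(2023, 6, 30), date(2023, 8, 9), 1.25),
    (date(2023, 9, 30), date(2023, 11, 8), 0.95),
]


def latest_known(reports, as_of):
    """Return the most recently released report available on `as_of`, or None."""
    # Order by release date, not by period end: availability is what matters.
    ordered = sorted(reports, key=lambda r: r[1])
    release_dates = [r[1] for r in ordered]
    # Count the reports released on or before `as_of`.
    idx = bisect_right(release_dates, as_of)
    return ordered[idx - 1] if idx else None


# The second quarter has ended by mid-July, but its report is not yet released,
# so the lookup falls back to the first-quarter report.
print(latest_known(REPORTS, date(2023, 7, 15)))
```

Indexing on the release date rather than the period end is the design choice that matters here: a lookup keyed on the period end would treat a value as known before it was actually published.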
