Feature extraction

Feature extraction is a key step in text mining, and a system that can extract features from text has the potential to be used in many applications. The first step in feature extraction is to tag the document; the tagged document is then processed to extract the meaningful entities that are required.

The elements that can be extracted from the text are:

  • Entities: Pieces of meaningful information found in the document, for example, locations, companies, people, and so on
  • Attributes: Features of the extracted entities, for example, the title of a person, the type of an organization, and so on
  • Events: These are the activities in which the entities participate, for ...
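The two-step flow described above (tag the document, then process the tags to pull out entities and their attributes) can be sketched roughly as follows. This is a minimal illustration in Python, not the method used in the book: the lookup table `GAZETTEER`, the `TITLES` list, the `Entity` class, and the `tag` and `extract` functions are all hypothetical names invented for this example, and a real system would use a trained tagger rather than a hand-written lookup. The sketch covers entities and attributes only.

```python
import re
from dataclasses import dataclass, field

# Hypothetical gazetteer: surface form -> entity type. A real system would
# rely on a trained tagger instead of a hand-written lookup table.
GAZETTEER = {
    "Ada Lovelace": "PERSON",
    "London": "LOCATION",
    "Acme Corp": "COMPANY",
}

# Hypothetical titles treated as an attribute of a PERSON entity.
TITLES = ("Dr.", "Prof.", "Ms.", "Mr.")


@dataclass
class Entity:
    text: str
    type: str
    start: int
    attributes: dict = field(default_factory=dict)


def tag(text):
    """Step 1: tag the document, returning sorted (start, end, type) spans."""
    spans = []
    for surface, etype in GAZETTEER.items():
        for match in re.finditer(re.escape(surface), text):
            spans.append((match.start(), match.end(), etype))
    return sorted(spans)


def extract(text, spans):
    """Step 2: turn tagged spans into entities and attach attributes."""
    entities = []
    for start, end, etype in spans:
        entity = Entity(text[start:end], etype, start)
        if etype == "PERSON":
            # A title directly before the name is recorded as an attribute.
            prefix = text[:start].rstrip()
            for title in TITLES:
                if prefix.endswith(title):
                    entity.attributes["title"] = title
        entities.append(entity)
    return entities


doc = "Dr. Ada Lovelace joined Acme Corp in London."
entities = extract(doc, tag(doc))
for e in entities:
    print(e.type, e.text, e.attributes)
```

Keeping tagging and extraction as separate functions mirrors the description in the text: the tagger can be swapped for a statistical model without touching the code that builds entities and attributes from the tagged spans.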
