Chapter 20. Word Morphology

Khurshid Ahmad

Duncan C. White

Natural Language Processing (NLP) is a branch of Artificial Intelligence involving computer programs that understand or generate documents written in “natural language”—that is, any human language, like English, Hebrew, Swahili, or Xhosa. Creating programs that exhibit full understanding of natural language has long been a goal of AI. Some typical NLP applications might be:

  • Word assistance to users. For example, a human might ask, “What is the adverb form of ‘accident’?” and the computer might reply, “‘accidentally’ is probably the word you want, although 3% of users spell it ‘accidently’.”

  • A smarter web search engine that lets you search for a keyword such as “compute” to retrieve all documents that contain that keyword, or conceptually related keywords like “computationally.”

  • A smart document categorizer that reads a series of documents and sorts them into different categories based upon disproportionate use of particular keywords (“This document appears to be about nuclear physics, because it mentions keywords like ‘atom’ and ‘nucleus’ far more than an average document would”).

  • A smart document summarizer that summarizes one or more documents into a more compact form and presents a digest (see “Summarizing Web Pages with HTML::Summary” in Web, Graphics, and Perl/Tk: Best of the Perl Journal).
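
The “disproportionate use of keywords” idea behind the document categorizer can be sketched in a few lines. What follows is a minimal illustration in Python, not the chapter's own method: the function names `tokenize` and `keyword_ratio`, the add-one smoothing, and the two sample texts are all assumptions made for this example. It compares how often a word appears in one document with how often it appears in a background text standing in for “an average document.”

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic words as a list."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_ratio(document, background, keyword):
    """Return how many times more frequent `keyword` is in `document`
    than in `background`, using relative (per-word) frequencies.

    A value well above 1 suggests disproportionate use of the keyword.
    """
    doc_words = tokenize(document)
    bg_words = tokenize(background)
    # Relative frequency of the keyword in the document under test.
    doc_rate = Counter(doc_words)[keyword] / max(len(doc_words), 1)
    # Add-one smoothing (an assumption of this sketch) so that a keyword
    # missing from the background text does not cause division by zero.
    bg_rate = (Counter(bg_words)[keyword] + 1) / (len(bg_words) + 1)
    return doc_rate / bg_rate


# Hypothetical sample texts, invented for this example.
physics_doc = "The atom has a nucleus. The nucleus holds most of the atom mass."
background = "The cat sat on the mat. The dog saw an atom of dust on the mat."

for word in ("nucleus", "atom", "the"):
    print(word, round(keyword_ratio(physics_doc, background, word), 2))
```

With these sample texts, content words such as “nucleus” score above 1 while a common word such as “the” scores below 1. A real categorizer would need a much larger background corpus and a way to treat related word forms (“atom,” “atomic,” “atoms”) as the same keyword.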

All of these programs require some understanding of natural language. For perfect summarization or perfect translation, we’d need total understanding, ...
