N-grams

An N-gram is a contiguous sequence of N words or tokens in a given sentence or other continuous run of text. N is an integer starting from 1, so an N-gram can be a Uni-Gram (N=1), a Bi-Gram (N=2), a Tri-Gram (N=3), and so on. N-gram algorithms identify every sequence of N adjacent tokens in a tokenized sentence. They work as a sliding window: the window starts at the left-most word and then moves one token to the right at each step. Let's see it with an example sentence, This is Big Data AI Book. Its Uni-Grams, Bi-Grams, and Tri-Grams are:

Uni-Grams: "This", "is", "Big", "Data", "AI", "Book"
Bi-Grams: "This is", "is Big", "Big Data", "Data AI", "AI Book"
Tri-Grams: "This is Big", "is Big Data", "Big Data AI", "Data AI Book"
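The sliding-window idea can be sketched in a few lines of Python. This is a minimal illustration, not the book's own code: the helper name ngrams is hypothetical, and tokenization is assumed to be a plain whitespace split.

```python
def ngrams(tokens, n):
    """Return every run of n adjacent tokens, moving the window one token at a time."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # The window starts at index 0 and stops once fewer than n tokens remain.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Assumption: whitespace tokenization is good enough for this example sentence.
tokens = "This is Big Data AI Book".split()

unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

for name, grams in (("Uni-Grams", unigrams), ("Bi-Grams", bigrams), ("Tri-Grams", trigrams)):
    print(name, [" ".join(g) for g in grams])
```

A sentence of k tokens yields k - N + 1 N-grams, so the six-token example gives six Uni-Grams, five Bi-Grams, and four Tri-Grams. If N is larger than the number of tokens, the sketch returns an empty list.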

N-grams are used for developing efficient features that are ...
