Lexemes and tokens

A lexeme is a sequence of characters in the source program that matches the pattern for a token. Each token has a pattern, and a single pattern can often be matched by many different lexemes. As a result, a programming language has a limited number of tokens but an unlimited number of potential lexemes.
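To make the relationship between a pattern and its lexemes concrete, here is a minimal sketch in TypeScript. The regular expression and the `classify` helper are hypothetical and simplified for illustration; they are not taken from any real compiler.

```typescript
// Hypothetical, simplified pattern for an Identifier token.
// Real languages usually allow a wider range of characters.
const identifierPattern = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Returns the token name when the lexeme matches the pattern,
// or undefined when it does not.
function classify(lexeme: string): string | undefined {
  return identifierPattern.test(lexeme) ? "Identifier" : undefined;
}

// Four different lexemes, all matched by the same pattern,
// so all of them map to the same token.
const lexemes = ["y", "t", "counter", "_tmp42"];
const kinds = lexemes.map(classify);

console.log(kinds);
```

Any number of other names would match the same pattern, which is why the set of lexemes is open-ended while the set of tokens stays small.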

The easiest way to understand the difference between a lexeme and a token is to take a look at an example, such as the following code snippet:

while (y >= t) y = y - 3; 

The preceding code snippet will be split into the following lexemes and tokens:

Lexeme    Token

while     WhileKeyword
(         OpenParenToken
y         Identifier
>=        GreaterThanEqualsToken
t         Identifier
)         CloseParenToken
...
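The mapping in the table above can be reproduced with a small hand-written scanner. The following is only a sketch under stated assumptions, not how the TypeScript compiler implements scanning. The token names for the first six lexemes come from the table; the names used for `=`, `-`, `3`, and `;` are assumptions chosen to follow the same naming style, and the `Token` interface, `rules` list, and `tokenize` function are hypothetical.

```typescript
interface Token {
  kind: string;   // the token name
  lexeme: string; // the matched source text
}

// Ordered list of [token name, pattern]. The first matching rule wins, so the
// keyword comes before Identifier and ">=" comes before "=".
const rules: Array<[string, RegExp]> = [
  ["WhileKeyword", /^while\b/],
  ["Identifier", /^[A-Za-z_$][A-Za-z0-9_$]*/],
  ["NumericLiteral", /^\d+/],          // assumed name, not in the table
  ["GreaterThanEqualsToken", /^>=/],
  ["EqualsToken", /^=/],               // assumed name, not in the table
  ["MinusToken", /^-/],                // assumed name, not in the table
  ["OpenParenToken", /^\(/],
  ["CloseParenToken", /^\)/],
  ["SemicolonToken", /^;/],            // assumed name, not in the table
];

function tokenize(source: string): Token[] {
  const tokens: Token[] = [];
  let rest = source;

  while (rest.length > 0) {
    // Whitespace separates lexemes but produces no token in this sketch.
    const ws = /^\s+/.exec(rest);
    if (ws) {
      rest = rest.slice(ws[0].length);
      continue;
    }

    let matched = false;
    for (const [kind, pattern] of rules) {
      const m = pattern.exec(rest);
      if (m) {
        tokens.push({ kind, lexeme: m[0] });
        rest = rest.slice(m[0].length);
        matched = true;
        break;
      }
    }

    if (!matched) {
      throw new Error(`Unexpected character: ${rest[0]}`);
    }
  }

  return tokens;
}

const tokens = tokenize("while (y >= t) y = y - 3;");
for (const { lexeme, kind } of tokens) {
  console.log(`${lexeme}\t${kind}`);
}
```

Note that the lexeme `y` appears three times in the snippet and each occurrence produces its own `Identifier` token, while `t` produces the same kind of token from a different lexeme.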
