29

Finding Currency Values

Matching currency values in text files as they are ingested is a useful technique. Regular expressions are ideally suited to this.

Here we match a currency value with an optional pence amount:

\£[0-9]+(\.[0-9][0-9])?

The £ symbol lies outside 7-bit ASCII, so the text must be handled in an encoding that includes it, such as Unicode. The + operator means there must be at least one digit after the £. The parentheses group the decimal point and the two pence digits into a single unit, and the ? operator makes that unit optional: it appears either once or not at all. We could use the bounds operator like this to mean the same thing:

\£[0-9]+(\.[0-9]{2})?
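
To see both forms of the expression at work, here is a minimal sketch in Python. The choice of Python, the helper name find_currency, and the sample sentence are assumptions made for illustration and do not come from the original text. The sketch applies each pattern to the same line of text and collects whatever matches.

```python
import re

# The two patterns from the text; raw strings keep the backslashes intact.
PENCE_REPEATED = re.compile(r"\£[0-9]+(\.[0-9][0-9])?")
PENCE_BOUNDED = re.compile(r"\£[0-9]+(\.[0-9]{2})?")


def find_currency(text, pattern=PENCE_BOUNDED):
    """Return every substring of text that the pattern matches.

    find_currency is a hypothetical helper written for this example.
    """
    return [match.group(0) for match in pattern.finditer(text)]


# Hypothetical sample input containing three currency values.
sample = "Tea costs £3, lunch £12.50, and the ticket was £7.5 on the day."

print(find_currency(sample, PENCE_REPEATED))  # ['£3', '£12.50', '£7']
print(find_currency(sample, PENCE_BOUNDED))   # ['£3', '£12.50', '£7']
```

Both patterns return the same list. Note that £7.5 is reported as £7: the optional group requires exactly two digits after the decimal point, so a single trailing digit is left out of the match.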
