29
Finding Currency Values
Matching currency values in text files as they are ingested is a useful technique. Regular expressions are ideally suited to this.
Here we match a currency value with an optional pence amount:
\£[0-9]+(\.[0-9][0-9])?
The £ symbol implies this must be a Unicode string. The + operator means we must have at least one digit after the £. The parentheses group the decimal point and the two pence digits into a single unit, and the ? operator makes that whole group optional: it appears either not at all or exactly once.
We could use the bounds operator like this to mean the same thing:
\£[0-9]+(\.[0-9]{2})?
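As a rough illustration, the two patterns can be tried out with Python's re module. This is only a sketch: the function name find_currency, the constant names, and the sample text are hypothetical and not part of the original recipe. Python 3 strings are Unicode by default, so the £ symbol needs no special handling here.

```python
import re

# Both patterns are copied from the text; the backslash before £ is kept as written.
PENCE_REPEATED = re.compile(r"\£[0-9]+(\.[0-9][0-9])?")
PENCE_BOUNDED = re.compile(r"\£[0-9]+(\.[0-9]{2})?")


def find_currency(text, pattern=PENCE_BOUNDED):
    """Return every currency value found in text, in order of appearance.

    find_currency is a hypothetical helper name used only for this sketch.
    """
    # group(0) is the whole match, including the optional pence group.
    return [match.group(0) for match in pattern.finditer(text)]


# Hypothetical sample input, invented for illustration.
sample = "Invoice: £12.50 for parts, £300 for labour, and £7.5 postage."

print(find_currency(sample))                  # ['£12.50', '£300', '£7']
print(find_currency(sample, PENCE_REPEATED))  # same result with the first pattern
```

Note that £7.5 is reported as £7: the optional group needs exactly two digits after the decimal point, so a single trailing digit is left out of the match rather than causing the whole match to fail.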