Hack #23. Straighten Smart Quotes

Convert curly quotes, apostrophes, and other fancy typographical symbols back to their ASCII equivalents.

Have you ever copied a block of text from a web site and pasted it into a text editor (or into a weblog post of your own)? The text comes through, but all the apostrophes and quote marks end up as random-looking symbols. The web site uses fancy publishing software to produce smart quotes and apostrophes, but your text editor doesn't understand them. This hack dumbs down these fancy typographical symbols to their ASCII equivalents.

The Code

This user script runs on all pages. It constructs an array of fancy characters (by their Unicode representation). Then, it gets a list of all the text nodes on the page and executes a search-and-replace on each node to convert each fancy character to a plain-text equivalent.
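The approach just described can be sketched roughly as follows. This is a minimal illustration, not the full script: the table REPLACEMENTS and the function names straighten and straightenPage are hypothetical, and the table covers only a handful of characters. The page-walking step uses the standard document.evaluate XPath call and is skipped when no DOM is present.

```javascript
// Hypothetical table: a few fancy characters (as Unicode escapes)
// mapped to plain-text stand-ins. A real table would likely be longer.
const REPLACEMENTS = {
  "\u2018": "'",   // left single quotation mark
  "\u2019": "'",   // right single quotation mark (apostrophe)
  "\u201C": '"',   // left double quotation mark
  "\u201D": '"',   // right double quotation mark
  "\u2026": "...", // horizontal ellipsis
  "\u2013": "-",   // en dash
  "\u2014": "--"   // em dash
};

// Replace every listed character in a single string.
// split/join is used here only to keep the sketch short.
function straighten(text) {
  let result = text;
  for (const [fancy, plain] of Object.entries(REPLACEMENTS)) {
    result = result.split(fancy).join(plain);
  }
  return result;
}

// Collect all text nodes with XPath and rewrite each one in place.
function straightenPage(doc) {
  const nodes = doc.evaluate("//text()", doc, null,
    XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
  for (let i = 0; i < nodes.snapshotLength; i++) {
    const node = nodes.snapshotItem(i);
    node.data = straighten(node.data);
  }
}

// Only touch the page when a DOM is actually available.
if (typeof document !== "undefined") {
  straightenPage(document);
}
```

Keeping the string conversion in its own function means it can be tried on sample text without loading a page.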

Tip

Learn more about Unicode at http://www.unicode.org.

In JavaScript, the replace method accepts a regular expression object as its first parameter. For performance reasons, we build all our regular expressions first and then reuse them every time through the loop. If we had used the inline regular expression syntax, Firefox would need to rebuild each regular expression object every time through the loop, a significant performance drain on large pages!
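To illustrate the build-once, reuse-many idea, here is a small sketch. The names PLAIN, COMPILED, and straightenAll are made up for this example, and the character table is a deliberately short subset. Each RegExp object is constructed a single time, before any text is processed.

```javascript
// Hypothetical subset of fancy characters and their plain equivalents.
const PLAIN = {
  "\u2018": "'",
  "\u2019": "'",
  "\u201C": '"',
  "\u201D": '"'
};

// Build one global RegExp per character, once, outside the loop.
const COMPILED = Object.keys(PLAIN).map(function (ch) {
  return { pattern: new RegExp(ch, "g"), plain: PLAIN[ch] };
});

// Reuse the prebuilt patterns for every string that comes through.
function straightenAll(strings) {
  return strings.map(function (s) {
    for (const entry of COMPILED) {
      s = s.replace(entry.pattern, entry.plain);
    }
    return s;
  });
}
```

Calling straightenAll with the text of many nodes constructs no new RegExp objects, no matter how long the list is.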

Save the following user script as dumbquotes.user.js:

	// ==UserScript==
	// @name		DumbQuotes
	// @namespace	http://diveintomark.org/projects/greasemonkey/
	// @description	straighten ...
