Unicode Text
Unicode text is text in UTF-16 encoding, as opposed to string, which has the MacRoman encoding. Unicode is the native system-level encoding of Mac OS X, so text supplied by the System is often Unicode text rather than a string. For example:
tell application "Finder" to set x to (get name of disk 1)
class of x -- Unicode text
Similarly, some Mac OS X-native applications, such as TextEdit, return text values as Unicode text. Unicode is capable of expressing tens of thousands of characters, and in its fullest form will express about a million, embracing every character of every written language in history. Eventually we may expect that AppleScript will become completely Unicode-savvy; all AppleScript text will be Unicode text, and the old string type will fade into oblivion.
Unicode text is basically indistinguishable from a string; the differences between them are handled transparently. Whatever you can do to a string, you can do to Unicode text. If you get an element of a Unicode text value, the result is Unicode text. If you concatenate Unicode text and a string, the result is Unicode text (though if you concatenate a string and Unicode text, you get a string; this is troublesome and might change in a future version of AppleScript). You can explicitly coerce between a string and Unicode text, and AppleScript implicitly coerces for you as appropriate.
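The following is a minimal sketch of the rules just described, not a definitive test. The variable names u, s, and firstChar are illustrative only. On the version of AppleScript discussed here, the resulting list should report Unicode text for the element, Unicode text for the first concatenation, and string for the second. Later versions of AppleScript (2.0 and after), where all text is Unicode, are expected to report text for all three.

```applescript
-- Explicit coercions give us one value of each class to experiment with.
set u to "hello" as Unicode text
set s to "world" as string

-- An element of a Unicode text value.
set firstChar to character 1 of u

-- Collect the classes in a list so that all three appear in the result.
-- In a concatenation, the class of the left operand determines the class of the result.
{class of firstChar, class of (u & s), class of (s & u)}
```

If the order of operands cannot be controlled, coercing the result explicitly (for example, (s & u) as Unicode text) is one way to avoid depending on the asymmetry.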
Nevertheless, Unicode text is currently still a second-class citizen in AppleScript, and can be hard to work with. You ...