Cover by Brian Carper, Christophe Grand, Chas Emerick


The Reader

Although Clojure’s compilation and evaluation machinery operates exclusively on Clojure data structures, the practice of programming has not yet progressed beyond storing code as plain text. Thus, a way is needed to produce those data structures from textual code. This task falls to the Clojure reader.

The operation of the reader is completely defined by a single function, read, which reads text content from a character stream[8] and returns the next data structure encoded in the stream’s content. This is what the Clojure REPL uses to read text input; each complete data structure read from that input source is then passed on to be evaluated by the Clojure runtime.
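As a minimal sketch of calling read directly, the following assumes the text is wrapped in a java.io.PushbackReader over a java.io.StringReader; the names rdr, first-form, second-form, and third-form are illustrative only:

```clojure
(import '(java.io PushbackReader StringReader))

;; read consumes one complete form per call and leaves the rest of the
;; stream in place for the next call.
(def rdr (PushbackReader. (StringReader. "(+ 1 2) [3 4] :done")))

(def first-form  (read rdr))   ; the list (+ 1 2), read but not evaluated
(def second-form (read rdr))   ; the vector [3 4]
(def third-form  (read rdr))   ; the keyword :done

;; Evaluation is a separate step applied to the data structure that was read.
(eval first-form)
;= 3
```

Each call returns plain data; nothing is evaluated until the result is handed to eval, which mirrors the read-then-evaluate cycle of the REPL described above.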

More convenient for exploration’s sake is read-string, a function that does the same thing as read but uses a string argument as its content source:

(read-string "42")
;= 42
(read-string "(+ 1 2)")
;= (+ 1 2)

The operation of the reader is fundamentally one of deserialization. Clojure data structures and other literals have a particular textual representation, which the reader deserializes to the corresponding values and data structures.

You may have noticed that values printed by the Clojure REPL have the same textual representation they do when entered into the REPL: numbers and other atomic literals are printed as you’d expect, lists are delimited by parentheses, vectors by square brackets, and so on. This is because there are duals to the reader’s read and read-string functions: pr and pr-str, which respectively print to *out*[9] and return as a string the readable textual representation of Clojure values. Thus, Clojure data structures and values are trivially serialized and deserialized in a way that is both human- and reader-readable:

(pr-str [1 2 3])
;= "[1 2 3]"
(read-string "[1 2 3]")
;= [1 2 3]


It is common for Clojure applications to use the reader as a general-purpose serialization mechanism where you might otherwise choose XML, Java serialization, pickling, or marshaling, especially in cases where human-readable serializations are desirable.
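A round trip of this kind might look like the sketch below. The settings map is hypothetical, and the example assumes a Clojure version that ships the clojure.edn namespace (1.5 or later); its read-string accepts data only, which makes it a common choice when the text does not come from a trusted source.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical application data mixing several collection types.
(def settings {:name "demo" :ports [8080 8443] :tags #{:a :b}})

;; Serialize to readable text, then read the text back into data.
(def serialized (pr-str settings))
(def restored   (edn/read-string serialized))

(string? serialized)
;= true
(= settings restored)
;= true
```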

Scalar Literals

Scalar literals are reader syntax for noncollection values. Many of these are bread-and-butter types that you already know intimately from Java or very similar analogues in Ruby, Python, and other languages; others are specific to Clojure and carry new semantics.


Strings

Clojure strings are Java Strings (that is, instances of java.lang.String), and are represented in exactly the same way, delimited by double quotes:

"hello there"
;= "hello there"

Clojure’s strings are naturally multiline-capable, without any special syntax (as in, for example, Python):

"multiline strings
are very handy"
;= "multiline strings\nare very handy"


Booleans

The tokens true and false are used to denote literal Boolean values in Clojure, just as in Java, Ruby, and Python (modulo the latter’s capitalization).


nil

nil in Clojure corresponds to null in Java, nil in Ruby, and None in Python. nil is also logically false in Clojure conditionals, as it is in Ruby and Python.


Characters

Character literals are denoted by a backslash:

(class \c)
;= java.lang.Character

Both Unicode and octal representations of characters may be used with corresponding prefixes:

\u00ff
;= \ÿ
\o41
;= \!

Additionally, there are a number of special named character literals for cases where the character in question is commonly used but prints as whitespace:

  • \space

  • \newline

  • \formfeed

  • \return

  • \backspace

  • \tab
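The character syntaxes above can be checked against one another at the REPL. This is only a small sketch; the particular characters chosen are arbitrary:

```clojure
;; Named literals are the same characters found inside string escapes.
(= \newline (first "\n"))
;= true
(= \tab (first "\t"))
;= true

;; Characters are java.lang.Character values with ordinary code points.
(int \space)
;= 32

;; Unicode (\u) and octal (\o) prefixes denote the same values as the
;; plain literals.
(= \u00ff (char 255))
;= true
(= \o41 \!)
;= true

;; str concatenates characters into a string.
(str \a \space \b)
;= "a b"
```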


Keywords

Keywords evaluate to themselves, and are often used as accessors for the values they name in Clojure collections and types, such as hash maps and records:

(def person {:name "Sandra Cruz"
             :city "Portland, ME"})
;= #'user/person
(:city person)
;= "Portland, ME"

Here we create a hash map with two slots, :name and :city, and then look up the value of :city in that map. This works because keywords are functions that look themselves up in collections passed to them.
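Because keywords are functions, they can be passed wherever a function is expected. The sketch below uses a hypothetical people vector to show that, along with the optional second argument a keyword accepts as a default for missing keys:

```clojure
;; Hypothetical data: the second map has no :city entry.
(def people [{:name "Sandra Cruz" :city "Portland, ME"}
             {:name "Example Person"}])

;; A keyword can be handed to higher-order functions like map.
(map :name people)
;= ("Sandra Cruz" "Example Person")

;; Looking up a missing key yields nil...
(:city (second people))
;= nil

;; ...unless a default value is supplied as a second argument.
(:city (second people) "unknown")
;= "unknown"
```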

Syntactically, keywords are always prefixed with a colon, and can otherwise consist of any nonwhitespace character. A slash character (/) denotes a namespaced keyword, while a keyword prefixed with two colons (::) is expanded by the reader to a namespaced keyword in the current namespace, or in another namespace if the keyword starts with a namespace alias (::alias/kw, for example). These have similar usage and motivation as namespaced entities in XML; that is, being able to use the same name for values with different semantics or roles:[10]

(def pizza {:name "Ramunto's"
            :location "Claremont, NH"
            ::location "43.3734,-72.3365"})
;= #'user/pizza
pizza
;= {:name "Ramunto's", :location "Claremont, NH", :user/location "43.3734,-72.3365"}
(:user/location pizza)
;= "43.3734,-72.3365"

This allows different modules in the same application and disparate groups within the same organization to safely lay claim to particular names, without complex domain modeling or conventions like underscored prefixes for conflicting names.

Keywords are one type of “named” values, so called because they have an intrinsic name that is accessible using the name function and an optional namespace accessible using namespace:

(name :user/location)
;= "location"
(namespace :user/location)
;= "user"
(namespace :location)
;= nil

The other named type of value is the symbol.
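The name and namespace functions work the same way on both kinds of named values. A short sketch, using the keyword function to build a namespaced keyword from strings and a quoted symbol chosen purely for illustration:

```clojure
;; keyword builds a (possibly namespaced) keyword from strings.
(keyword "user" "location")
;= :user/location

;; A namespaced keyword is a different value from the bare keyword.
(= :location :user/location)
;= false

;; Symbols expose their parts through the same two functions.
(name 'clojure.string/join)
;= "join"
(namespace 'clojure.string/join)
;= "clojure.string"
(namespace 'join)
;= nil
```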


Symbols

Like keywords, symbols are identifiers, but they evaluate to the values they name in the Clojure runtime. These values include those held by vars (which are named storage locations used to hold functions and other values), Java classes, local references, and so on. Thinking back to our original example in Example 1-2:

(average [60 80 100 400])
;= 160

average here is a symbol, referring to the function held in the var named average.

Symbols must begin with a non-numeric character, and can contain *, +, !, -, _, and ? in addition to any alphanumeric characters. Symbols that contain a slash (/) denote a namespaced symbol and will evaluate to the named value in the specified namespace. The evaluation of symbols to the entity they name depends upon their context and the namespaces available within that context. We talk about the semantics of namespaces and symbol evaluation extensively in Namespaces.
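The relationship between a symbol, the var it names, and the value that var holds can be inspected directly. The sketch below quotes symbols to keep them from being evaluated and uses resolve with fully qualified names so that the result does not depend on the current namespace:

```clojure
;; Quoting yields the symbol itself rather than the value it names.
(symbol? 'average)
;= true

;; symbol builds a namespaced symbol from strings.
(= 'clojure.core/map (symbol "clojure.core" "map"))
;= true

;; resolve finds the var that a symbol names...
(resolve 'clojure.core/map)
;= #'clojure.core/map

;; ...and dereferencing that var yields the function it holds.
(identical? map @(resolve 'clojure.core/map))
;= true
((resolve 'clojure.core/+) 1 2)
;= 3
```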


Numbers

Clojure provides a plethora of numeric literals (see Table 1-2). Many of them are pedestrian, but others are rare to find in a general-purpose programming language and can simplify the implementation of certain algorithms—especially in cases where the algorithms are defined in terms of particular numeric representations (octal, binary, rational numbers, and scientific notation).


While the Java runtime defines a particular range of numeric primitives, and Clojure supports interoperability with those primitives, Clojure has a bias toward longs and doubles at the expense of other widths, including bytes, shorts, ints, and floats. This means that these smaller primitives will be produced as needed from literals or runtime values for interop operations (such as calling Java methods), but pure-Clojure operations will default to using the wider numeric representations.

For the vast majority of programming domains, you don’t need to worry about this. If you are doing work where mathematical precision and other related topics are important, please refer to Chapter 11 for a comprehensive discussion of Clojure’s treatment of operations on primitives and other math topics.

Table 1-2. Clojure numeric literals

Literal syntax          Numeric type
42, 0xff, 2r111, 040    long (64-bit signed integer)
3.14, 6.0221415e23      double (64-bit IEEE floating point decimal)
42N                     clojure.lang.BigInt (arbitrary-precision integer[a])
0.01M                   java.math.BigDecimal (arbitrary-precision signed floating point decimal)



[a] clojure.lang.BigInt is automatically coerced to java.math.BigInteger when needed. Again, please see Chapter 11 for the in-depth details of Clojure’s treatment of numerics.

Any numeric literal can be negated by prefixing it with a dash (-).

Let’s take a quick look at the more interesting numeric literals:

Hexadecimal notation

Just as in most languages, Clojure supports typical hexadecimal notation for integer values; 0xff is 255, 0xd055 is 53333, and so on.

Octal notation

Literals starting with a zero are interpreted as octal numbers. For example, the octal 040 is 32 in the usual base-10 notation.

Flexible numeral bases

You can specify the base of an integer in a prefix BrN, where N is the digits that represent the desired number, and B is the base or radix by which N should be interpreted. So we can use a prefix of 2r for binary integers (2r111 is 7), 16r for hexadecimal (16rff is 255), and so on. This is supported up to base 36.[11]

Arbitrary-precision numbers

Any numeric literal (except for rational numbers) can be specified as arbitrary-precision by suffixing it appropriately; decimals with an M, integers with an N. Please see Bounded Versus Arbitrary Precision for a full exploration of why and when this is relevant.

Rational numbers

Clojure directly supports rational numbers, also called ratios, as literals in the reader as well as throughout its numeric operators. Rational number literals must always be two integers separated by a slash (/).

For a full discussion of rational numbers in Clojure and how they interact with the rest of Clojure’s numerical model, please see Rationals.
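The literal forms described above can be compared at the REPL. This is a small sketch with arbitrarily chosen values, not an exhaustive tour:

```clojure
;; The same integer written in decimal, hex, radix-16, binary, and octal.
(= 255 0xff 16rff 2r11111111 0377)
;= true

;; Radix notation goes up to base 36, using letters as digits.
(= 1295 36rzz)
;= true

;; A leading dash negates any literal.
-0x10
;= -16

;; N and M suffixes select the arbitrary-precision types.
(class 42N)
;= clojure.lang.BigInt
(class 0.01M)
;= java.math.BigDecimal
(class (+ 1 2N))
;= clojure.lang.BigInt

;; Ratio literals, and ratios produced by integer division.
(class 22/7)
;= clojure.lang.Ratio
(+ 1/3 1/6)
;= 1/2
(/ 4 6)
;= 2/3
```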

Regular expressions

The Clojure reader treats strings prefixed with a hash character as regular expression (regex) literals:

(class #"(p|h)ail")
;= java.util.regex.Pattern

This is exactly equivalent to Ruby’s /.../ regex syntax, with a minor difference of pattern delimiters. In fact, Ruby and Clojure are very similar in their handling of regular expressions:

# Ruby
>> "foo bar".match(/(...) (...)/).to_a
["foo bar", "foo", "bar"]

;; Clojure
(re-seq #"(...) (...)" "foo bar")
;= (["foo bar" "foo" "bar"])

Clojure’s regex syntax does not require escaping of backslashes as required in Java:

(re-seq #"(\d+)-(\d+)" "1-3")     ;; would be "(\\d+)-(\\d+)" in Java
;= (["1-3" "1" "3"])

The instances of java.util.regex.Pattern that Clojure regex literals yield are entirely equivalent to those you might create within Java, and therefore use the generally excellent java.util.regex regular expression implementation.[12] Thus, you can use those Pattern instances directly via Clojure’s Java interop if you like, though you will likely find Clojure’s related utility functions (such as re-seq, re-find, re-matches, and others in the clojure.string namespace) simpler and more pleasant to use.
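A few of those utility functions in action, as a brief sketch. The input strings are made up for illustration, and str is just a conventional alias for clojure.string:

```clojure
(require '[clojure.string :as str])

;; re-find returns the first match; with groups, a vector of the whole
;; match followed by each group.
(re-find #"(\d+)-(\d+)" "pages 1-3")
;= ["1-3" "1" "3"]

;; re-matches succeeds only if the entire string matches.
(re-matches #"\d+" "123")
;= "123"
(re-matches #"\d+" "123abc")
;= nil

;; clojure.string functions accept regex literals as well.
(str/split "a, b,c" #",\s*")
;= ["a" "b" "c"]
(str/replace "foo bar" #"o+" "0")
;= "f0 bar"

;; The literal is an ordinary Pattern, usable through Java interop.
(.pattern #"\d+")
;= "\\d+"
```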


Comments

There are two comment types that are defined by the reader:

  • Single-line comments are indicated by prefixing the comment with a semicolon (;); all content following a semicolon is ignored entirely. These are equivalent to // in Java and JavaScript, and # in Ruby and Python.

  • Form-level comments are available using the #_ reader macro. This cues the reader to elide the next Clojure form following the macro:

(read-string "(+ 1 2 #_(* 2 2) 8)")
;= (+ 1 2 8)

What would have been a list with four numbers—(+ 1 2 4 8)—yields a list of only three numbers because the entire multiplication form was ignored due to the #_ prefix.

Because Clojure code is defined using data structure literals, this comment form can be far more useful in certain cases than purely textual comments that affect lines or character offsets (such as the /* */ multiline comments in Java and JavaScript). For example, consider the time-tested debugging technique of printing to stdout:

(defn some-function
  […arguments…]
  …code…
  (if …debug-conditional…
    (println …debug-info…)
    (println …more-debug-info…))
  …code…)

Making those println forms functionally disappear is as easy as prefixing the if form with the #_ reader macro and reloading the function definition; whether the form spans one or a hundred lines is irrelevant.


There is only one other way to comment code in Clojure, the comment macro:

(when true
  (comment (println "hello")))
;= nil

comment forms can contain any amount of ignored code, but they are not elided from the reader’s output in the way that #_ elides the forms following it. Thus, comment forms always evaluate to nil. This is often not a problem, but sometimes it can be inconvenient. Consider a reformulation of our first #_ example:

(+ 1 2 (comment (* 2 2)) 8)
;= #<NullPointerException java.lang.NullPointerException>

That fails because comment returns nil, which is not a valid argument to +.
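The difference between the two mechanisms shows up when the same expression is read rather than evaluated. A short sketch contrasting them:

```clojure
;; #_ removes the form at read time, so + never sees it.
(+ 1 2 #_(* 2 2) 8)
;= 11

;; A comment form survives reading; it is an ordinary list in the data
;; structure that the reader produces.
(read-string "(+ 1 2 (comment (* 2 2)) 8)")
;= (+ 1 2 (comment (* 2 2)) 8)

;; Evaluated on its own, that form always yields nil.
(comment (* 2 2))
;= nil
```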

Whitespace and Commas

You may have noticed that there have been no commas between forms, parameters to function calls, elements in data structure literals, and so on:

(defn silly-adder
  [x y]
  (+ x y))

This is because whitespace is sufficient to separate values and forms provided to the reader. In addition, commas are considered whitespace by the reader. For example, this is functionally equivalent to the snippet above:

(defn silly-adder
  [x, y]
  (+, x, y))

And to be slightly pedantic about it:

(= [1 2 3] [1, 2, 3])
;= true

Whether you use commas or not is entirely a question of personal style and preference. That said, they are generally used only when doing so enhances the human readability of the code in question. This is most common in cases where pairs of values are listed, but more than one pair appears per line:[13]

(create-user {:name new-username, :email email})

Collection Literals

The reader provides syntax for the most commonplace Clojure data structures:

'(a b :name 12.5)       ;; list

['a 'b :name 12.5]      ;; vector

{:name "Chas" :age 31}  ;; map

#{1 2 3}                ;; set

Since lists are used to denote calls in Clojure, you need to quote (') the list literal in order to prevent the evaluation of the list as a call.

The specifics of these data structures are explored in detail in Chapter 3.
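As a quick check that each literal yields the collection type named in its comment, the standard type predicates can be applied to the same values (a sketch only; the details of each type are left to Chapter 3):

```clojure
(list? '(a b :name 12.5))
;= true
(vector? ['a 'b :name 12.5])
;= true
(map? {:name "Chas" :age 31})
;= true
(set? #{1 2 3})
;= true

;; A quoted list literal is equal to the same list built with the list
;; function, whose arguments are evaluated normally.
(= '(1 2 3) (list 1 2 3))
;= true
```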

Miscellaneous Reader Sugar

The reader provides for some additional syntax in certain cases to improve concision or regularity with other aspects of Clojure:

  • Evaluation can be suppressed by prefixing a form with a quote character ('); see Suppressing Evaluation: quote.

  • Anonymous function literals can be defined very concisely using the #() notation; see Function literals.

  • While symbols evaluate to the values held by vars, vars themselves can be referred to by prefixing a symbol with #'; see Referring to Vars: var.

  • Instances of reference types can be dereferenced (yielding the value contained within the reference object) by prefixing @ to a symbol naming the instance; see Clojure Reference Types.

  • The reader provides three bits of special syntax for macros: `, ~, and ~@. Macros are explored in Chapter 5.

  • While there are technically only two Java interop forms, the reader provides some sugar for interop that expands into those two special forms; see Java Interop: . and new.

  • All of Clojure’s data structures and reference types support metadata—small bits of information that can be associated with a value or reference that do not affect things like equality comparisons. While your applications can use metadata for many purposes, metadata is used in Clojure itself where you might otherwise use keywords in other languages (e.g., to indicate that a function is namespace-private, or to indicate the type of a value or return type of a function). The reader allows you to attach metadata to literal values being read using the ^ notation; see Metadata.
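Several of these shorthands can be observed by reading them with read-string and looking at the data structure that comes back. The sketch below uses arbitrary symbol names (x, state) that are never evaluated; the interop examples use macroexpand to show the underlying . and new forms.

```clojure
;; Quote, var, and deref sugar read as ordinary lists.
(read-string "'x")
;= (quote x)
(read-string "#'inc")
;= (var inc)
(read-string "@state")
;= (clojure.core/deref state)

;; ^ attaches metadata to the value being read.
(meta (read-string "^:private [1 2]"))
;= {:private true}

;; Metadata does not affect equality.
(= [1 2] (read-string "^:private [1 2]"))
;= true

;; Interop sugar expands into the . and new special forms.
(macroexpand '(.toUpperCase "hi"))
;= (. "hi" toUpperCase)
(macroexpand '(String. "hi"))
;= (new String "hi")
```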

[8] Technically, read requires a java.io.PushbackReader as an implementation detail.

[9] *out* defaults to stdout, but can be redirected easily. See Building a Primitive Logging System with Composable Higher-Order Functions for an example.

[10] Namespaced keywords are also used prominently with multimethods and isa? hierarchies, discussed in depth in Chapter 7.

[11] The implementation limit of java.math.BigInteger’s radix support. Note that even though BigInteger is used for parsing these literals, the concrete type of the number as emitted by the reader is consistent with other Clojure integer literals: either a long or a big integer if the number specified requires arbitrary precision to represent.

[12] See the java.util.regex.Pattern javadoc for a full specification of what forms the Java regular expression implementation supports.

[13] Questions of style are notoriously difficult to answer in absolutes, but it would be very rare to see more than two or three pairs of values on the same line of text in any map literal, set of keyword arguments, and so on. Further, some forms that expect pairs of values (such as bindings in let) are essentially always delimited by linebreaks rather than being situated on the same line.
