Clojure code is composed of literal representations of its own data structures and atomic values; this characteristic is formally called homoiconicity, or more casually, code-as-data. This is a significant simplification compared to most other languages, which also happens to enable metaprogramming facilities to a much greater degree than languages that are not homoiconic. To understand why, we’ll need to talk some about languages in general and how their code relates to their internal representations.
Recall that a REPL’s first stage is to read code provided to it by you. Every language has to provide a way to transform that textual representation of code into something that can be compiled and/or evaluated. Most languages do this by parsing that text into an abstract syntax tree (AST). This sounds more complicated than it is: an AST is simply a data structure that represents formally what is manifested concretely in text. For example, Figure 1-1 shows some examples of textual language and possible transformations to their corresponding syntax trees.
These transformations from a textual manifestation of language to an AST are at the heart of how languages are defined, how expressive they are, and how well-suited they are to the purpose of relating to the world within which they are designed to be used. Much of the appeal of domain-specific languages springs from exactly this point: if you have a language that is purpose-built for a given field of use, those that have expertise in that field will find it far easier to define and express what they wish in that language compared to a general-purpose language.
The downside of this approach is that most languages do not provide any way to control their ASTs; the correspondence between their textual syntax and their ASTs is defined solely by the language implementers. This prompts clever programmers to conjure up clever workarounds in order to maximize the expressivity and utility of the textual syntax that they have to work with:
Textual macros and preprocessors (used to legendary effect by C and C++ programmers for decades now)
Compiler plug-ins (as in Scala, Project Lombok for Java, Groovy’s AST transformations, and Template Haskell)
That’s a lot of incidental complexity—complexity introduced solely because language designers often view textual syntax as primary, leaving formal models of it to be implementation-specific (when they’re exposed at all).
Clojure (like all Lisps) takes a different path: rather than
defining a syntax that will be transformed into an AST, Clojure programs
are written using Clojure data structures that represent that AST
directly. Consider the
example from Figure 1-1, and see how a Clojure
transliteration of the example is an AST for it
(recalling the call semantics of function position in Clojure
The fact that Clojure programs are represented as data means that Clojure programs can be used to write and transform other Clojure programs, trivially so. This is the basis for macros—Clojure’s metaprogramming facility—a far different beast than the gloriously painful hack that are C-style macros and other textual preprocessors, and the ultimate escape hatch when expressivity or domain-specific notation is paramount. We explore Clojure macros in Chapter 5.
In practical terms, the direct correspondence between code and data
means that the Clojure code you write in the REPL or in a text source file
isn’t text at all: you are programming using Clojure data structure
literals. Recall the simple
function from Example 1-2:
(defn average [numbers] (/ (apply + numbers) (count numbers)))
This isn’t just a bunch of text that is somehow transformed into a
function definition through the operation of a black box; this is a list
data structure that contains four values: the symbol
defn, the symbol
average, a vector data structure containing the
numbers, and another list that
comprises the function’s body. Evaluating that list data structure is what
defines the function.
 Clojure is by no means the only homoiconic language, nor is homoiconicity a new concept. Other homoiconic languages include all other Lisps, all sorts of machine language (and therefore arguably Assembly language as well), Postscript, XSLT and XQuery, Prolog, R, Factor, Io, and more.