Chapter 1. File Operations

If the most essential operation in a build is to compile code, then surely the second most essential operation is to copy files. A real-world build routinely copies files from place to place, recursing directory trees, pattern-matching filenames, and performing string operations on file content. Gradle exposes a few methods, task types, and interfaces to make file operations flexible and easy. Collectively, these form Gradle’s file API.

To explore the file API, we’ll start with practical examples of how to use the Copy task. We’ll move from there to an exploration of the file-related methods of the Project object, which are available to you anywhere inside a Gradle build. In the process, we’ll learn about the FileCollection interface. Finally, we’ll look at the ways the file API is used by common Gradle plug-ins—giving us a richer view of otherwise taken-for-granted structures like JAR files and SourceSets.

Copy Task

The Copy task is a task type provided by core Gradle. At execution, a copy task copies files into a destination directory from one or more sources, optionally transforming files as it copies. Through a configuration block, you tell the copy task where to get files, where to put them, and how to filter them. The simplest copy task configuration looks like Example 1-1.

Example 1-1. A trivial copy task configuration

task copyPoems(type: Copy) {
  from 'text-files'
  into 'build/poems'
}

The example assumes there is a directory called text-files containing the text of some poems. Running the script with gradle copyPoems puts those files into the build/poems directory, ready for processing by some subsequent step in the build.

By default, all files in the from directory are included in the copy operation. You can change this by specifying patterns to include or patterns to exclude. Inclusion and exclusion patterns use Ant-style globbing, where ** will recursively match any subdirectory name, and * will match any part of a filename. Include calls are exclusive by default; that is, they assume that all files not named in the include pattern should be excluded. Similarly, exclude calls are inclusive by default—they assume that all files not named in the exclude pattern should be included by default.

When an exclude is applied along with an include, Gradle prioritizes the exclude. It collects all of the files indicated by the include, then removes all of the files indicated by the exclude. As a result, your include and exclude logic should prefer more inclusive include patterns which are then limited by less inclusive exclude patterns.
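
As a minimal sketch of this ordering, the task below pairs a broad include with a narrower exclude. The task name and the "draft" naming convention are hypothetical and are not part of the example project; the text-files directory is the one assumed by Example 1-1.

```groovy
// Hypothetical task; assumes the text-files directory from Example 1-1
task copyFinishedPoems(type: Copy) {
  from 'text-files'
  into 'build/poems'
  include '**/*.txt'     // broad: every text file in the tree
  exclude '**/*draft*'   // narrower: removed from whatever the include collected
}
```

A file such as shelley-draft.txt matches both patterns and is left out of the copy, because the exclude is applied after the include.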

If you can’t express your include or exclude rules in a single pattern, you can call exclude or include multiple times in a single Copy task configuration, or pass a comma-separated list of patterns to a single method call. Often a single well-chosen pattern is enough: Example 1-2 excludes one poet with one pattern, and the lone include pattern in Example 1-3 matches two poets at once.

Example 1-2. A copy task that copies all the poems except the one by Henley

task copyPoems(type: Copy) {
  from 'text-files'
  into 'build/poems'
  exclude '**/*henley*'
}

Example 1-3. A copy task that only copies Shakespeare and Shelley

task copyPoems(type: Copy) {
  from 'text-files'
  into 'build/poems'
  include '**/sh*.txt'
}
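
When one pattern is not enough, the two multi-pattern forms described earlier might look like the following sketch. The task names and the specific patterns are illustrative assumptions.

```groovy
// Hypothetical tasks; both select the same set of files
task copySelectedPoems(type: Copy) {
  from 'text-files'
  into 'build/poems'
  // Repeated calls accumulate patterns
  include '**/sh*.txt'
  include '**/henley*.txt'
}

task copySelectedPoemsCompact(type: Copy) {
  from 'text-files'
  into 'build/poems'
  // One call with a comma-separated list of patterns
  include '**/sh*.txt', '**/henley*.txt'
}
```

Which form to use is a matter of style; the comma-separated form keeps closely related patterns on one line.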

A common build operation is to gather files from several source directories and copy them into one place. To do this, simply include more than one from configuration, as in Example 1-4. Each call to from can even have its own sets of inclusions and exclusions if needed.

Example 1-4. A copy task taking files from more than one source directory

task complexCopy(type: Copy) {
  from('src/main/templates') {
    include '**/*.gtpl'
  }
  from('i18n')
  from('config') {
    exclude 'Development*.groovy'
  }
  into 'build/resources'
}

Transforming Directory Structure

The outcome of Example 1-4 is to put all source files into one flat directory, build/resources. Of course you may not want to flatten all of the source directories; you might instead want to preserve some of the structure of the source directory trees or even map the source directories onto a new tree. To do this, we can simply add additional calls to into inside the from configuration closures. This is shown in Example 1-5.

Example 1-5. A copy task mapping source directory structure onto a new destination structure

task complexCopy(type: Copy) {
  from('src/main/templates') {
    include '**/*.gtpl'
    into 'templates'
  }
  from('i18n')
  from('config') {
    exclude 'Development*.groovy'
    into 'config'
  }
  into 'build/resources'
}

Note that a top-level call to into is still required—the build file will not run without it—and the nested calls to into are all relative to the path of that top-level configuration.

If the number of files or the size of the files being copied is large, then a copy task could be an expensive build operation at execution time. Gradle’s incremental build feature helps reduce the force of this problem. Gradle pays the full cost of the copy on the first run of the build, but keeps subsequent build times down by skipping the copy when it would be redundant.

Renaming Files During Copy

If your build has to copy files around, there’s a good chance it will have to rename files in the process. Filenames might need to be tagged to indicate a deployment environment, or might need to be renamed to some standard form from an environment-specific origin, or might be renamed according to a product configuration specified by the build. Whatever the reason, Gradle gives you two flexible ways to get the job done: regular expressions and Groovy closures.

To rename files using regular expressions, we can simply provide a source regex and a destination filename. The source regex will use groups to capture the parts of the filename that should be carried over from the source to the destination. These groups are expressed in the destination filename with the $1/$2 format. For example, to copy some configuration-specific files from a source to a staging directory, see Example 1-6.

Example 1-6. Renaming files using regular expressions

task rename(type: Copy) {
  from 'source'
  into 'dest'
  rename(/file-template-(\d+)/, 'production-file-$1.txt')
}
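
To see what the pattern in Example 1-6 does to individual names, the substitution can be tried in plain Groovy, outside of any build. This is only a sketch: it assumes the rename method applies the regex as a find-and-replace on the filename, in the way java.util.regex replacement works, and the filenames are made up for illustration.

```groovy
// Plain Groovy, no Gradle required. The filenames are made-up examples.
def pattern = /file-template-(\d+)/
def replacement = 'production-file-$1.txt'

// The captured digits are carried over through $1
assert 'file-template-01'.replaceFirst(pattern, replacement) == 'production-file-01.txt'

// Only the matched portion is replaced, so an existing extension survives
assert 'file-template-01.txt'.replaceFirst(pattern, replacement) == 'production-file-01.txt.txt'

// A name that does not match the pattern comes back unchanged
assert 'README'.replaceFirst(pattern, replacement) == 'README'
```

The second assertion is a reminder to make the source regex cover the whole name, extension included, when the source files already carry one.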

To rename files programmatically, we can pass a closure to the rename method (Example 1-7). The closure takes a single parameter, which is the name of the original file. The return value of the closure is the name of the renamed file.

Example 1-7. Renaming files programmatically

task rename(type: Copy) {
  from 'source'
  into 'dest'
  rename { fileName ->
    "production-file${(fileName - 'file-template')}"
  }
}

Note

In Groovy, subtracting one string from another string removes the first occurrence of the second string from the first. So, 'one two one four' - 'one' will return ' two one four' (the leading space is left behind). This is a quick way to perform a common kind of string processing.
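
A few assertions make the behavior of string subtraction concrete. They run in plain Groovy; the filenames are illustrative assumptions that mirror the names used in the rename closure of Example 1-7.

```groovy
// Only the first occurrence is removed; note the leading space that remains
assert 'one two one four' - 'one' == ' two one four'

// Subtracting a string that does not occur leaves the original untouched
assert 'henley.txt' - 'shelley' == 'henley.txt'

// The expression used inside the rename closure of Example 1-7
def fileName = 'file-template-01.txt'
assert fileName - 'file-template' == '-01.txt'
assert "production-file${fileName - 'file-template'}".toString() == 'production-file-01.txt'
```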

Filtering and Transforming Files

Often the task of a build is not just to copy and rename files, but to perform transformations on the content of the copied files. Gradle has three principal ways of doing this job: the expand() method, the filter() method, and the eachFile() method. We’ll consider these in turn.

Keyword Expansion

A common build use case is to copy a set of configuration files into a staging area and to replace some strings in the files as they’re copied. A particular configuration file may contain a substantial set of parameters that do not vary by deployment environment, plus a smaller set of parameters that do. As this configuration file is staged from its working directory into the build directory, it would be convenient to replace the deployment-variable strings as a part of the copy. The expand() method is how Gradle does this.

The expand() method takes advantage of the Groovy SimpleTemplateEngine class. SimpleTemplateEngine adds a keyword substitution syntax to text files similar to the syntax of Groovy string interpolation. Any string inside curly braces preceded by a dollar sign (${string}) is a candidate for substitution. When declaring keyword expansion in a copy task, you must pass a map to the expand() method (Example 1-8). The keys in the map will be matched to the expressions inside curly braces in the copied file, which will be replaced with the map’s corresponding values.

Example 1-8. Copying a file with keyword expansion

def versionId = '1.6'

task copyProductionConfig(type: Copy) {
  from 'source'
  include 'config.properties'
  into 'build/war/WEB-INF/config'
  expand([
    databaseHostname: 'db.company.com',
    version: versionId,
    buildNumber: (int)(Math.random() * 1000),
    date: new Date()
  ])
}

Note

SimpleTemplateEngine has some other features that are exposed to you when you use the expand() method inside a copy task. Consult the online documentation for more details.

Note that the expression passed to the expand() method is a Groovy map literal—it is enclosed by square brackets, and a series of key/value pairs are delimited by commas, with the key and the value themselves separated by colons. In this example, the task doing the expanding is dedicated to preparing a configuration file for the production configuration, so the map can be expressed as a literal. A real-world build may opt for this approach, or may reach out to other, environment-specific config file fragments for the map data. Ultimately, the map passed to expand() can come from anywhere. The fact that the Gradle build file is executable Groovy code gives you nearly unlimited flexibility in deciding on its origin.
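
One way the map might come from an environment-specific file is sketched below. The file name environments/production.properties and the task name are assumptions made for this sketch; the file must exist and must define every key that the copied file references (databaseHostname, version, and buildNumber in the example).

```groovy
// Hypothetical properties file holding the environment-specific values
def envProps = new Properties()
file('environments/production.properties').withInputStream { stream ->
  envProps.load(stream)
}

// Build the map for expand(): computed values first, then the loaded ones
def tokens = [date: new Date()]
envProps.each { key, value -> tokens[key] = value }

task copyEnvironmentConfig(type: Copy) {
  from 'source'
  include 'config.properties'
  into 'build/war/WEB-INF/config'
  expand(tokens)
}
```

Keeping the values in a separate file means the task definition stays the same from one environment to the next; only the file it reads changes.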

It’s helpful in this case to take a look at the source file, so we can directly see where the string substitution is happening.

Here’s the source file before the copy-with-filter operation: 

#
# Application configuration file
#
hostname: ${databaseHostname}
appVersion: ${version}
locale: en_us
initialConnections: 10
transferThrottle: 5400
queueTimeout: 30000
buildNumber: ${buildNumber}
buildDate: ${date.format("yyyyMMdd'T'HHmmssZ")}

Here’s the destination file after the copy-with-filter operation: 

#
# Application configuration file
#
hostname: db.company.com
appVersion: 1.6
locale: en_us
initialConnections: 10
transferThrottle: 5400
queueTimeout: 30000
buildNumber: 77
buildDate: 20120105T162959-0700
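
The same substitution can be reproduced with SimpleTemplateEngine directly, which may help when debugging a template outside of a copy task. This is a small sketch in plain Groovy that uses two of the placeholders from the source file; it is not how Gradle itself invokes the engine.

```groovy
import groovy.text.SimpleTemplateEngine

// Single quotes keep Groovy from interpolating the placeholders itself
def source = 'hostname: ${databaseHostname} appVersion: ${version}'
def binding = [databaseHostname: 'db.company.com', version: '1.6']

def engine = new SimpleTemplateEngine()
def expanded = engine.createTemplate(source).make(binding).toString()

assert expanded.contains('hostname: db.company.com')
assert expanded.contains('appVersion: 1.6')
```

A placeholder whose name is missing from the map causes the template to fail, so every ${...} expression in a copied file needs a corresponding key.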

Filtering Line by Line

The expand() method is perfect for general-purpose string substitution—and even some lightweight elaborations on that pattern—but some file transformations might need to process every line of a file individually as it is copied. This is where the filter() method is useful.

The filter() method has two forms: one that takes a closure, and one that takes a class. We’ll look at both, starting with the simpler closure form.

When you pass a closure to filter(), that closure will be called for every line of the filtered file. The closure should perform whatever processing is necessary, then return the filtered value of the line. For example, to convert Markdown text to HTML using MarkdownJ, see Example 1-9.

Example 1-9. Use filter() with a closure to transform a text file as it is copied

import com.petebevin.markdown.MarkdownProcessor

buildscript {
  repositories {
    mavenRepo url: 'http://scala-tools.org/repo-releases'
  }

  dependencies {
    classpath 'org.markdownj:markdownj:0.3.0-1.0.2b4'
  }
}

task markdown(type: Copy) {
  def markdownProcessor = new MarkdownProcessor()
  into 'build/poems'
  from 'source'
  include 'todo.md'
  rename { it - '.md' + '.html'}
  filter { line ->
    markdownProcessor.markdown(line)
  }
}
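
A closure filter does not need an external library. As a minimal sketch with a hypothetical task name and destination, the following task rewrites every line of every copied file in uppercase:

```groovy
// Hypothetical task; assumes the text-files directory from Example 1-1
task shoutPoems(type: Copy) {
  from 'text-files'
  into 'build/shouted-poems'
  // Called once per line; the return value replaces the line
  filter { line ->
    line.toUpperCase()
  }
}
```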

The source file processed by the example code is a short poem with some comments added. It looks like the following:

# A poem by William Carlos Williams
so much depends
upon
# He wrote free verse
a red wheel
barrow
# In the imagist tradition
glazed with rain
water
# And liked chickens
beside the white
chickens

The copied and filtered file has blank lines instead of comments:

so much depends
upon

a red wheel
barrow

glazed with rain
water

beside the white
chickens

Gradle gives you great flexibility in per-file filtering logic, but true to its form, it wants to give you tools to keep all of that filter logic out of task definitions. Rather than clutter your build with lots of filter logic, it would be better to migrate that logic into classes of its own, which can eventually migrate out of the build into individual source files with their own automated tests. Let’s take the first step in that direction.

Instead of passing a closure to the filter() method, we can pass a class instead. The class must be a subclass of java.io.FilterReader. The Ant API provides a rich set of pre-written FilterReader implementations, which Gradle users are encouraged to re-use. The code shown in Example 1-9 could be rewritten as in Example 1-10.

Example 1-10. Use filter() with a custom Ant Filter class to transform a text file as it is copied

import org.apache.tools.ant.filters.*
import com.petebevin.markdown.MarkdownProcessor

buildscript {
  repositories {
    mavenRepo url: 'http://scala-tools.org/repo-releases'
  }

  dependencies {
    classpath 'org.markdownj:markdownj:0.3.0-1.0.2b4'
  }
}

class MarkdownFilter extends FilterReader {
  MarkdownFilter(Reader input) {
    super(new StringReader(new MarkdownProcessor().markdown(input.text)))
  }
}

task copyPoem(type: Copy) {
  into 'build/poems'
  from 'source'
  include 'todo.md'
  rename { it - ~/\.md$/ + '.html'}
  filter MarkdownFilter
}
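
The pre-written Ant filters can be used without writing a class at all. The sketch below uses Ant's ReplaceTokens filter, which replaces tokens written as @name@ in the copied files; the task name, directories, and token value are assumptions made for illustration.

```groovy
import org.apache.tools.ant.filters.ReplaceTokens

// Hypothetical task: replaces @version@ in each copied file with 1.6
task stageWithTokens(type: Copy) {
  from 'source'
  into 'build/staged'
  // The named arguments are set as properties on the filter instance
  filter(ReplaceTokens, tokens: [version: '1.6'])
}
```

Because the filter is configured through named arguments, the same class can be reused with different token maps in different tasks.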

Eventually the MarkdownFilter class could move out of the build entirely and into a custom plug-in. That is an important topic with a chapter of its own.

Filtering File by File

The expand() and filter() methods apply the same transformation function to all of the files in the copy scope, but some transformation logic might want to consider each file individually. To handle this case, we have the eachFile() method.

The eachFile() method accepts a closure, which is executed for every file as it is copied. That closure takes a single parameter, which is an instance of the FileCopyDetails interface. FileCopyDetails allows you to consider the contents of the copied files one at a time. FileCopyDetails exposes methods that allow you to rename the file, change its destination path during the copy, exclude it from the copy operation, create duplicate copies at other paths, and interact with the file programmatically as an instance of java.io.File. You can do many of the same things through the Gradle DSL as described previously, but you might prefer in some cases to drop back to direct manipulation. For example, perhaps you have a custom deployment process that copies a directory full of files and accumulates a SHA1 hash of all the file contents, emitting the hash into the destination directory. You might implement that part of the build as in Example 1-11.

Example 1-11. Use eachFile() to accumulate a hash of several files

import java.security.MessageDigest;

task copyAndHash(type: Copy) {
  MessageDigest sha1 = MessageDigest.getInstance("SHA-1");

  into 'build/deploy'
  from 'source'
  eachFile { fileCopyDetails ->
    sha1.update(fileCopyDetails.file.bytes)
  }
  doLast {
    Formatter hexHash = new Formatter()
    sha1.digest().each { b -> hexHash.format('%02x', b) }
    println hexHash
  }
}
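
The per-file decisions mentioned above, such as renaming or excluding a file, can be sketched with the same method. The task name, the rule about empty files, and the .tmpl suffix are hypothetical choices made for this illustration.

```groovy
// Hypothetical task showing per-file decisions inside eachFile()
task copyWithDecisions(type: Copy) {
  from 'source'
  into 'build/deploy'
  eachFile { details ->
    if (details.file.length() == 0) {
      // Leave empty files out of the copy
      details.exclude()
    } else if (details.name.endsWith('.tmpl')) {
      // Drop the suffix from the destination name
      details.name = details.name - '.tmpl'
    }
  }
}
```

Logic like this is awkward to express with include and exclude patterns alone, because it depends on properties of each file rather than on its name.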

The File Methods

There are several methods for operating on files that are available in all Gradle builds. These are methods of the Project object, which means you can call them from inside any configuration block or task action in a build. There are convenience methods for converting paths into project-relative java.io.File objects, making collections of files, and recursively turning directory trees into file collections. We’ll explore each one of them in turn.

file()

The file() method creates a java.io.File object, converting the project-relative path to an absolute path. The file() method takes a single argument, which can be a string, a file, a URL, or a closure whose return value is any of those three.

The file() method is useful when a task has a parameter of type File. For example, the Java plug-in provides a task called jar, which builds a JAR file containing the default sourceSet’s class files and resources. The task puts the JAR file in a default location under the build directory, but a certain build might want to override that default. The Jar task has a property called destinationDir for changing this, which one might assume works as in Example 1-12.

Example 1-12. Trying to set the destinationDir property of a Jar task with a string

jar {
  destinationDir = 'build/jar'
}

However, this build will fail, because destinationDir is of type File, not String. A pedantic solution might look like Example 1-13.

Example 1-13. Setting the destinationDir property of a Jar task with a File object

jar {
  destinationDir = new File('build/jar')
}

This build will run, but will not behave predictably in all cases. The File constructor accepts the relative path as given, and when that path is eventually resolved to an absolute one, it is treated as relative to the directory in which the JVM started up.[1] That directory may differ depending on whether you are invoking Gradle directly, through the wrapper, through an IDE, or through integration with a Continuous Integration server. The correct solution is to use the file() method, as in Example 1-14.

Example 1-14. Setting the destinationDir property of a Jar task using the file() method

jar {
  destinationDir = file('build/jar')
}

This build results in the destinationDir property being set to the build/jar directory under the build’s project root. This is the expected behavior, and will work regardless of how Gradle is being invoked.

If you already have a File object, the file() method will attempt to convert it into a project-relative path in the same way. The construction new File('build/jar') has no defined parent directory, so file(new File('build/jar')) will force its parent to the build’s project root directory. This example shows an awkward construction—real code would likely omit the inner File construction—but file()’s operation on File objects works as the example shows. You might use this case if you already had the File object lying around for some reason.

file() can also operate on java.net.URL and java.net.URI objects whose protocol or scheme is file://. File URLs are not a common case, but they often show up when resources are being loaded through the ClassLoader. If you happen to encounter a file URL in your build, you can easily convert it to a project-relative File object with the file() method.
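
The difference between the File constructor and the file() method, and the equivalence of the argument types described above, can be checked with a few lines in a build script. This is a sketch to experiment with rather than something a real build would contain.

```groovy
// Resolved against the JVM's working directory, which varies by launcher
println new File('build/jar').absolutePath

// Resolved against the project directory, however Gradle was started
println file('build/jar').absolutePath

// A string, a relative File, and a closure all resolve the same way
def fromString  = file('build/jar')
def fromFile    = file(new File('build/jar'))
def fromClosure = file { 'build/jar' }

assert fromString.absolute
assert fromString == fromFile
assert fromFile == fromClosure
```

Running the build from the project directory and then from some other directory (using the -b option) shows the first line of output changing while the second stays the same.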

files()

The files() method returns a collection of files based on the supplied parameters. It is like file() in that it attempts to produce project-relative absolute paths in the File objects it creates, but it differs in that it operates on collections of files. It takes a variety of different parameter types as inputs, as shown in Table 1-1.

Table 1-1. The parameters accepted by files()

Parameter type | Method behavior

String | Creates a collection containing a single, project-relative file. Resolves filenames just like file().

java.io.File | Creates a collection containing a single, project-relative file. Resolves File objects just like file().

java.net.URL or java.net.URI | Creates a collection of the indicated file. Supports only file:// URLs, just like file().

Collection, Iterable, or Array | Creates a file collection containing all of the named files. Collection elements are recursively resolved, so they may contain any of the other datatypes allowed by files(). Examples:

  // A Groovy ArrayList literal
  files(['src/main/groovy','src/test/groovy'])

  // listFiles() returns an array of File objects
  files(file('src/changelog/resources').listFiles())

Task | Produces a file collection of the task’s outputs. Output file sets are defined on a per-task basis. Tasks provided by core plug-ins typically have implicit output file sets. Example (using the Java plug-in):

  // Evaluates to the Java compiler's output directory
  files(compileJava)

TaskOutputs | Behaves the same as a Task, but allows the TaskOutputs object to be named explicitly. Example (using the Java plug-in):

  // Evaluates to the Java compiler's output directory
  files(compileJava.outputs)

As you can see, files() is an incredibly versatile method for creating a collection of files. It can take filenames, file objects, file URLs, Gradle tasks, or Java collections containing any of the same.
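
A single call can mix several of these parameter types. In the following sketch, the paths are hypothetical and do not need to exist for the collection to be created; a file collection simply records the resolved paths.

```groovy
// A string, a File, and a nested list in one call (hypothetical paths)
def mixed = files(
  'src/main/groovy',
  new File('src/test/groovy'),
  ['README.txt', 'LICENSE.txt']
)

// The result can be iterated like any other collection of File objects
mixed.each { f ->
  println f
}
```

Each printed path is absolute and rooted at the project directory, just as it would be with file().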

Beginning Gradle developers often expect the return type of the method to be a trivial collection class that implements List. It turns out that files() returns a FileCollection, which is a foundational interface for file programming in Gradle. We will turn our attention to it in the section on file collections.

fileTree()

The file() method is an effective way to turn paths into files, and the files() method builds on this to build lists of files that can be managed as collections. But what about when you want to traverse a directory tree and collect all the files you find, and work with those as a collection? This is a job for the fileTree() method.

There are three ways to invoke fileTree(), and each of them borrows heavily from the configuration of the copy task. They all have several features in common: the method must be pointed to a root directory to traverse, and may optionally be given patterns to include or exclude.

The simplest use of fileTree() simply points it at a parent directory, allowing it to recurse through all subdirectories and add all of the files it finds into the resulting file collection. For example, if you wanted to create a collection of all of the production source files in a Java project, the expression fileTree('src/main/java') would get the job done.

Alternatively, you might want to perform some simple filtering to include some files and exclude others. Suppose, for example, that you knew that some backup files with the ~ extension were likely to exist in the source files. Furthermore, you knew some XML files were mixed in with the source files (rather than placed in the resources source set where they belong), and you wanted to focus on that XML. You could create file collections to deal with both of the cases shown in Example 1-15.

Example 1-15. Using fileTree() with includes and excludes

def noBackups = fileTree('src/main/java') {
  exclude '**/*~'
}

def xmlFilesOnly = fileTree('src/main/java') {
  include '**/*.xml'
}

Alternatively, the directory and the include and exclude patterns can be provided in map form, as shown in Example 1-16. The use of this syntax in place of the closure configuration syntax is a matter of style.

Example 1-16. Using fileTree() with includes and excludes given in a map literal

def noBackups = fileTree(dir: 'src/main/java', excludes: ['**/*~'])
def xmlFilesOnly = fileTree(dir: 'src/main/java', includes: ['**/*.xml'])
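
Once a tree has been defined, it can be used like any other collection of files. As a small sketch with a hypothetical task name, the following task counts the Java sources in a project and lists their names:

```groovy
// Hypothetical task that inspects a file tree at execution time
task countSources {
  doLast {
    def sources = fileTree('src/main/java') {
      include '**/*.java'
    }
    println "Found ${sources.files.size()} Java source files"
    sources.each { f ->
      println f.name
    }
  }
}
```

The tree is created inside doLast so that the directory is scanned when the task runs, not while the build is being configured.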

The FileCollection Interface

If you tried running the examples in the section on the files() method and poked around at them just a little bit, you may have noticed that the return values of the files() and fileTree() methods don’t have a very friendly default toString() implementation (Example 1-17). If they were simply ArrayLists as intuition would suggest, we would expect a dump of their contents. The fact that we don’t see this is a hint that something useful is going on in the Gradle API (Example 1-18).

Example 1-17. The default toString() implementation of a FileCollection

task copyPoems(type: Copy) {
  from 'text-files'
  into 'build/poems'
}

println "NOT HELPFUL:"
println files(copyPoems)

Here’s the output of the previous build, showing default toString() behavior: 

$ gradle -b file-collection-not-helpful.gradle
NOT HELPFUL:
file collection

Example 1-18. A more useful way to look at a FileCollection

task copyPoems(type: Copy) {
  from 'text-files'
  into 'build/poems'
}

println "HELPFUL:"
println files(copyPoems).files

Here’s the output of the previous build, showing the contents of the FileCollection: 

$ gradle -b file-collection-helpful.gradle
HELPFUL:
[~/oreilly-gradle-book-examples/file-operations-lab/build/poems]

The object returned by the files() method is not a List, but a FileCollection.[2] The FileCollection type shows up in many other places in Gradle: in the contents of SourceSet objects, in task inputs and outputs, in Java classpaths, in dependency configurations, and more. Knowing the essential methods of the type goes a long way in equipping you to program file operations effectively, whether the files are transitive JAR dependencies fetched from a Maven repository, source files in the build of a Groovy project, or static resources in a web application. We will explore the key operations supported by the interface here.

To illustrate these features, we’ll start with a common build file that sets up some interesting collections of files (Example 1-19). We’ll add a task at a time to this example build to see each feature.

Example 1-19. The base build from which we will derive FileCollection examples

apply plugin: 'java'

repositories {
  mavenCentral()
}

dependencies {
  compile 'org.springframework:spring-context:3.1.1.RELEASE'
}

Converting to a Set

We’ve already seen the files property of FileCollection. It returns an object of type Set<File> containing all of the files or directories in the file collection. To list all of the source files in the previous project, we might add the task seen in Example 1-20.

Example 1-20. A naive way to list source files

task naiveFileLister {
  doLast {
    println fileTree('src/main/java').files
  }
}

Here’s the result of the naiveFileLister task: 

$ gradle nFL
:naiveFileLister
[~/file-collection-lab/src/main/java/org/gradle/example/ConsoleContentSink.java,
~/file-collection-lab/src/main/java/org/gradle/example/Content.java,
~/file-collection-lab/src/main/java/org/gradle/example/ContentFactory.java,
~/file-collection-lab/src/main/java/org/gradle/example/ContentRegistry.java,
~/file-collection-lab/src/main/java/org/gradle/example/ContentSink.java,
~/file-collection-lab/src/main/java/org/gradle/example/
  DefaultContentFactory.java,
~/file-collection-lab/src/main/java/org/gradle/example/DonneContent.java,
~/file-collection-lab/src/main/java/org/gradle/example/PoetryEmitter.java,
~/file-collection-lab/src/main/java/org/gradle/example/ShakespeareContent.java]

BUILD SUCCESSFUL

Converting to a Path String

A build can manipulate collections of files for various purposes, which sometimes include using the collection with an operating system command that expects a list of files. Internally, the core Java plug-in does this with compile-time dependencies when executing the javac compiler (Example 1-21). The Java compiler has a command-line switch for specifying the classpath, and that switch must be provided with an operating-system-specific string. The asPath property converts a FileCollection into this OS-specific string.

Example 1-21. Printing out all of the compile-time dependencies of the build as a path-like string

println configurations.compile.asPath

Here are the results of the build with the previous addition: 

$ gradle

~/.gradle/caches/artifacts-8/filestore/org.springframework/spring-context/3.1.1.RELEASE/jar/ecb0784a0712c1bfbc1c2018eeef6776861300e4/spring-context-3.1.1.RELEASE.jar:
~/.gradle/caches/artifacts-8/filestore/org.springframework/spring-asm/3.1.1.RELEASE/jar/8717ad8947fcada5c55da89eb474bf053c30e57/spring-asm-3.1.1.RELEASE.jar:
~/.gradle/caches/artifacts-8/filestore/commons-logging/commons-logging/1.1.1/jar/5043bfebc3db072ed80fbd362e7caf00e885d8ae/commons-logging-1.1.1.jar:
~/.gradle/caches/artifacts-8/filestore/org.springframework/spring-core/3.1.1.RELEASE/jar/419e9233c8d55f64a0c524bb94c3ba87e51e7d95/spring-core-3.1.1.RELEASE.jar:
~/.gradle/caches/artifacts-8/filestore/org.springframework/spring-beans/3.1.1.RELEASE/jar/83d0e5adc98714783f0fb7d8a5e97ef4cf08da49/spring-beans-3.1.1.RELEASE.jar:
~/.gradle/caches/artifacts-8/filestore/aopalliance/aopalliance/1.0/jar/235ba8b489512805ac13a8f9ea77a1ca5ebe3e8/aopalliance-1.0.jar:
~/.gradle/caches/artifacts-8/filestore/org.springframework/spring-aop/3.1.1.RELEASE/jar/3c86058cdaea30df35e4b951a615e09eb07da589/spring-aop-3.1.1.RELEASE.jar:
~/.gradle/caches/artifacts-8/filestore/org.springframework/spring-expression/3.1.1.RELEASE/jar/1486d7787ec4ff8da8cbf8752d30e4c808412b3f/spring-expression-3.1.1.RELEASE.jar
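
The asPath property works on any FileCollection, not only on dependency configurations. A minimal sketch with hypothetical JAR paths shows the idea on a smaller scale:

```groovy
// Hypothetical paths; the files do not need to exist to build the string
def jars = files('lib/a.jar', 'lib/b.jar')

// Entries are joined with the platform's path separator:
// ':' on Unix-like systems, ';' on Windows
println jars.asPath
println "Separator on this platform: ${File.pathSeparator}"
```

The resulting string can be handed to any external command that expects a classpath-style argument.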

Module Dependencies as FileCollections

The most convenient way to illustrate converting a FileCollection to a path-like string is to use a collection of module dependencies, as shown in the section on converting to a path string. It’s worth pointing out explicitly that dependency configurations are themselves FileCollections. When dependencies are listed in the dependencies { } section of the build, they are always assigned to a named configuration. Configurations are themselves defined in the configurations { } section of the build.[3]

As a teacher and frequent conference presenter on Gradle, sometimes I want to enable students to build Java code at times when there is no reliable internet connection. (Many US hotels and conference centers are still woefully unprepared for a few hundred software developers, each with two or three devices on the wireless network and a seemingly insatiable appetite for bandwidth.) While I strongly prefer dependencies to be managed by my build tool, it might make sense for me to prepare lab materials with all of the dependencies statically located in the project in the style of old Ant builds.[4] For some Java frameworks and APIs, chasing all of these JARs down by hand can be a burden. By using module dependencies as file collections, we can automate this work.

If we add the following trivial copy task to Example 1-19, we’ll find the lib directory quickly populated with all of the JARs we need after we run the task. Adding new dependencies to the project later will require only that we re-run the task, and the new JARs will appear as expected. Note that the from method of the Copy task configuration block, which took a directory name as a string in previous examples, can also take a FileCollection as shown in Example 1-22.

Example 1-22. Using module dependencies as a FileCollection to capture JAR files

task copyDependencies(type: Copy) {
  from configurations.compile
  into 'lib'
}

Adding and Subtracting FileCollections

FileCollections can also be added and subtracted using the + and - operators. We can derive a set of examples by using dependency configurations as file collections.

Our example build for this section is a command-line application using the Spring framework. Spring brings with it a half dozen or so JARs in the minimal case, which would be fairly painful to provide on the command line every time we wanted to run the application. The JavaExec task provides us with a convenient way to solve the problem, as long as we can tell it the class to run and the classpath it should use when launching the new Java Virtual Machine (Example 1-23).

The classpath we want has two components: all of the compile-time dependencies of the project[5] plus the classes compiled from the main Java sources of the project. The former are available in the configurations.compile collection, and the latter in sourceSets.main.output. We will explore sourceSet collections more in the next section.

Example 1-23. Using FileCollection addition to create a runtime classpath

task run(type: JavaExec) {
  main = 'org.gradle.example.PoetryEmitter'
  classpath = configurations.compile + sourceSets.main.output
}

Subtracting one file collection from another creates a new collection containing all of the files in the left-hand collection that are not also in the right-hand collection. To create a collection of all of the text resources in our example build that are not Shelley poetry, we might add the code in Example 1-24 to our build.

Example 1-24. Creating the difference of two FileCollections

def poems = fileTree(dir: 'src/main/resources', include: '*.txt')
def romantics = fileTree(dir: 'src/main/resources', include: 'shelley*')
def goodPoems = poems - romantics

Printing out the files property of goodPoems (or otherwise inspecting the contents of the collection) shows that it contains all of the .txt files in the src/main/resources directory except the one whose name starts with shelley. In a practical build, this case might be handled with an excludes property, but more subtle subtractions of FileCollections are also possible, such as removing container-provided JARs from the set of dependencies packaged up by a WAR task when building a Java web application.
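One way to inspect the result is a small task that prints the collection when it runs. The following is only a sketch: the task name printGoodPoems is invented for illustration, and the three collections are repeated from Example 1-24 so that the fragment stands on its own.

```groovy
// Collections repeated from Example 1-24
def poems = fileTree(dir: 'src/main/resources', include: '*.txt')
def romantics = fileTree(dir: 'src/main/resources', include: 'shelley*')
def goodPoems = poems - romantics

// Hypothetical task name, used here only to inspect the collection
task printGoodPoems {
  doLast {
    // The files property yields the files currently in the collection
    goodPoems.files.each { file ->
      println file.name
    }
  }
}
```

Running gradle printGoodPoems should list the names of the remaining text files, with the Shelley file absent.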

SourceSets as FileCollections

In earlier examples, we used the fileTree() method to create a file collection of all of the source files in a project. It turns out that Gradle gives us a much cleaner way to get this same job done, using the same interface as we’ve been using all along. SourceSets are the domain objects Gradle uses to represent collections of source files, and they happen to expose source code inputs and compiled outputs as FileCollections.

The allSource property of a SourceSet object returns a file collection containing all source inputs and, in the case of source sets compiled by the Java plug-in, all resource files as well. In our example build, inspecting the property would yield the results in Example 1-25.

Example 1-25. Printing out the collection of source files in the main Java SourceSet

println sourceSets.main.allSource.files
[~/file-collection-lab/src/main/resources/application-context.xml,
~/file-collection-lab/src/main/resources/chesterton.txt,
~/file-collection-lab/src/main/resources/henley.txt,
~/file-collection-lab/src/main/resources/shakespeare.txt,
~/file-collection-lab/src/main/resources/shelley.txt,
~/file-collection-lab/src/main/java/org/gradle/example/ConsoleContentSink.java,
~/file-collection-lab/src/main/java/org/gradle/example/Content.java,
~/file-collection-lab/src/main/java/org/gradle/example/ContentFactory.java,
~/file-collection-lab/src/main/java/org/gradle/example/ContentRegistry.java,
~/file-collection-lab/src/main/java/org/gradle/example/ContentSink.java,
~/file-collection-lab/src/main/java/org/gradle/example/DefaultContentFactory.java,
~/file-collection-lab/src/main/java/org/gradle/example/DonneContent.java,
~/file-collection-lab/src/main/java/org/gradle/example/PoetryEmitter.java,
~/file-collection-lab/src/main/java/org/gradle/example/ShakespeareContent.java]

Likewise, the build outputs are provided in the output property. The outputs are not given as an exhaustive list of all of the files generated by the compiler (this would require that we run the compiler before interpreting the source set) but instead as a list of the directories into which compiled outputs will be placed (Example 1-26).

Example 1-26. Printing out the output directories the Java compiler will use for the main source set

println sourceSets.main.output.files
[~/file-collection-lab/build/classes/main,
~/file-collection-lab/build/resources/main]
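Because these properties are file collections, they can be handed to anything that accepts one, including the from method of a Copy task. As a minimal sketch, assuming the Java plug-in is applied so that sourceSets.main exists, the following task copies every main source and resource file into a snapshot directory. The task name copySources and the directory build/source-snapshot are invented for illustration.

```groovy
apply plugin: 'java'

// Hypothetical task: copy all main sources and resources to one place
task copySources(type: Copy) {
  // allSource is a file collection, so from accepts it directly
  from sourceSets.main.allSource
  // Illustrative destination directory, not a Gradle convention
  into 'build/source-snapshot'
}
```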

Lazy Files

When programming with file collections, it might be tempting to think of them as static lists. One might assume, for example, that a call to fileTree() scans the filesystem at the time it is called, producing an immutable list that the build can then manipulate. Immutable data structures have their uses, but in the Gradle lifecycle, this would make file collections difficult to use. As a result, instances of the FileCollection interface are lazily evaluated whenever it is meaningful to do so.

For example, a task could use the fileTree() method to create a collection of all of the files in the build/classes/main directory that match the glob pattern **/*Test.class. If that file collection is created during the configuration phase (which is likely),[6] the files it is attempting to find may not exist until deep into the execution phase. Hence, file collections are designed to be static descriptions of the collection semantics, and the actual contents of the collection are not materialized until they are needed at a particular time in the build execution.
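The scenario just described can be sketched as follows. This is an illustration under stated assumptions: the Java plug-in is applied (so the classes task exists), compiled classes land in build/classes/main as in the listing for Example 1-26, and the task name listTestClasses is invented for the example.

```groovy
apply plugin: 'java'

// Created at configuration time, when the class files may not exist yet.
// The file tree records only the directory and the pattern.
def testClasses = fileTree(dir: 'build/classes/main', include: '**/*Test.class')

// Hypothetical task that reads the collection at execution time
task listTestClasses {
  // Make sure compilation has run before the collection is read
  dependsOn 'classes'
  doLast {
    // The directory is scanned here, after the compiler has produced output
    testClasses.files.each { file ->
      println file
    }
  }
}
```

Even though testClasses is defined before any compilation has happened, the listing reflects whatever matching class files exist at the moment the doLast action runs.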

Conclusion

In this chapter, we’ve looked at Gradle’s comprehensive support for file operations. We explored copy tasks, seeing their ability to move files around in trivial and non-trivial ways, performing various kinds of pattern-matching on files, and even filtering file contents during copy. We looked at keyword expansion and line-by-line filtering of file contents during copy, and also at renaming files as they’re copied, something that often comes in handy when modifying the contents of copied files. We reviewed the three important methods Gradle developers use to deal with files, and finally learned about the FileCollection interface that underlies so many important Gradle domain objects. Gradle doesn’t leave you with the bare semantics of the Java File object, but gives you a rich set of APIs for the kinds of file programming you’re likely to do as you create custom builds and automation pipelines.



[1] This is an implementation detail not specified by the documentation for java.io.File. It is common behavior, but even this much cannot be assumed between environments.

[2] There are actually several subtypes of FileCollection that may be in use in any of these cases. The behavior of these subtypes may be important in some cases, but we’ll confine our attention to the common supertype here.

[3] The most commonly used dependency configurations are compile and runtime, which are defined by the Java plug-in, and not explicitly inside a configurations block. For a full treatment of configurations and dependencies, see Chapter 4.

[4] You may prefer this scheme over repositories like Maven Central on principle. Gradle does not force you to use one mechanism or the other.

[5] Compile-time dependencies are, by definition, run-time dependencies as well.

[6] Gradle builds have three phases: initialization, configuration, and execution. Configuration-time build code sets up the task graph for Gradle to process. Actual build activity like copying, compiling, and archiving takes place during the execution phase.
