The NIO File API

We are now going to turn our attention from the original, “classic” Java File API to the new NIO File API introduced with Java 7. As we mentioned earlier, the NIO File API can be thought of as either a replacement for or a complement to the classic API. Included in the NIO package, the new API is nominally part of an effort to move Java toward a higher-performance and more flexible style of I/O supporting selectable and asynchronously interruptible channels. However, in the context of working with files, the new API’s strength is that it provides a fuller abstraction of the filesystem in Java.

In addition to better support for existing, real-world filesystem types (including, for the first time, the ability to copy and move files, manage links, and get detailed file attributes like owners and permissions), the new File API allows entirely new types of filesystems to be implemented directly in Java. The best example of this is the new ZIP filesystem provider that makes it possible to “mount” a ZIP archive file as a filesystem and work with the files within it directly using the standard APIs, just like any other filesystem. Additionally, the NIO File package provides some utilities that would have saved Java developers a lot of repeated code over the years, including directory tree change monitoring, filesystem traversal (a visitor pattern), filename “globbing,” and convenience methods to read entire files directly into memory.

We’ll cover the basic File API in this section and return to the NIO API again at the end of the chapter when we cover the full details of NIO buffers and channels. In particular, we’ll talk about ByteChannels and FileChannel, which you can think of as alternate, buffer-oriented streams for reading and writing files and other types of data.

FileSystem and Path

The main players in the java.nio.file package are: the FileSystem, which represents an underlying storage mechanism and serves as a factory for Path objects; the Path, which represents a file or directory within the filesystem; and the Files utility, which contains a rich set of static methods for manipulating Path objects to perform all of the basic file operations analogous to the classic API.

The FileSystems (plural) class is our starting point. It is a factory for a FileSystem object:

// The default host computer filesystem
FileSystem fs = FileSystems.getDefault();

// A custom filesystem (env holds provider-specific options; it may be empty)
Map<String, String> env = new HashMap<>();
URI zipURI = URI.create("jar:file:/Users/pat/tmp/MyArchive.zip");
FileSystem zipfs = FileSystems.newFileSystem( zipURI, env );

As shown in this snippet, often we’ll simply ask for the default filesystem to manipulate files in the host computer’s environment, as with the classic API. But the FileSystems class can also construct a FileSystem by taking a URI (a special identifier) that references a custom filesystem type. We’ll show an example of working with the ZIP filesystem provider later in this chapter when we discuss data compression.

FileSystem implements Closeable and when a FileSystem is closed, all open file channels and other streaming objects associated with it are closed as well. Attempting to read or write to those channels will throw an exception at that point. Note that the default filesystem (associated with the host computer) cannot be closed.
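
Because FileSystem is Closeable, a custom filesystem fits naturally into a try-with-resources block. The following is a minimal sketch, assuming the ZIP provider's "create" property and using hypothetical class, method, and file names; it writes one entry into a new archive and reads it back after reopening the archive:

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only
class ZipFileSystemSketch {

    // Creates a ZIP archive at zipFile (if missing) and stores one text entry in it.
    static void writeEntry(Path zipFile, String entryName, String text) throws IOException {
        Map<String, String> env = new HashMap<>();
        env.put("create", "true"); // ZIP provider option: create the archive if absent
        URI zipURI = URI.create("jar:" + zipFile.toUri());

        // Closing the filesystem at the end of the block writes the archive to disk
        try (FileSystem zipfs = FileSystems.newFileSystem(zipURI, env)) {
            Path entry = zipfs.getPath("/" + entryName);
            Files.write(entry, text.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Reopens the archive and returns the entry's contents as a string.
    static String readEntry(Path zipFile, String entryName) throws IOException {
        URI zipURI = URI.create("jar:" + zipFile.toUri());
        try (FileSystem zipfs = FileSystems.newFileSystem(zipURI, new HashMap<String, String>())) {
            byte[] data = Files.readAllBytes(zipfs.getPath("/" + entryName));
            return new String(data, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("zipfs-sketch");
        Path zipFile = dir.resolve("MyArchive.zip");
        writeEntry(zipFile, "hello.txt", "Hello from inside the ZIP");
        System.out.println(readEntry(zipFile, "hello.txt"));
    }
}
```

Each helper opens and closes its own FileSystem, so no handle to the archive outlives the call that needed it.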

Once we have a FileSystem, we can use it as a factory for Path objects that represent files or directories. A Path can be constructed using a string representation just like the classic File, and subsequently used with methods of the Files utility to create, read, write, or delete the item.

Path fooPath = fs.getPath( "/tmp/foo.txt" );
OutputStream out = Files.newOutputStream( fooPath );

This example opens an OutputStream to write to the file foo.txt. By default, if the file does not exist, it will be created and if it does exist, it will be truncated (set to zero length) before new data is written—but you can change these results using options. We’ll talk more about Files methods in the next section.
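
To make the role of the options concrete, here is a small sketch (the class and method names are our own, not part of the API) that contrasts the default create-and-truncate behavior with an explicit StandardOpenOption.APPEND:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only
class OpenOptionsSketch {

    // Appends text to the file, creating the file first if it does not exist.
    static void appendText(Path file, String text) throws IOException {
        try (OutputStream out = Files.newOutputStream(
                file, StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Overwrites the file using the default options (create, truncate, write).
    static void overwriteText(Path file, String text) throws IOException {
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("open-options", ".txt");
        overwriteText(log, "first\n");
        appendText(log, "second\n");   // file now holds both lines
        System.out.print(new String(Files.readAllBytes(log), StandardCharsets.UTF_8));
        overwriteText(log, "replaced\n"); // truncates the two earlier lines
        System.out.print(new String(Files.readAllBytes(log), StandardCharsets.UTF_8));
    }
}
```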

The Path object implements the java.lang.Iterable interface, which can be used to iterate through its literal path components (e.g., the slash-separated “tmp” and “foo.txt” in the preceding snippet). If you want to traverse the path to find other files or directories, however, you might be more interested in the DirectoryStream and FileVisitor that we’ll discuss later. Path also implements the java.nio.file.Watchable interface, which allows it to be monitored for changes. We’ll also discuss watching file trees for changes in an upcoming section.
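
As a quick illustration of iterating over a Path, the following sketch (with a hypothetical helper class of our own) collects the name elements of a path into a list. Note that the root component is not part of the iteration:

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only
class PathComponentsSketch {

    // Returns the name elements of the path, in order from the root downward.
    static List<String> components(Path path) {
        List<String> names = new ArrayList<>();
        for (Path element : path) { // Path is Iterable<Path>
            names.add(element.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        Path fooPath = FileSystems.getDefault().getPath("/tmp/foo.txt");
        System.out.println(components(fooPath));    // the elements "tmp" and "foo.txt"
        System.out.println(fooPath.getNameCount()); // number of name elements
        System.out.println(fooPath.getFileName());  // last element only
    }
}
```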

Path has convenience methods for resolving paths relative to a file or directory.

Path patPath = fs.getPath( "/Users/pat/" );

Path patTmp = patPath.resolve( "tmp" ); // "/Users/pat/tmp"

// Same as above, using a Path
Path tmpPath = fs.getPath( "tmp" );
Path patTmp2 = patPath.resolve( tmpPath ); // "/Users/pat/tmp"

// Resolving a given absolute path against any path just yields given path
Path absPath = patPath.resolve( "/tmp" ); // "/tmp"

// Resolve sibling to Pat (same parent)
Path danPath = patPath.resolveSibling( "dan" ); // "/Users/dan"

In this snippet, we’ve shown the Path resolve() and resolveSibling() methods used to find files or directories relative to a given Path object. The resolve() method is generally used to append a relative path to an existing Path representing a directory. If the argument provided to the resolve() method is an absolute path, it will just yield the absolute path (much like the Unix or DOS “cd” command). The resolveSibling() method works the same way, but it is relative to the parent of the target Path; this method is useful for describing the target of a move() operation.

Path to classic file and back

To bridge the old and new APIs, corresponding toPath() and toFile() methods have been provided in java.io.File and java.nio.file.Path, respectively, to convert to the other form. Of course, the only types of Paths that can be produced from File are paths representing files and directories in the default host filesystem.

Path tmpPath = fs.getPath( "/tmp" );
File file = tmpPath.toFile();
File tmpFile = new File( "/tmp" );
Path path = tmpFile.toPath();

NIO File Operations

Once we have a Path, we can operate on it with static methods of the Files utility to create the path as a file or directory, read and write to it, and interrogate and set its properties. We’ll list the bulk of them and then discuss some of the more important ones as we proceed.

The following table summarizes these methods of the java.nio.file.Files class. As you might expect, because the Files class handles all types of file operations, it contains a large number of methods. To make the table more readable, we have elided overloaded forms of the same method (those taking different kinds of arguments) and grouped corresponding and related types of methods together.

Table 12-2. NIO Files methods

| Method | Return type | Description |
| --- | --- | --- |
| copy() | long or Path | Copy a stream to a file path, a file path to a stream, or a path to a path. Returns the number of bytes copied or the target Path. A target file may optionally be replaced if it exists (the default is to fail if the target exists). Copying a directory results in an empty directory at the target (the contents are not copied). Copying a symbolic link copies the linked file’s data (producing a regular file copy). |
| createDirectory(), createDirectories() | Path | Create a single directory or all directories in a specified path. createDirectory() throws an exception if the directory already exists, whereas createDirectories() will ignore existing directories and only create as needed. |
| createFile() | Path | Create an empty file. The operation is atomic and will only succeed if the file does not exist. (This property can be used to create flag files to guard resources, etc.) |
| createTempDirectory(), createTempFile() | Path | Create a temporary, guaranteed uniquely named directory or file with the specified prefix. By default it is placed in the system default temp directory; optionally specify a different parent directory. |
| delete(), deleteIfExists() | void or boolean | Delete a file or an empty directory. deleteIfExists() will not throw an exception if the file does not exist; it returns whether a file was actually deleted. |
| exists(), notExists() | boolean | Determine whether the file exists (notExists() generally returns the opposite, although both return false when the file’s status cannot be determined). Optionally specify whether links should be followed (by default they are). |
| isDirectory(), isExecutable(), isHidden(), isReadable(), isRegularFile(), isWritable() | boolean | Test basic file features: whether the path is a directory, and other basic attributes. |
| createLink(), createSymbolicLink(), isSymbolicLink(), readSymbolicLink() | boolean or Path | Create a hard or symbolic link, test to see if a file is a symbolic link, or read the target file pointed to by the symbolic link. Symbolic links are files that reference other files. Regular (“hard”) links are low-level mirrors of a file where two filenames point to the same underlying data. If you don’t know which to use, use a symbolic link. |
| getAttribute(), setAttribute(), getFileAttributeView(), readAttributes() | Object, Map, or FileAttributeView | Get or set filesystem-specific file attributes such as access and update times, detailed permissions, and owner information using implementation-specific names. |
| getFileStore() | FileStore | Get a FileStore object that represents the device, volume, or other type of partition of the filesystem on which the path resides. |
| getLastModifiedTime(), setLastModifiedTime() | FileTime or Path | Get or set the last modified time of a file or directory. |
| getOwner(), setOwner() | UserPrincipal or Path | Get or set a UserPrincipal object representing the owner of the file. Use toString() or getName() to get a string representation of the user name. |
| getPosixFilePermissions(), setPosixFilePermissions() | Set or Path | Get or set the full POSIX user-group-other style read, write, and execute permissions for the path as a Set of PosixFilePermission enum values. |
| isSameFile() | boolean | Test to see whether the two paths reference the same file (which may potentially be true even if the paths are not identical). |
| move() | Path | Move a file or directory by renaming or copying it, optionally specifying whether to replace any existing target. Rename will be used unless a copy is required to move a file across file stores or filesystems. Directories can be moved using this method only if the simple rename is possible or if the directory is empty. If a directory move requires copying files across file stores or filesystems, the method throws an IOException. (In this case, you must copy the files yourself. See walkFileTree().) |
| newBufferedReader(), newBufferedWriter() | BufferedReader or BufferedWriter | Open a file for reading via a BufferedReader, or create and open a file for writing via a BufferedWriter. In both cases, a character encoding is specified. |
| newByteChannel() | SeekableByteChannel | Create a new file or open an existing file as a seekable byte channel. (See the full discussion of NIO later in this chapter.) Consider using FileChannel.open() as an alternative. |
| newDirectoryStream() | DirectoryStream | Return a DirectoryStream for iterating over the entries of a directory. Optionally, supply a glob pattern or filter object to match files. |
| newInputStream(), newOutputStream() | InputStream or OutputStream | Open a file for reading via an InputStream, or create and open a file for writing via an OutputStream. Optionally, specify file truncation for the output stream; the default is to create the file if needed and truncate it on write. |
| probeContentType() | String | Return the MIME type of the file if it can be determined by installed FileTypeDetector services, or null if unknown. |
| readAllBytes(), readAllLines() | byte[] or List<String> | Read all data from the file as a byte[] or all characters as a list of strings using a specified character encoding. |
| size() | long | Get the size in bytes of the file at the specified path. |
| walkFileTree() | Path | Apply a FileVisitor to the specified directory tree, optionally specifying whether to follow links and a maximum depth of traversal. |
| write() | Path | Write an array of bytes or a collection of strings (with a specified character encoding) to the file at the specified path and close the file, optionally specifying append and truncation behavior. The default is to truncate and write the data. |

With the preceding methods, we can fetch input or output streams or buffered readers and writers to a given file. We can also create paths as files and directories and iterate through file hierarchies. We’ll discuss directory operations in the next section.
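
To give a feel for a few of these methods working together, here is a minimal sketch using hypothetical names of our own. It relies on the atomic behavior of createFile() to claim a “flag file,” then inspects and deletes it:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only
class FilesBasicsSketch {

    // Tries to claim a flag file; returns false if it had already been created.
    static boolean tryLock(Path flagFile) throws IOException {
        try {
            Files.createFile(flagFile); // atomic: fails if the file exists
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("files-basics");
        Path nested = Files.createDirectories(base.resolve("a/b")); // creates a and a/b
        Path flag = nested.resolve("resource.lock");

        System.out.println("first claim:  " + tryLock(flag));
        System.out.println("second claim: " + tryLock(flag));

        Files.write(flag, "owner=demo".getBytes(StandardCharsets.US_ASCII));
        System.out.println("size = " + Files.size(flag) + " bytes");
        System.out.println("regular file? " + Files.isRegularFile(flag));
        System.out.println("modified at " + Files.getLastModifiedTime(flag));

        Files.delete(flag);
        // Nothing is left to delete, so this reports false instead of throwing
        System.out.println("deleted again? " + Files.deleteIfExists(flag));
    }
}
```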

As a reminder, the resolve() and resolveSibling() methods of Path are useful for constructing targets for the copy() and move() operations.

// Move the file /tmp/foo.txt to /tmp/bar.txt
Path foo = fs.getPath("/tmp/foo.txt" );
Files.move( foo, foo.resolveSibling("bar.txt") );
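
The copy() and move() methods also accept options. As a sketch (the class and method names here are ours), the following uses StandardCopyOption.REPLACE_EXISTING to overwrite an earlier backup, while the plain move() keeps the default behavior of failing when the target exists:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, for illustration only
class CopyMoveSketch {

    // Copies source to a sibling named backupName, replacing any earlier backup.
    static Path backup(Path source, String backupName) throws IOException {
        Path target = source.resolveSibling(backupName);
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Renames source to a sibling named newName; throws if newName already exists.
    static Path rename(Path source, String newName) throws IOException {
        return Files.move(source, source.resolveSibling(newName));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("copy-move");
        Path foo = dir.resolve("foo.txt");
        Files.write(foo, "some data".getBytes(StandardCharsets.US_ASCII));

        Path bak = backup(foo, "foo.bak");
        backup(foo, "foo.bak"); // fine: the old backup is replaced
        Path bar = rename(foo, "bar.txt");

        System.out.println(bak + " exists: " + Files.exists(bak));
        System.out.println(bar + " exists: " + Files.exists(bar));
        System.out.println(foo + " exists: " + Files.exists(foo));
    }
}
```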

For quickly reading and writing the contents of files without streaming, we can use the readAllBytes(), readAllLines(), and write() methods, which move byte arrays or strings in and out of files in a single operation. These are very convenient for files that easily fit into memory.

// Read and write collection of String (e.g. lines of text)
Charset asciiCharset = Charset.forName("US-ASCII");
List<String> csvData = Files.readAllLines( csvPath, asciiCharset );
Files.write( newCSVPath, csvData, asciiCharset );

// Read and write bytes
byte [] data = Files.readAllBytes( dataPath );
Files.write( newDataPath, data );
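
When a file may be too large to hold in memory comfortably, the buffered reader from the table above is the usual alternative. The following sketch (with names of our own choosing) counts lines while holding only one line at a time:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper class, for illustration only
class LineCountSketch {

    // Counts the lines in a text file without loading the whole file.
    static long countLines(Path file, Charset charset) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            while (reader.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path csvPath = Files.createTempFile("line-count", ".csv");
        Files.write(csvPath, Arrays.asList("a,b", "c,d", "e,f"), StandardCharsets.US_ASCII);
        System.out.println(countLines(csvPath, StandardCharsets.US_ASCII) + " lines");
    }
}
```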

Directory Operations

In addition to basic directory creation and manipulation methods of the Files class, there are methods for listing the files within a given directory and traversing all files and directories in a directory tree. To list the files in a single directory, we can use one of the newDirectoryStream() methods, which returns an iterable DirectoryStream.

// Print the files and directories in /tmp
try ( DirectoryStream<Path> paths = Files.newDirectoryStream(
    fs.getPath( "/tmp" ) ) ) {

    for ( Path path : paths ) { System.out.println( path ); }
}

The snippet lists the entries in “/tmp,” iterating over the directory stream to print the results. Note that we open the DirectoryStream within a try-with-resources clause so that it is automatically closed for us. A DirectoryStream is implemented as a kind of one-way iterable that is analogous to a stream, and it must be closed to free up associated resources. The order in which the entries are returned is not defined by the API and you may need to store and sort them if ordering is required.
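
If a predictable order matters, one straightforward approach is to copy the entries into a list and sort it. Here is a minimal sketch, with a hypothetical helper class, that relies on Path being Comparable:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, for illustration only
class SortedListingSketch {

    // Returns the entries of dir sorted by Path's natural ordering.
    static List<Path> listSorted(Path dir) throws IOException {
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> paths = Files.newDirectoryStream(dir)) {
            for (Path path : paths) {
                entries.add(path);
            }
        } // the stream is closed here, before we sort
        Collections.sort(entries);
        return entries;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sorted-listing");
        Files.createFile(dir.resolve("b.txt"));
        Files.createFile(dir.resolve("a.txt"));
        Files.createFile(dir.resolve("c.txt"));
        for (Path path : listSorted(dir)) {
            System.out.println(path.getFileName());
        }
    }
}
```

The natural ordering of Path is provider-specific, so supply your own Comparator if you need, say, case-insensitive ordering.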

Another form of newDirectoryStream() takes a glob pattern to limit the files matched in the listing:

// Only files in /tmp matching "*.txt" (globbing)
try ( DirectoryStream<Path> paths = Files.newDirectoryStream( 
    fs.getPath( "/tmp" ), "*.txt" ) ) { 
    ...

File globbing filters filenames using the familiar “*” and a few other patterns to specify matching names. Table 12-3 provides some additional examples of file globbing patterns.

Table 12-3. File globbing pattern examples

| Pattern | Matches |
| --- | --- |
| *.txt | Filenames ending in “.txt” |
| *.{java,class} | Filenames ending in “.java” or “.class” |
| [abc]* | Filenames starting with “a”, “b”, or “c” |
| [0-9]* | Filenames starting with the digits 0 through 9 |
| [!0-9]* | Filenames starting with any character except 0 through 9 |
| pass?.dat | Filenames starting with “pass” plus any single character plus “.dat” (e.g., pass1.dat, passN.dat) |
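
Glob patterns can be tried out without touching the disk by asking the FileSystem for a PathMatcher, which uses the same glob syntax when the pattern is prefixed with “glob:”. The following sketch (the helper class is hypothetical) checks a few of the patterns from the table:

```java
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;

// Hypothetical helper class, for illustration only
class GlobSketch {

    // Tests a simple filename against a glob pattern on the default filesystem.
    static boolean matches(String glob, String fileName) {
        FileSystem fs = FileSystems.getDefault();
        PathMatcher matcher = fs.getPathMatcher("glob:" + glob);
        return matcher.matches(fs.getPath(fileName));
    }

    public static void main(String[] args) {
        String[][] cases = {
            { "*.txt", "notes.txt" },
            { "*.{java,class}", "Foo.class" },
            { "[0-9]*", "1abc" },
            { "[!0-9]*", "1abc" },
            { "pass?.dat", "passN.dat" },
            { "pass?.dat", "pass12.dat" },
        };
        for (String[] c : cases) {
            System.out.println(c[0] + " vs " + c[1] + " -> " + matches(c[0], c[1]));
        }
    }
}
```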

If globbing patterns are not sufficient, we can provide our own stream filter by implementing the DirectoryStream.Filter interface. The following snippet is the procedural (code) version of the “*.txt” glob pattern, matching filenames ending with “.txt”. We’ve implemented the filter as an anonymous inner class here because it’s short:

// Same as above using our own (anonymous) filter implementation
try ( DirectoryStream<Path> paths = Files.newDirectoryStream(
    fs.getPath( "/tmp" ),
    new DirectoryStream.Filter<Path>() {
        @Override
        public boolean accept( Path entry ) throws IOException {
            return entry.toString().endsWith( ".txt" );
        }
} ) ) {
    ...

Finally, if we need to iterate through a whole directory hierarchy instead of just a single directory, we can use a FileVisitor. The Files.walkFileTree() method takes a starting path and performs a depth-first traversal of the file hierarchy, giving the provided FileVisitor a chance to “visit” each path element in the tree. The following short snippet prints the names of all files under the /Users/pat path:

// Visit all of the files in a directory tree
Files.walkFileTree( fs.getPath( "/Users/pat"), new SimpleFileVisitor<Path>() {
    @Override
    public FileVisitResult visitFile( Path file, BasicFileAttributes attrs )
    {
        System.out.println( "path = " + file );
        return FileVisitResult.CONTINUE;
    }
} );

For each file in the tree, our visitor’s visitFile() method is invoked with the Path element and its attributes as arguments. The visitor can perform any action it likes in relation to the file and then indicate whether or not the traversal should continue by returning one of a set of enumerated result types, such as FileVisitResult.CONTINUE or FileVisitResult.TERMINATE. Here we have subclassed SimpleFileVisitor, a convenience class that implements the methods of the FileVisitor interface for us with default bodies that simply continue the traversal (and rethrow any I/O error they are handed), allowing us to override only those of interest. Other methods available include visitFileFailed(), which is called if a file or directory cannot be visited (e.g., due to permissions), and the pair preVisitDirectory() and postVisitDirectory(), which can be used to perform actions before and after a directory’s entries are visited. The preVisitDirectory() method is additionally useful in that it may return SKIP_SUBTREE to continue the traversal without descending into the target path, or SKIP_SIBLINGS to continue the traversal while skipping the remaining entries at the same level as the target path.
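
As an illustration of these additional callbacks, here is a sketch of a visitor that collects regular files but uses SKIP_SUBTREE to avoid descending into directories with a given name. The class, method, and directory names are hypothetical:

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only
class SkippingVisitorSketch {

    // Collects regular files under root, never entering directories named skipName.
    static List<Path> collectFiles(Path root, final String skipName) throws IOException {
        final List<Path> found = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                Path name = dir.getFileName(); // null for a filesystem root
                if (name != null && name.toString().equals(skipName)) {
                    return FileVisitResult.SKIP_SUBTREE;
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    found.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Report the problem and carry on instead of aborting the walk
                System.err.println("could not visit: " + file);
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("visitor-sketch");
        Files.createFile(root.resolve("a.txt"));
        Files.createFile(Files.createDirectory(root.resolve("keep")).resolve("b.txt"));
        Files.createFile(Files.createDirectory(root.resolve("skip")).resolve("c.txt"));
        for (Path file : collectFiles(root, "skip")) {
            System.out.println("path = " + file);
        }
    }
}
```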

As you can see, the file listing and traversal methods of the NIO File package are much more sophisticated than those of the classic java.io API and are a welcome addition.

Watching Paths

One of the nicest features of the NIO File API is the WatchService, which can monitor a directory Path for changes to the files and directories it contains. We can choose to receive events when entries are added, modified, or deleted. Note that registration is not recursive: only entries directly inside the registered directory are watched, so subdirectories must be registered separately. The following snippet watches for changes in the folder /Users/pat:

// Assumes: import static java.nio.file.StandardWatchEventKinds.*;
Path watchPath = fs.getPath("/Users/pat");
WatchService watchService = fs.newWatchService();
watchPath.register( watchService, ENTRY_CREATE, ENTRY_MODIFY, ENTRY_DELETE );

while( true )
{
    WatchKey changeKey = watchService.take();
    List<WatchEvent<?>> watchEvents = changeKey.pollEvents();
    for ( WatchEvent<?> watchEvent : watchEvents )
    {
        // Ours are all Path type events:
        @SuppressWarnings("unchecked")
        WatchEvent<Path> pathEvent = (WatchEvent<Path>)watchEvent;

        Path path = pathEvent.context();
        WatchEvent.Kind<Path> eventKind = pathEvent.kind();
        System.out.println( eventKind + " for path: " + path );
    }

    changeKey.reset(); // Important!
}

We construct a WatchService from a FileSystem using the newWatchService() call. Thereafter, we can register a Watchable object with the service (currently, Path is the only type of Watchable) and poll it for events. As shown, in actuality the API is the other way around. We call the watchable object’s register() method, passing it the watch service and a variable-length argument list of enumerated values representing the event types of interest: ENTRY_CREATE, ENTRY_MODIFY, or ENTRY_DELETE. One additional type, OVERFLOW, indicates that some events may have been lost or discarded, for example because changes arrived faster than they were processed. It does not need to be registered explicitly; the watch service may deliver it for any registration.

After we are set up, we can poll for changes using the watch service take() method, which returns a WatchKey object. The take() method blocks until an event occurs; another form, poll(), is nonblocking. When we have a WatchKey containing events, we can retrieve them with the pollEvents() method. The API is, again, a bit awkward here as WatchEvent is a generic type parameterized on the type of its context object. In our case, the only types possible are Path type events and so we cast as needed. The type of event (create, modify, delete) is indicated by the WatchEvent kind() method and the changed path is indicated by the context() method. Finally, it’s important that we call reset() on the WatchKey object in order to clear the events and be able to receive further updates.

Performance of the WatchService depends greatly on implementation. On many systems, filesystem monitoring is built into the operating system and we can get change events almost instantly. But in many cases, Java may fall back on its generic, background thread-based implementation of the watch service, which is very slow to detect changes. At the time of this writing, for example, Java 7 on Mac OS X does not take advantage of the OS-level file monitoring and instead uses the slow, generic polling service.
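
Because detection latency varies so much, it can be useful to bound the wait rather than block forever. The following sketch, which uses hypothetical names and an arbitrary 30-second limit, relies on the timed form of poll() and also checks for the OVERFLOW kind before treating an event’s context as a path:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, for illustration only
class BoundedWatchSketch {

    // Waits up to timeoutSeconds for one batch of events and describes each one.
    static List<String> watchOnce(WatchService watchService, long timeoutSeconds)
            throws InterruptedException {
        List<String> seen = new ArrayList<>();
        WatchKey key = watchService.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return seen; // timed out with no events
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                seen.add("OVERFLOW"); // some events may have been lost
                continue;
            }
            seen.add(event.kind().name() + " " + event.context());
        }
        key.reset(); // re-arm the key so later events can be queued
        return seen;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        Path dir = Files.createTempDirectory("watch-sketch");
        try (WatchService watchService = dir.getFileSystem().newWatchService()) {
            dir.register(watchService, StandardWatchEventKinds.ENTRY_CREATE);
            Files.createFile(dir.resolve("hello.txt"));
            System.out.println(watchOnce(watchService, 30));
        }
    }
}
```

On a platform that falls back to the polling implementation, the event may take several seconds to arrive, which is why the sketch uses a generous timeout.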
