Preface

Many people think of Adobe's Portable Document Format (PDF) as a proprietary format for delivering unchangeable content that readers can print out or view on-screen conveniently. That may be how most people work with it, but you can do many more things with PDF, with or without Adobe's tools.

PDF has come a long way since it first appeared in the early '90s. When Adobe began offering its Acrobat Reader for free, PDF spread across the Web as a paginated alternative to HTML. PDF has replaced or supplemented Adobe's PostScript language files as a format for exchanging print-ready layouts, and evolving forms capabilities have made PDF a more interactive format over time.

Although most people still think of Acrobat when they think of PDF, the format has become a standard for other applications as well. Adobe publishes the PDF specification, so developers can create their own tools for creating and consuming PDF. Ghostscript software, for example, is an open source toolkit for working with PostScript and PDF. OpenOffice.org enables users to create PDF files from its applications, and Apple has integrated PDF tightly with Mac OS X, including its own PDF reader and tools for printing to PDF from any application.

Many people treat PDF documents as finished products, simply reading them or printing them out, but you can create and modify PDFs in many ways to meet your needs. Adobe's Acrobat family of products, beyond the Acrobat Reader, includes a variety of tools for creating and changing PDFs, but there are lots of other helpful tools and products for working with PDF, many of which are covered in this book.

Why PDF Hacks?

The term hacking has a bad reputation in the press, where it usually refers to breaking into systems or wreaking havoc with computers as a weapon. Among people who write code, though, the term hack refers to a "quick-and-dirty" solution to a problem, or a clever way to get something done. And the term hacker is taken very much as a compliment, referring to someone who is creative and has the technical chops to get things done. The Hacks series is an attempt to reclaim the word, document the good ways people are hacking, and pass the hacker ethic of creative participation on to the uninitiated. Seeing how others approach systems and problems is often the quickest way to learn about a new technology.

PDF has traditionally been seen as a pretty unhackable technology. Most people work with PDF using tools provided by a single vendor, Adobe, and PDFs are often distributed under the assumption that people can't (or at least won't) modify them. In practice, however, PDF tools offer an enormous amount of flexibility and support a wide range of ways to read, share, manage, and create PDF files. Even if you only read PDF files, there are lots of ways to improve your reading experience, many of which are not obvious. Creators of PDF files can similarly do much more than just "print to PDF"; they can generate files with custom content or create forms for two-way communications.

PDF Hacks shows you PDF's rich possibilities and helps you use it in new ways.

How to Use This Book

You can read this book from cover to cover if you like, but each hack stands on its own, so feel free to browse and jump to the different sections that interest you most. If there's a prerequisite you need to know about, a cross-reference will guide you to the right hack. If you're looking for something specific, the index might help you as well.

A Note on Software

Although PDF is still closely associated with Adobe's Acrobat family of tools, you don't always need Acrobat to do useful work. And even though many of the hacks are specific to particular commercial tools (Acrobat 5, Acrobat 6 Standard, or Acrobat 6 Professional) or are bound to a particular operating system, overall the book tries to stay as environment-agnostic as possible. Whether you're running Windows, Mac OS X, or Linux, there should be a way to do most of the things described here. Hacks that are specific to a particular operating system say so.

Using Code Examples

This book is here to help you get your job done. In general, you can use the code in this book in your programs and documentation (all the code is available for download in a zip archive from http://examples.oreilly.com/pdfhks/; most of the hacks assume these example files are in place in a working directory). You do not need to contact us for permission unless you're reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. However, selling or distributing a CD-ROM of examples from this book does require permission. Answering a question by citing this book and quoting an example does not require permission, but incorporating a significant number of examples from this book into your product's documentation does require permission.

We appreciate, but do not require, attribution when using code. An attribution usually includes the title, author, publisher, and ISBN. For example: "PDF Hacks by Sid Steward. Copyright 2004 O'Reilly Media, Inc., 0-596-00655-1."

If you feel your use of code examples falls outside fair use or the permission given here, feel free to contact us at .

How This Book Is Organized

This book is divided into seven chapters, each of which is described briefly here:

Chapter 1, Consuming PDF

This chapter discusses various tools for reading PDF files and teaches you how to make these tools more convenient to use. It also describes ways in which you can get the information you want out of Acrobat and into other applications.

Chapter 2, Managing a Collection

Reading and working with individual PDF files often leads to having a collection of files. This chapter provides tools and techniques for keeping track of what's in all those files and for presenting them to users looking for information.

Chapter 3, Authoring and Self-Publishing: Hacking Outside the PDF

Most PDFs aren't created directly as PDFs; they start in other formats and then are converted to PDF. PDFs also feed into a lot of other processes, from printing to e-book distribution. This chapter examines techniques for creating rich sources of PDF content and looks at things you can do with PDF files outside of the usual viewing and printing contexts.

Chapter 4, Creating PDF and Other Editions

There are lots of different ways to create PDF files and useful ways to supplement your PDFs with the same information in different formats. This chapter looks at a variety of tools and techniques you can use to create your own PDFs.

Chapter 5, Manipulating PDF Files

Once you have PDF files, you might want to do more to them. This chapter shows you how to perform such techniques as splitting PDF files, encrypting documents, attaching data, reducing file sizes, building indexes, and working with bookmarks.

Chapter 6, Dynamic PDF

PDF files don't have to be static representations of documents created once. This chapter shows you how to make PDF itself more active through its forms capabilities and teaches you how to use a variety of different tools to generate PDFs from your data on the fly.

Chapter 7, Scripting and Programming Acrobat

Adobe's Acrobat family of applications remains at the heart of much PDF creation and processing. This chapter includes techniques for automating common tasks and stretching these applications in new and different ways.

Conventions Used in This Book

The following is a list of typographical conventions used in this book:

Italic

Used to indicate new terms, URLs, filenames, file extensions, directories, and program names, and to highlight comments in examples. For instance, a path in the filesystem will appear as C:\Hacks\examples or /usr/sid/hacks/examples.

Constant width

Used to show code examples, XML markup, Java™ package or C# namespace names, commands and options, or output from commands.

Constant width bold

Used in examples to show emphasis or commands and other text that should be typed literally.

Constant width italic

Used in examples and tables to show text that should be replaced with user-supplied values.

You should pay special attention to notes set apart from the text with the following icons:

Tip

This is a tip, a suggestion, or a general note. It contains useful supplementary information about the topic at hand.

Warning

This is a warning or a note of caution.

The thermometer icons, found next to each hack, indicate the relative complexity of the hack:

beginner
moderate
expert

How to Contact Us

We have tested and verified the information in this book to the best of our ability, but you might find that some software features have changed over time or even that we have made some mistakes. As a reader, you can help us to improve future editions of this book by sending us your feedback. Let us know about any errors, inaccuracies, bugs, misleading or confusing statements, and typos that you find anywhere in this book.

Also, please let us know what we can do to make this book more useful to you. We take your comments seriously and will try to incorporate reasonable suggestions into future editions. You can write to us at:

O'Reilly Media, Inc.
1005 Gravenstein Hwy. N.
Sebastopol, CA 95472
(800) 998-9938 (in the U.S. or Canada)
(707) 829-0515 (international/local)
(707) 829-0104 (fax)

To ask technical questions or to comment on the book, send email to:

The O'Reilly web site for PDF Hacks offers a zip archive of example files, errata, a place to write reader reviews, and much more. You can find this page at:

http://www.oreilly.com/catalog/pdfhks/

You can also find information about this book at:

http://www.pdfhacks.com

For more information about this and other books, see the O'Reilly web site:

http://www.oreilly.com

Got a Hack?

To explore other books in the Hacks series or to contribute a hack online, visit the O'Reilly hacks web site at:

http://hacks.oreilly.com
