Introduction to PyPDF2

PyPDF2 is one of the Python modules available for extracting data from PDF documents. Because it is published on the Python Package Index (PyPI), the official Python package repository, it can be installed directly with pip.

The project page at https://pypi.org/project/PyPDF2/ shows the latest version of this module.

This module lets us extract document information and encrypt and decrypt documents. To extract metadata, we can use the PdfFileReader class and its getDocumentInfo() method, which returns a dictionary with the document's data.
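As a minimal sketch of the metadata extraction described above, the following reads the document information from a single file. It assumes a PyPDF2 release earlier than 3.0, where PdfFileReader and getDocumentInfo() are still available. The helper names read_pdf_metadata and printable_metadata are hypothetical and chosen only for this illustration, as is the path passed on the command line.

```python
import sys


def printable_metadata(info):
    """Turn a metadata mapping into sorted 'key: value' lines.

    PDF metadata keys carry a leading slash (for example '/Author'),
    which is stripped here for readability.
    """
    if not info:
        return []
    return [
        "{}: {}".format(str(key).lstrip("/"), value)
        for key, value in sorted(info.items())
    ]


def read_pdf_metadata(path):
    """Return the document information of one PDF as a plain dict.

    Assumption: PyPDF2 < 3.0 is installed (for example with
    pip install "PyPDF2<3"). The import is local so that
    printable_metadata() can be used without PyPDF2.
    """
    from PyPDF2 import PdfFileReader

    with open(path, "rb") as handle:
        reader = PdfFileReader(handle)
        info = reader.getDocumentInfo()
        if not info:
            # Some PDFs have no information dictionary at all.
            return {}
        # Copy the values while the file is still open.
        return {key: info[key] for key in info}


if __name__ == "__main__":
    # The path is supplied by the user; no particular file is assumed.
    for line in printable_metadata(read_pdf_metadata(sys.argv[1])):
        print(line)
```

Typical keys include /Author, /Creator, /Producer and /Title, although which entries are present depends on the tool that produced the document.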

The following function would allow us to obtain the information of all the PDF ...
