Treating Files as Streams

We’ve seen that reading the entire contents of a file isn’t always the best solution. For a start, it forces us to keep the entire contents of the file in memory. With smaller files this might merely be wasteful, but with much larger ones it can be outright impossible. Imagine wanting to process a 50 GB file on a computer that has only 4 GB of memory: the file simply wouldn’t fit.

The solution is to treat the file as a stream. Instead of reading from the beginning of the file to the end in one go, and keeping all of that information in memory, we read only a small amount at a time. We might read the first line, for example, then discard it and move on to the second, then discard ...
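
As a rough illustration of the idea, the sketch below walks a file one line at a time using Ruby’s `File.foreach`, so that only the current line needs to be held in memory. The helper name `longest_line_length` and the sample data are hypothetical, chosen only for this example; it is one possible way to stream a file, not necessarily the approach the rest of the chapter takes.

```ruby
require "tempfile"

# Hypothetical helper: returns the length of the longest line in a file
# without ever loading the whole file into memory.
def longest_line_length(path)
  longest = 0
  # File.foreach yields one line at a time. Once the block returns,
  # that line is no longer referenced and can be garbage collected.
  File.foreach(path) do |line|
    length = line.chomp.length
    longest = length if length > longest
  end
  longest
end

# Demo with made-up sample data written to a temporary file.
Tempfile.create("stream_demo") do |file|
  file.write("short\na much longer line\nmid\n")
  file.flush
  puts longest_line_length(file.path)
end
```

Because each line is processed and then dropped, memory use depends on the length of the longest line rather than on the size of the file, which is what makes this style workable for files far larger than available memory.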

