Chapter 5. Administration Tips

Tip #39: Manually clean up your chunks collections

GridFS keeps file contents in a collection of chunks, called fs.chunks by default. Each document in the files collection points to one or more documents in the chunks collection. It’s good to check every once in a while to make sure that there are no “orphan” chunks, that is, chunks floating around with no link to a file. Orphans can occur if the database was shut down in the middle of saving a file (the fs.files document is written after the chunks).

To check over your chunks collection, choose a time when there’s little traffic (as you’ll be loading a lot of data into memory) and run something like:

> var cursor = db.fs.chunks.find({}, {"_id" : 1, "files_id" : 1});
> while (cursor.hasNext()) {
...     var chunk = cursor.next();
...     // a chunk is orphaned if no fs.files document has its files_id
...     if (db.fs.files.findOne({_id : chunk.files_id}) == null) {
...         print("orphaned chunk: " + chunk._id);
...     }
... }

This will print out the _ids for all orphaned chunks.
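The loop above runs one findOne per chunk, so a file with many chunks is looked up many times. As a rough sketch of an alternative, the function below checks each distinct files_id only once. The name findOrphanedFileIds is hypothetical (it is not part of MongoDB), and the approach assumes the list of distinct files_id values is small enough to be returned in a single result, which may not hold for very large collections.

```javascript
// Sketch only: findOrphanedFileIds is a hypothetical helper, not a MongoDB API.
// It returns the files_id values found in fs.chunks that have no matching
// document in fs.files.
function findOrphanedFileIds(db) {
    // One entry per file rather than one per chunk.
    // Assumption: the distinct list fits in a single result document.
    var fileIds = db.fs.chunks.distinct("files_id");
    var orphaned = [];
    fileIds.forEach(function (id) {
        // findOne returns null when no document matches.
        if (db.fs.files.findOne({_id : id}, {_id : 1}) === null) {
            orphaned.push(id);
        }
    });
    return orphaned;
}
```

In the shell, findOrphanedFileIds(db) would return an array of files_id values; every chunk carrying one of those values is an orphan candidate.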

Now, before you go through and delete all of the orphaned chunks, make sure that they are not parts of files that are currently being written! You should check db.currentOp() and the fs.files collection for recent uploadDates.
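One way to do both checks in a single step is sketched below. The helper name recentGridFSActivity and its minutes parameter are made up for illustration, and the sketch assumes the default fs prefix and that in-progress operations report their namespace in an ns field.

```javascript
// Sketch only: recentGridFSActivity is a hypothetical helper, not a MongoDB API.
// It gathers the two signals mentioned above: files uploaded within the last
// `minutes` minutes and in-progress operations touching the GridFS collections.
function recentGridFSActivity(db, minutes) {
    var cutoff = new Date(Date.now() - minutes * 60 * 1000);

    // Files whose upload finished recently.
    var recentFiles = db.fs.files.find(
        {uploadDate : {$gt : cutoff}},
        {filename : 1, uploadDate : 1}
    ).toArray();

    // Operations currently running against <database>.fs.chunks or
    // <database>.fs.files (assumes the default "fs" prefix).
    var activeOps = db.currentOp().inprog.filter(function (op) {
        return typeof op.ns === "string" && /\.fs\.(chunks|files)$/.test(op.ns);
    });

    return {recentFiles : recentFiles, activeOps : activeOps};
}
```

If either array in the result is non-empty, it is probably safer to hold off on deleting anything and check again later.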

Tip #40: Compact databases with repair

In Tip #31: Do not depend on repair to recover data, we cover why you usually shouldn’t use repair to actually repair your data (unless you’re in dire straits). However, repair can be used to compact databases.

Note

Hopefully this tip will become irrelevant ...
