Index Locking and Concurrency Issues

This section deals with one of the most confusing issues for users new to Ferret: index locking. Ferret was designed to be used in a multiprocess environment, so it comes with a built-in index locking mechanism that prevents two processes from modifying the same index at the same time. Working in a single-process, single-threaded environment doesn’t mean you can forget about index locking, either. Let’s say, for example, that you want to delete a set of documents from the index by their document numbers, but you also have an IndexWriter open for adding documents to the index. You will need to close the IndexWriter before committing the deletions and then reopen it afterward. Why, you ask? To answer this, you have to understand how index locking works.

The Ferret index currently uses two locks: a commit lock and a write lock. You need to know which operations take which of these locks, and when. Two classes can obtain these locks: IndexWriter and IndexReader. The IndexWriter obtains the write lock as soon as it is opened and keeps it until it is closed. (Hence the importance of closing IndexWriters when you have finished with them.) As a result, you can only ever have one IndexWriter open on an index at any time, and you can’t perform any write operations with an IndexReader while there is an IndexWriter open on the index. IndexWriter will also obtain the commit lock when you optimize, commit, ...
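The write-lock rules described so far can be illustrated with a small toy model. This is a sketch only and does not use Ferret's API: ToyIndex, ToyWriter, ToyReader, and LockError are hypothetical names invented for the illustration, the commit lock is left out, and the point at which the toy reader asks for the write lock (its first delete) is a simplifying assumption.

```ruby
# Toy model of the write lock. NOT Ferret's API: every class here is
# hypothetical and exists only to illustrate the rules in the text.
class LockError < StandardError; end

# Stands in for an index and remembers which object holds the write lock.
class ToyIndex
  attr_reader :docs

  def initialize
    @docs = {}                # document number => content
    @next_doc_num = 0
    @write_lock_owner = nil
  end

  def obtain_write_lock(owner)
    unless @write_lock_owner.nil? || @write_lock_owner.equal?(owner)
      raise LockError, "write lock is held by another object"
    end
    @write_lock_owner = owner
  end

  def release_write_lock(owner)
    @write_lock_owner = nil if @write_lock_owner.equal?(owner)
  end

  def add(content)
    @docs[@next_doc_num] = content
    @next_doc_num += 1
  end
end

# Takes the write lock when opened and keeps it until closed.
class ToyWriter
  def initialize(index)
    @index = index
    @index.obtain_write_lock(self)
  end

  def <<(content)
    @index.add(content)
    self
  end

  def close
    @index.release_write_lock(self)
  end
end

# Reading needs no lock; deleting is a write operation and does.
class ToyReader
  def initialize(index)
    @index = index
    @pending = []
  end

  def delete(doc_num)
    # Assumption of this sketch: the lock is requested on the first delete.
    @index.obtain_write_lock(self)
    @pending << doc_num
  end

  def commit
    @pending.each { |n| @index.docs.delete(n) }
    @pending.clear
    @index.release_write_lock(self)
  end
end

index  = ToyIndex.new
writer = ToyWriter.new(index)
writer << "first" << "second"

reader = ToyReader.new(index)
begin
  reader.delete(0)               # raises LockError: the writer is still open
rescue LockError => e
  puts "blocked: #{e.message}"
end

writer.close                     # release the write lock
reader.delete(0)
reader.commit                    # apply the deletion and release the lock
writer = ToyWriter.new(index)    # reopen the writer and carry on
writer << "third"
writer.close
```

The sequence at the end mirrors the earlier scenario: the deletion is refused while the writer is open, succeeds once the writer has been closed, and the writer is then reopened to continue adding documents.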
