Solr Optimistic Concurrency Unlocked!

If multiple clients update the same documents, you must ensure that a newer version of a document is never overwritten by an older one. This calls for concurrency control, which is the process of managing simultaneous updates to documents.

There are two approaches to the concurrency problem: pessimistic and optimistic. As the name suggests, the pessimistic approach assumes that conflicts are frequent. It therefore locks the document for the duration of a transaction, and any other request for that document must wait, or is declined, until the transaction completes. If your documents are transactional, then an RDBMS is the way to go.

Pessimistic locking has inherent disadvantages: it adds overhead and makes clients wait. Solr, and perhaps other NoSQL systems, take the optimistic approach instead. They accept that conflicts can occur but expect them to be very rare, so they don't lock the document. Instead, they keep track of update operations, and if they find that two users are trying to update the same document simultaneously, one of the requests is rejected and that user gets an error message.
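The optimistic idea can be illustrated with a toy in-memory store. This is only a minimal sketch and not how Solr is implemented internally; the names VersionedStore, VersionConflict, read and write are hypothetical and chosen purely for illustration. A writer reads a document together with its version, and the write is accepted only if that version is still current.

```python
import threading


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of the document."""


class VersionedStore:
    """Toy in-memory store illustrating optimistic concurrency (not Solr code)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, document)
        # Guards only the short compare-and-write step, never a whole transaction.
        self._guard = threading.Lock()

    def read(self, doc_id):
        # Read: no document lock is taken; the caller gets the version and a copy.
        version, doc = self._docs[doc_id]
        return version, dict(doc)

    def write(self, doc_id, doc, expected_version):
        with self._guard:
            # A missing document is treated as version 0 in this sketch.
            current_version = self._docs.get(doc_id, (0, None))[0]
            # Validate: reject the write if someone else updated the document first.
            if current_version != expected_version:
                raise VersionConflict(
                    f"expected version {expected_version}, found {current_version}"
                )
            # Write: store the document under a new, larger version.
            new_version = current_version + 1
            self._docs[doc_id] = (new_version, dict(doc))
            return new_version


if __name__ == "__main__":
    store = VersionedStore()
    store.write("1234", {"article": "Searching Solr"}, expected_version=0)
    version_a, doc_a = store.read("1234")  # client A reads the document
    version_b, doc_b = store.read("1234")  # client B reads the same version
    store.write("1234", {**doc_a, "abstract": "from A"}, version_a)  # accepted
    try:
        store.write("1234", {**doc_b, "abstract": "from B"}, version_b)
    except VersionConflict as err:
        print("client B rejected:", err)
```

In this sketch neither client ever waits for the other; the slower one simply finds out at write time that its copy is stale.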

A read request doesn't take part in concurrency control. It is given the most recent document available, which can be slightly out of date at times.

Optimistic concurrency generally proceeds in three phases: READ, VALIDATE and WRITE. Let's see how Solr implements it.

Solr implements optimistic concurrency using the _version_ field, which is added to each document and is defined by default in schema.xml. Remember that field names that start and end with an underscore are reserved in Solr and can have a special meaning, so never create a field with a name like _version_ for some other purpose.

To use optimistic concurrency, you need to provide the additional field _version_ along with your request to update or delete a document. A sample request looks as follows:

$ curl http://localhost:8983/solr/update -H 'Content-type:text/xml' -d '
<add>
<doc>
<field name="id">1234</field>
<field name="article">Searching Solr</field>
<field name="abstract">This contains the abstract</field>
<field name="_version_">12345678776878</field>
</doc>
</add>'

You can also provide the version information as a request parameter instead of as a field in the document:

http://localhost:8983/solr/update?_version_=12345678776878

Once Solr receives the update request along with the _version_ information, it reads the existing document with the same unique key and compares the number in its _version_ field with the _version_ number in the request. The validation follows these rules:

_version_ > 1: Document must exist and both versions must match exactly
_version_ = 1: Document must exist
_version_ < 0: Document must not exist
_version_ = 0: No version check; the document is overwritten
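The rules above can be written down as a small function. This is only a sketch of the decision table, not Solr's own code; the function name check_version and its parameters are hypothetical.

```python
from typing import Optional


def check_version(requested: int, existing: Optional[int]) -> bool:
    """Return True if an update may proceed under the rules listed above.

    requested: the _version_ value sent with the update request.
    existing:  the _version_ of the stored document, or None if there is none.
    """
    if requested > 1:
        # Both versions must match exactly, so the document must also exist.
        return existing is not None and existing == requested
    if requested == 1:
        # Any existing document is acceptable, whatever its version.
        return existing is not None
    if requested < 0:
        # The document must not exist yet.
        return existing is None
    # requested == 0: no check at all, the document is simply overwritten.
    return True
```

For example, check_version(12345678776878, 12345678776878) passes, while check_version(-1, 12345678776878) fails because the document already exists.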

If the validation is successful, the document is indexed with a new _version_ greater than the previous one. If the validation fails, Solr responds with HTTP status code 409 (version conflict).
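A client usually reacts to a 409 by reading the document again, re-applying its change and resending the request. The following is a minimal sketch of that loop, written with Python's standard library. It assumes the same URL as the sample request above and a Solr version whose /update handler accepts a JSON array of documents. The helper names (build_update_request, send_update, update_with_retry) and the caller-supplied read_doc and modify functions are hypothetical.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

SOLR_URL = "http://localhost:8983/solr"  # assumed: same host and port as the sample request


def build_update_request(doc, version, base_url=SOLR_URL):
    """Build a JSON update request that carries _version_ as a request parameter."""
    query = urllib.parse.urlencode({"_version_": version})
    body = json.dumps([doc]).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/update?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_update(doc, version, base_url=SOLR_URL):
    """Send the update; return True on success, False on a 409 version conflict."""
    try:
        with urllib.request.urlopen(build_update_request(doc, version, base_url)):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 409:
            return False
        raise  # any other HTTP error is not a version conflict


def update_with_retry(read_doc, modify, send=send_update, max_attempts=3):
    """Re-read, re-apply the change and resend until the version check passes."""
    for _ in range(max_attempts):
        doc = read_doc()                # must return the document including _version_
        version = doc.pop("_version_")  # sent as a parameter, not as a field
        if send(modify(doc), version):
            return True
    return False
```

Here read_doc is whatever function the application uses to fetch the current document, and modify applies the intended change to it. Limiting the number of attempts keeps a client from looping forever on a heavily contended document.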

To use this feature, you always need to provide the _version_ information along with the request. If it is not provided, the existing document is overwritten, unless the request is an Atomic Update.