SolrCloud – What happens when ZooKeeper fails?

One of the questions I tend to get is what happens to a SolrCloud cluster when ZooKeeper fails. Of course, we are not talking about the failure of a single ZooKeeper instance, but about the whole ensemble being inaccessible, so that no quorum is present. Because the answer to this question is very easy to verify, I decided to write a short blog post showing what happens when ZooKeeper fails.

Test environment

The test environment was very simple:

  • A single virtual machine running Linux
  • A single instance of ZooKeeper (which will be suitable for our test)
  • Two Solr instances with a single collection deployed
  • Solr 4.6

In order to create our test collection I’ve uploaded the configuration to ZooKeeper and used the following command:
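A collection like this can be created through the Solr Collections API. The snippet below is only a minimal sketch of how such a request URL could be built: the host, the collection name `collection1`, the configuration name, and the choice of two shards with a replication factor of one are all assumptions made for illustration, not values taken from the test.

```python
from urllib.parse import urlencode


def create_collection_url(solr_base, name, num_shards, replication_factor, config_name):
    """Build a Collections API CREATE URL. Nothing is sent to Solr here."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
    }
    return f"{solr_base}/admin/collections?{urlencode(params)}"


# Hypothetical values: the collection and configuration names are assumptions.
url = create_collection_url("http://localhost:8983/solr", "collection1", 2, 1, "collection1")
print(url)
```

Opening the printed URL in a browser, or passing it to `urllib.request.urlopen`, would send the request to a running Solr node.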

The cloud view of the example cluster was as follows:


Test data indexing

The next step in our test will be indexing. We will index a few example documents that are provided with Solr in the exampledocs directory. The following commands were used to index the data:
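Indexing an XML file comes down to posting its contents to the update handler. As a rough sketch, assuming a collection named `collection1` on `localhost:8983` (both hypothetical) and a tiny inline document standing in for a file from the exampledocs directory, the request could be prepared like this:

```python
import urllib.request


def build_update_request(solr_base, collection, xml_bytes, commit=True):
    """Prepare (but do not send) a POST to the collection's update handler."""
    url = f"{solr_base}/{collection}/update"
    if commit:
        url += "?commit=true"  # make the documents visible right away
    return urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


# Stand-in document; a real run would read a file such as one from exampledocs.
doc = b'<add><doc><field name="id">example-1</field></doc></add>'
req = build_update_request("http://localhost:8983/solr", "collection1", doc)

# To actually send it: urllib.request.urlopen(req, timeout=30)
```

The request is built separately from sending so that the same object can be reused for later attempts.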

After executing the above commands we get the following number of documents:

  • The whole collection holds 5 documents
  • The shard located on the Solr instance running on port 8983 has 1 document
  • The shard located on the Solr instance running on port 7983 has 4 documents
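Per-shard counts like these can be checked by asking each node for its local documents only, using Solr's `distrib=false` parameter. The following is a minimal sketch under assumptions: the collection name `collection1` is hypothetical, and the sample response body is hand-written in the shape of Solr's JSON output rather than captured from the test.

```python
import json
from urllib.parse import urlencode


def count_url(solr_base, collection, distrib):
    """Build a match-all query URL that returns only the hit count."""
    params = {
        "q": "*:*",
        "rows": 0,  # we only need numFound, not the documents
        "wt": "json",
        "distrib": "true" if distrib else "false",
    }
    return f"{solr_base}/{collection}/select?{urlencode(params)}"


def num_found(response_body):
    """Extract the hit count from a Solr JSON response body."""
    return json.loads(response_body)["response"]["numFound"]


# Hypothetical node address and collection name.
local_count_url = count_url("http://localhost:7983/solr", "collection1", distrib=False)

# Hand-written sample body, only to show which field is read.
sample_body = '{"response": {"numFound": 4, "start": 0, "docs": []}}'
print(num_found(sample_body))
```

With `distrib=true` the same URL would report the count for the whole collection instead of a single shard.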

Querying with ZooKeeper not present

Now we move on to the next step: we shut down our ZooKeeper instance and try to run a simple query by sending the following command:
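Before running the query it can be worth confirming that ZooKeeper is really unreachable. A minimal sketch, assuming ZooKeeper listened on `localhost` and its default client port 2181 (the post does not state the address), is a plain TCP connection attempt:

```python
import socket


def zookeeper_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed address; adjust to wherever ZooKeeper was running.
    print(zookeeper_reachable("localhost", 2181))
```

This only checks that the port accepts connections; it says nothing about whether an ensemble has a quorum.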

As a result we get the following response:

As we can see, Solr responded correctly. This is because Solr already has the clusterstate.json file cached. To search, Solr doesn’t need to update that file, so searching should work, and as we could see, it does.

Indexing with failed ZooKeeper

Without starting our ZooKeeper instance again, we try to run the following command:

The above command should result in indexing the contents of the hd.xml file. After a fairly long wait, Solr responds with the following information:

So, as you can see, we are not able to index data without a working ZooKeeper ensemble.

Starting ZooKeeper again

So let’s see what happens when we start our ZooKeeper instance again without restarting the Solr nodes. After starting ZooKeeper, we run the same indexing command once again:

And this time the response is different:

As we can see, the indexing request was successful this time. This allows us to assume that Solr re-established its connection to ZooKeeper. This can also be seen in the Solr and ZooKeeper logs.
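Since indexing starts working again once ZooKeeper is back, a client can simply retry failed updates. The helper below is a hypothetical sketch of that idea, not something used in the test: it treats every exception as retryable (an assumption) and waits a little longer after each failed attempt.

```python
import time


def index_with_retry(send, attempts=5, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out.

    send is any zero-argument callable that raises on failure.
    Returns a tuple of (result, number_of_attempts_used).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return send(), attempt
        except Exception as exc:  # Assumption: every failure is worth retrying.
            last_error = exc
            if attempt < attempts:
                sleep(delay)
                delay *= backoff  # wait longer before the next attempt
    raise RuntimeError(f"indexing failed after {attempts} attempts") from last_error
```

Here `send` would wrap whatever HTTP call performs the update, for example a `lambda` around `urllib.request.urlopen` with a prepared request.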

Short summary

As you can see, our short test allowed us to see what happens when the ZooKeeper ensemble fails and what we can expect from Solr in such rare cases. I hope this blog entry clears up some doubts about SolrCloud and its usefulness.

Please also remember that during the test the cluster state did not change: all shards were accessible and working. In the next blog entry about SolrCloud we will see what happens when shards or replicas fail while ZooKeeper is down.

One thought on “SolrCloud – What happens when ZooKeeper fails?”

  • 24 August 2018 at 14:06

    I have a question: if ZooKeeper is down and we update a Solr document, what is Solr’s behavior when we then search for that updated document?

    Does it return 200 OK or something else?
