Solr 5.2: quick look at Solr backup functionality

With the latest releases of Solr – 5.2 and 5.2.1 – we were given a new backup API based on the replication handler. Because this functionality has been long anticipated by users, we decided to give it a quick look.

In order to test the new functionality we will do a very simple test:

  1. We will launch Solr in SolrCloud mode,
  2. We will index a few documents,
  3. We will make a backup using the new API,
  4. We will index a few more documents,
  5. Finally, we will try to restore the backup made in step 3.

Let’s start.

Starting Solr

To start Solr in SolrCloud mode, we've used the bin/solr script with the following command:
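The command snippet itself has not survived in this copy of the post. A likely equivalent for the single-node, single-shard setup described below (the exact flags used originally are an assumption) is:

```shell
# Start a single Solr node in SolrCloud mode with embedded ZooKeeper
# (a sketch - the exact invocation from the original post is missing):
bin/solr start -c

# Create the gettingstarted collection with a single shard:
bin/solr create -c gettingstarted -shards 1 -replicationFactor 1
```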

For the purpose of the tests we need a single SolrCloud instance, with a single, empty collection (we will use the gettingstarted one provided with Solr) and a single shard.

Our cluster topology looked as follows:

[Screenshot: cluster topology, 2015-06-21]

Indexing data

Indexing data is as simple as starting Solr. Because we are using the example gettingstarted collection, we can send documents without a predefined structure and Solr will adjust the schema to what we need. So, for the purpose of the tests, we will index two documents using the following command:
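The original snippet is missing; a sketch with hypothetical example documents (the field names and values are assumptions) would look like this:

```shell
# Index two sample documents; with the data-driven configs Solr
# adds the fields to the schema on the fly:
curl -X POST 'http://localhost:8983/solr/gettingstarted/update?commit=true' \
  -H 'Content-Type: application/json' \
  -d '[
        {"id": "1", "name": "First document"},
        {"id": "2", "name": "Second document"}
      ]'
```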


Making a backup

Making a backup is again very simple. We just need to run the following command:
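The command did not survive in this copy; based on the description that follows, it would have looked roughly like this (the backup directory is a placeholder):

```shell
# Trigger a backup named "test" via the replication handler;
# Solr will store it as snapshot.test under the given location:
curl 'http://localhost:8983/solr/gettingstarted/replication?command=backup&name=test&location=/path/to/backups'
```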

The above command tells Solr that we want to make a backup of our collection; the backup will be called snapshot.test (Solr prepends the snapshot. prefix to the value of the name parameter). By default – when we do not provide the desired directory using the location parameter – the backup will be created in the collection data directory. In our example, we've provided that parameter and used an absolute path to tell Solr where the backup should be placed.

The response from Solr should be fast and look similar to the following one:

Of course, if our collection is large, the time needed to create the backup will be significantly longer. We can check the status of the backup creation by running the following command:
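A sketch of the status check, assuming the same collection name as above:

```shell
# The replication handler's details command reports, among other
# things, the status of the most recent backup:
curl 'http://localhost:8983/solr/gettingstarted/replication?command=details'
```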

Indexing more data

The next step of our simple test is another round of indexing – this time adding two new documents using the following command:
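Again the snippet is missing; a sketch with two more hypothetical documents:

```shell
# Index two more sample documents on top of the backed-up index:
curl -X POST 'http://localhost:8983/solr/gettingstarted/update?commit=true' \
  -H 'Content-Type: application/json' \
  -d '[
        {"id": "3", "name": "Third document"},
        {"id": "4", "name": "Fourth document"}
      ]'
```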

After running the above command, if we run a simple query like the following one:
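The query snippet is missing here; a match-all query such as the following would do:

```shell
# Match all documents; numFound in the response gives the total count:
curl 'http://localhost:8983/solr/gettingstarted/select?q=*:*&wt=json'
```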

Solr should respond and inform us that we have four documents in total:

Restoring our backup

Now let's try restoring our backup and see how many documents we will have after that operation. To restore the backup we've created, we run the following command:
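The restore command did not survive in this copy; it would have looked roughly like this, mirroring the parameters of the backup call (the location is again a placeholder):

```shell
# Restore the snapshot.test backup; name and location should match
# the values used when the backup was created:
curl 'http://localhost:8983/solr/gettingstarted/replication?command=restore&name=test&location=/path/to/backups'
```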

If everything went well, Solr's response should be similar to the following one:

So let’s now check how many documents are present in our collection by running the following command:
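A sketch of the count check, again assuming the gettingstarted collection:

```shell
# rows=0 - we only care about numFound, not the documents themselves:
curl 'http://localhost:8983/solr/gettingstarted/select?q=*:*&rows=0&wt=json'
```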

As we can see, the number of documents in the collection is 2 – the two documents indexed after the backup are gone – which means that our backup has been properly restored.

Short summary

As we can see, the Solr backup mechanism works flawlessly; however, we should remember a few things. When running a few Solr instances on the same physical machine, we should avoid making backups to the same absolute path – we can end up with shard data being overwritten. Apart from that, it's good to finally have fully working and easy to use backup functionality 🙂

7 thoughts on “Solr 5.2: quick look at Solr backup functionality”

  • 3 February 2016 at 17:31

    hi there, this is great information. I was able to backup and restore a core on a single node.

    I have 3 solr nodes and 3 Zookeeper nodes cluster setup for replication.
    After the backup and restore operations, the nodes do not pick up the restored version of data. In other words the data does not get replicated. Is this a known issue?

    Any thoughts?
    Thank you,

    • 6 February 2016 at 09:24

      Have you tried running the commit command after restoring the backup?

  • 23 February 2017 at 16:18

    I know this backs up the index but does it back up stored (not indexed) fields too?

    • 26 February 2017 at 11:52

      It is a binary copy of all the index files, so it doesn’t matter if the fields are indexed, not indexed, have doc values or anything like that. Solr just copies everything as is.

  • 13 July 2017 at 11:32

    Nice. I am new and need your guidance. My target is to use SolrCloud on 11 nodes: 3 for ZooKeeper and the remaining 8 for sharding (indexing and replication). I want to partition my documents by date and load each date onto a specific shard, like a date partition – one date per shard. How can I perform this? Please guide me.

    • 17 July 2017 at 22:22

      For time-based data, having a collection per date would be a better idea – way more flexible compared to a single collection with multiple shards which uses routing.

  • 12 November 2017 at 05:20

    Thanks for this article.
    Though http://server:8983/solr/core/replication?command=details

    returns “status” as “success”, it is only after a while that the updates to the filesystem are finished (i.e. files are being written to the snapshot dir even after success is returned). Sometimes/often the “fileCount” returned in the response does not match the number of files in the snapshot dir. So, how does one determine when a backup is really complete?

    Appreciate your response.

