Backing Up Your Index

Did you ever wonder if you can create a backup of your index with the tools available in Solr? For example, after every commit or optimize operation? Or maybe you would like to create backups with an HTTP API call? Let's see what possibilities Solr has to offer.

The Beginning

We decided to write about index backups even though this functionality is fairly simple. We noticed that many people tend to forget about it, not only when it comes to Apache Solr. We hope that this blog entry will help you remember about the backup creation functionality when you need it. But now, let's start from the beginning: before we started the tests, we looked at the directory where Solr keeps its indices and this is what we saw:
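
Of course, the exact path and contents depend on your setup, but with a single core and a default configuration the data directory should look more or less like this:

```
# illustrative listing - the actual path depends on your solr.home and core setup
$ ls -1 data/
index
```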

Manual Backup

In order to create a backup of your index using the HTTP API, you need to have the replication handler configured. If you do, you just need to send the command parameter with the backup value to the master server's replication handler, for example like this:
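
A plain HTTP call is enough here. Assuming the default port and a single-core setup (adjust the host, port, and core name to your deployment), it could look like this:

```
# illustrative - trigger a backup of the current index on the master
curl 'http://localhost:8983/solr/replication?command=backup'
```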

The above will tell Solr to create a new backup of the current index. Let's now take a look at how the directory where the indices live looks after running the above command:
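
After the backup finishes, the listing should look more or less like the one below (the snapshot directory name contains the creation timestamp, so yours will differ):

```
# illustrative listing
$ ls -1 data/
index
snapshot.20120812201917
```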

As you can see, there is a new directory created – snapshot.20120812201917. We can assume that we got what we wanted 🙂

Automatic Backup

In addition to manual backup creation, you can also configure Solr to create backups after commit or optimize operations. Please remember, though, that if your index is changing rapidly it is usually a bad idea to create a backup after each commit operation. But let's get back to automatic backups. In order to configure Solr to create backups for us, you need to add the following line to the replication handler configuration:
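
For example, to have Solr create a backup after every commit, something along these lines should work (the backupAfter parameter also accepts the optimize value):

```
<!-- illustrative: create an index backup after every commit -->
<str name="backupAfter">commit</str>
```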

So, the full replication handler configuration (on the master server) would look like this:
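
A minimal sketch of such a configuration could look like the following (the replicateAfter entry is there only as an example of a typical master setup and is not required for backups):

```
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- illustrative master configuration -->
    <str name="replicateAfter">commit</str>
    <!-- create an index backup after every commit -->
    <str name="backupAfter">commit</str>
  </lst>
</requestHandler>
```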

After sending two commit operations, our directory with indices looks like this:
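
It could look similar to the listing below (the timestamps here are, again, only illustrative):

```
# illustrative listing - one snapshot per commit
$ ls -1 data/
index
snapshot.20120812202701
snapshot.20120812202718
```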

As you can see, Solr did what we wanted it to do.

Keeping Order

It is possible to control the maximum number of backups that should be stored on disk. In order to configure that number, you need to add the following line to your replication handler configuration:
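
For example, to keep at most ten backups, something like the following should do (we are assuming a Solr version whose ReplicationHandler supports the maxNumberOfBackups initialization parameter; alternatively, a numberToKeep request parameter can be passed along with the backup command):

```
<!-- illustrative: keep at most ten index backups on disk -->
<str name="maxNumberOfBackups">10</str>
```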

The above configuration value tells Solr to keep a maximum of ten backups of your index. Of course, you can delete the created backups (manually, for example) if you don't need them anymore.

8 thoughts on “Backing Up Your Index”

  • 14 August 2012 at 18:58

    Thanks, very helpful! How is the backup restored?

  • 27 December 2012 at 14:55

    Awesome article. Very helpful as always.

  • 27 December 2012 at 14:58

    Is it possible to configure a backup schedule at a certain time interval?

    • 1 January 2013 at 20:40

      Actually I don't know, but it's quite easy to run the backup command from the cron table.

  • 29 January 2013 at 23:40

    Thanks, good post.
    Is there a way to determine when a manual backup has finished?

  • 29 January 2013 at 23:49

    I don’t think so, I’m afraid 🙁

  • 8 May 2013 at 21:36

    Unfortunately it's useless in production.
    You cannot find out whether a backup has finished or is still running, or even whether it failed or succeeded.
    Moreover, another API call during a backup schedules another backup and there is no way to cancel it.

  • 30 August 2018 at 13:18

    @Viktors, @gr0, @tom

    At least from Solr 4.10.2 onwards (which I’m using) you can query for the current state of your triggered backups by sending the `details` command.

    E.g. `curl 'http://localhost:8983/solr/replication?command=details'` will return an XML document which includes the following exemplary response:

    ```
    <lst name="backup">
      <str name="startTime">Thu Aug 30 14:12:47 CEST 2018</str>
      <int name="fileCount">31</int>
      <str name="status">success</str>
      <str name="snapshotCompletedAt">Thu Aug 30 14:12:47 CEST 2018</str>
    </lst>
    ```
