What is the most important change in 4.x version of Apache Solr? I think there are many of them but Solr Cloud is definitely something that changed a lot in Solr architecture. Until now, bigger installations suffered from single point of failure (SPOF) – there was only the one master server and when this server was going down, the whole cluster lose the ability to receive new data. Of course you could go for multiple masters, where a single master was responsible for indexing some part of the data, but still, there was a SPOF present in your deployment. Even if everything worked, due to commit interval and the fact that slave instances checked the presence of new data periodically, the solution was far from ideal – the new data in the cluster appeared minutes after commit.
Solr Cloud changed this behavior. In this article we will setup a new SolrCloud cluster from the scratch and we will see how it work.
Our example cluster
In our example we will use three Solr servers. Every server in the cluster is capable of handling both the index and the query requests. This is the main difference from the old-fashioned Solr architecture with single master and multiple slave servers. In the new architecture there is one additional element present: Zookeeper, which is responsible for holding configuration of the cluster and for synchronization of its work. It is crucial to understand that Solr relies on information stored in Zookeeper – if Zookeeper will fail, the whole cluster is useless. Because of this it is very important to have a fault tolerant Zookeeper ensemble and because of this we use three independent instances of Zookeeper that will form the ensemble.
As we said previously, Zookeeper is a vital part of SolrCloud cluster. Although we can use embedded Zookeeper, this is only handy for testing. For production you definitely want your Zookeeper to be installed independently from Solr and run in a different Java virtual machine process to avoid those two interrupting each other and influencing each others work.
The installation of Apache Zookeeper is straight forward and may be described by the following steps:
- Download Zookeeper archive from: http://www.apache.org/dyn/closer.cgi/zookeeper/
- Unpack downloaded archive and copy conf/zoo_sample.cfg to conf/zoo.cfg
- Modify zoo.cfg:
- Change dataDir to directory where you want to hold all cluster configuration data
- Add information about all Zookeeper servers (see below)
After mentioned changes my zoo.cfg looks like the following one:
- Copy this archive to the all servers, where Zookeeper service should be run
- Create file /var/zookeeper/data/myid with server identifier. This identifier is different for each instance (for example on zk2 this file should contain 2 number)
- Start all instances using “bin/zkServer.sh start-foreground” and verify validity of the installation
- Add “bin/zkServer.sh start” to starting scripts and make sure that operation system monitors that Zookeeper service is available.
The installation of Solr is the following:
- Download Solr archive from: http://www.apache.org/dyn/closer.cgi/lucene/solr/4.1.0
- Unpack downloaded archive
- In this tutorial we will use the ready Solr installation from the example directory and all changes are made to this example installation
- Copy archive to all servers which are the part of the cluster
- Install to Zookeeper configuration data, which will be used by the Solr cluster. For this run the first instance with:
1java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=solr1 -DzkHost=zk1:2181 -DnumShards=2 -jar start.jar
This should be run only once. The next run will use configuration from Zookeeper cluster and local configuration files are not needed.
- Run all instances using
1java –DzkHost=zk1:2181 –jar start.jar
Verify the installation
Go into administration panel on any Solr instance. For our deployment the URL should be like http://solr1:8983/solr. When you click on cloud tab, and graph, you should see something similar to the following screen shot:
Our first collection – the collection1 is divided into two shards (shard1 and shard2). Each of those shards is placed on two Solr instances (OK, on the picture you see that every Solr is placed on the same host – I have currently only one physical server available for tests – any volunteers for donation? ;)). You can see that type of the dot tell us if it is a primary shard or replica.
I hope this is the first note about solrCloud. I know it is very short and skips details and information about shards, replicas and architecture of this solution. Treat this as a simple checklist for basic, (but real) configuration of your cloud.