<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>solrcloud &#8211; Solr.pl</title>
	<atom:link href="https://solr.pl/en/tag/solrcloud-2/feed/" rel="self" type="application/rss+xml" />
	<link>https://solr.pl/en/</link>
	<description>All things to be found - Blog related to Apache Solr &#38; Lucene projects - https://solr.apache.org</description>
	<lastBuildDate>Sat, 14 Nov 2020 14:28:22 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9</generator>
	<item>
		<title>SolrCloud and query execution control</title>
		<link>https://solr.pl/en/2019/01/14/solrcloud-and-query-execution-control/</link>
					<comments>https://solr.pl/en/2019/01/14/solrcloud-and-query-execution-control/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Mon, 14 Jan 2019 14:27:53 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[control]]></category>
		<category><![CDATA[execution]]></category>
		<category><![CDATA[query]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[solrcloud]]></category>
		<guid isPermaLink="false">http://sematext.solr.pl/?p=980</guid>

					<description><![CDATA[With the release of Solr 7.0 and introduction of new replica types, in addition to the default NRT type the question appeared &#8211; can we control the queries and where they are executed? Can we tell Solr to execute the]]></description>
										<content:encoded><![CDATA[
<p>With the release of Solr 7.0 and the introduction of new replica types in addition to the default NRT type, a question appeared &#8211; can we control the queries and where they are executed? Can we tell Solr to execute the queries only on the PULL replicas, or give the TLOG replicas priority? Let&#8217;s check that out.</p>



<span id="more-980"></span>



<h2 class="wp-block-heading">Shards parameter</h2>



<p>The first control option that we have in SolrCloud is the <em>shards</em> parameter. Using it, we can directly control which shards should be used for querying. For example, we can provide a logical shard name in our query:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/select?q=*:*&amp;shards=shard1</code></pre>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/select?q=*:*&amp;shards=shard1,shard2,shard3</code></pre>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/select?q=*:*&amp;shards=localhost:6683/solr/test</code></pre>



<p>The first of the above queries will be executed only on the shards that are grouped under the logical <em>shard1</em> name. The second query will be executed on the logical shards <em>shard1</em>, <em>shard2</em> and <em>shard3</em>, while the third query will be executed on the shards of the <em>test</em> collection that are deployed on the <em>localhost:6683</em> node.</p>



<p>It is also possible to load balance across instances, for example:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/select?q=*:*&amp;shards=localhost:6683/solr/test|localhost:7783/solr/test</code></pre>



<p>The above query will be executed either on the instance running on port <em>6683</em> or on the one running on port <em>7783</em>.</p>
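<p>One practical note when trying these queries from a shell: the <em>&amp;</em> and <em>|</em> characters must be quoted, or the shell will interpret them itself. A minimal sketch, using the example hosts and ports from above:</p>

```shell
# Single quotes keep the shell from treating "&" as a background-job marker
# and "|" as a pipe; hosts and ports are the example values from this post.
URL='http://localhost:8983/solr/test/select?q=*:*&shards=localhost:6683/solr/test|localhost:7783/solr/test'
echo "$URL"
# against a live cluster one would then run:  curl "$URL"
```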



<h2 class="wp-block-heading">Shards.preference parameter</h2>



<p>While the <em>shards</em> parameter gives us some degree of control over where the query is executed, it is not exactly what we would like to have. To target a certain type of replica with it, we would have to know the physical layout of the shards, and this is not something we want to keep track of. Because of that, the <em>shards.preference</em> parameter was introduced to Solr. It allows us to tell Solr which type of replicas should have priority when executing a query.</p>



<p>For example, to tell Solr that PULL type replicas should have priority when the query is executed, one should add the <em>shards.preference</em> parameter to the query and set it to <em>replica.type:PULL</em>:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/select?q=*:*&amp;shards.preference=replica.type:PULL</code></pre>



<p>The nice thing is that we can tell Solr to use PULL replicas first and, if they are not available, fall back to TLOG replicas:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/select?q=*:*&amp;shards.preference=replica.type:PULL,replica.type:TLOG</code></pre>



<p>We can also define that PULL type replicas should be used first and, if they are not available, that local shards should have priority:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/select?q=*:*&amp;shards.preference=replica.type:PULL,replica.location:local</code></pre>



<p>In addition to the above examples, we can also define priority based on the location of the replicas. For example, if our <em>192.168.1.1</em> Solr node is way more powerful than the others and we would like to prioritize PULL replicas first and then the mentioned Solr node, we would run the following query:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/select?q=*:*&amp;shards.preference=replica.type:PULL,replica.location:http://192.168.1.1</code></pre>



<h2 class="wp-block-heading">Summary</h2>



<p>The discussed parameters, and <em>shards.preference</em> with its <em>replica.type</em> value in particular, can be very useful when we are using SolrCloud with different types of replicas. By telling Solr that we would like to prefer PULL or TLOG replicas, we can lower the query-based pressure on the NRT replicas and thus get better performance out of the whole cluster. What&#8217;s more &#8211; dividing the replicas can help us achieve query performance close to what the Solr master-slave architecture provides, without sacrificing all the goodies that come with SolrCloud itself.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2019/01/14/solrcloud-and-query-execution-control/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>SolrCloud &#8211; What happens when ZooKeeper fails?</title>
		<link>https://solr.pl/en/2013/12/02/solrcloud-what-happens-when-zookeeper-fails-2/</link>
					<comments>https://solr.pl/en/2013/12/02/solrcloud-what-happens-when-zookeeper-fails-2/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Mon, 02 Dec 2013 14:13:35 +0000</pubDate>
				<category><![CDATA[General]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[solrcloud]]></category>
		<category><![CDATA[zookeeper]]></category>
		<guid isPermaLink="false">http://sematext.solr.pl/?p=620</guid>

					<description><![CDATA[One of the questions I tend to get is what happens with SolrCloud cluster when ZooKeeper fails. Of course we are not talking about a single ZooKeeper instance failure, but the whole ensemble not being accessible and so the quorum]]></description>
										<content:encoded><![CDATA[<p>One of the questions I tend to get is what happens with a SolrCloud cluster when ZooKeeper fails. Of course, we are not talking about a single ZooKeeper instance failure, but about the whole <em>ensemble</em> not being accessible and thus the <em>quorum</em> not being present. Because the answer to this question is very easy to verify, I decided to make a simple blog post showing what happens when ZooKeeper fails.</p>
<p><span id="more-620"></span></p>
<h3>Test environment</h3>
<p>The test environment was very simple:</p>
<ul>
<li>A single virtual machine running Linux</li>
<li>A single instance of ZooKeeper (which is enough for our test)</li>
<li>Two Solr instances with a single collection deployed</li>
<li>Solr <a title="Apache Lucene and Solr 4.6" href="http://solr.pl/en/2013/11/24/apache-lucene-and-solr-4-6/">4.6</a></li>
</ul>
<p>In order to create our test collection I&#8217;ve uploaded the configuration to ZooKeeper and used the following command:
</p>
<pre class="brush:bash">curl 'http://localhost:8983/solr/admin/collections?action=CREATE&amp;name=collection1&amp;numShards=2&amp;replicationFactor=1'</pre>
<p>The cloud view of the example cluster was as follows:
</p>
<p style="text-align: center;"><a href="http://solr.pl/wp-content/uploads/2013/12/cloud_view.png"><img decoding="async" class="aligncenter  wp-image-3330" alt="cloud_view" src="http://solr.pl/wp-content/uploads/2013/12/cloud_view.png" width="495" height="44"></a></p>
<h3>Test data indexing</h3>
<p>The next step in our test will be indexing. We will index a few example documents that are provided with Solr in the <em>exampledocs</em> directory. The following commands were used to index the data:
</p>
<pre class="brush:bash">curl 'localhost:8983/solr/collection1/update?commit=true' --data-binary @mem.xml -H 'Content-type:application/xml'
curl 'localhost:8983/solr/collection1/update?commit=true' --data-binary @monitor.xml -H 'Content-type:application/xml'
curl 'localhost:8983/solr/collection1/update?commit=true' --data-binary @monitor2.xml -H 'Content-type:application/xml'</pre>
<p>After executing the above commands we get the following number of documents:</p>
<ul>
<li>The whole collection holds <strong>5</strong> documents</li>
<li>The shard located on the Solr instance running on port <strong>8983</strong> hosts <strong>1</strong> document</li>
<li>The shard located on the Solr instance running on port <strong>7983</strong> hosts <strong>4</strong> documents</li>
</ul>
<h3>Querying with ZooKeeper not present</h3>
<p>Now we go to the next step &#8211; we shut down our ZooKeeper instance and try to run a simple query by sending the following command:
</p>
<pre class="brush:bash">curl 'localhost:8983/solr/collection1/select?q=*:*&amp;indent=true'</pre>
<p>As a result we get the following response:
</p>
<pre class="brush:xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;response&gt;
 &lt;lst name="responseHeader"&gt;
  &lt;int name="status"&gt;0&lt;/int&gt;
  &lt;int name="QTime"&gt;16&lt;/int&gt;
  &lt;lst name="params"&gt;
   &lt;str name="indent"&gt;true&lt;/str&gt;
   &lt;str name="q"&gt;*:*&lt;/str&gt;
  &lt;/lst&gt;
 &lt;/lst&gt;
&lt;result name="response" numFound="5" start="0" maxScore="1.0"&gt;
&lt;doc&gt;
 &lt;str name="id"&gt;TWINX2048-3200PRO&lt;/str&gt; 
 &lt;str name="name"&gt;CORSAIR  XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail&lt;/str&gt;
 &lt;str name="manu"&gt;Corsair Microsystems Inc.&lt;/str&gt;
 &lt;str name="manu_id_s"&gt;corsair&lt;/str&gt;
 &lt;arr name="cat"&gt;
  &lt;str&gt;electronics&lt;/str&gt;
  &lt;str&gt;memory&lt;/str&gt;
 &lt;/arr&gt;
 &lt;arr name="features"&gt;
  &lt;str&gt;CAS latency 2,    2-3-3-6 timing, 2.75v, unbuffered, heat-spreader&lt;/str&gt;
 &lt;/arr&gt;
 &lt;float name="price"&gt;185.0&lt;/float&gt;
 &lt;str name="price_c"&gt;185,USD&lt;/str&gt;
 &lt;int name="popularity"&gt;5&lt;/int&gt;
 &lt;bool name="inStock"&gt;true&lt;/bool&gt;
 &lt;str name="store"&gt;37.7752,-122.4232&lt;/str&gt;
 &lt;date name="manufacturedate_dt"&gt;2006-02-13T15:26:37Z&lt;/date&gt;
 &lt;str name="payloads"&gt;electronics|6.0 memory|3.0&lt;/str&gt;
 &lt;long name="_version_"&gt;1453219034197655552&lt;/long&gt;
&lt;/doc&gt;
&lt;doc&gt;
 &lt;str name="id"&gt;VS1GB400C3&lt;/str&gt;
 &lt;str name="name"&gt;CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail&lt;/str&gt;
 &lt;str name="manu"&gt;Corsair Microsystems Inc.&lt;/str&gt;
 &lt;str name="manu_id_s"&gt;corsair&lt;/str&gt;
 &lt;arr name="cat"&gt;
  &lt;str&gt;electronics&lt;/str&gt;
  &lt;str&gt;memory&lt;/str&gt;
 &lt;/arr&gt;
 &lt;float name="price"&gt;74.99&lt;/float&gt;
 &lt;str name="price_c"&gt;74.99,USD&lt;/str&gt;
 &lt;int name="popularity"&gt;7&lt;/int&gt;
 &lt;bool name="inStock"&gt;true&lt;/bool&gt;
 &lt;str name="store"&gt;37.7752,-100.0232&lt;/str&gt;
 &lt;date name="manufacturedate_dt"&gt;2006-02-13T15:26:37Z&lt;/date&gt;
 &lt;str name="payloads"&gt;electronics|4.0 memory|2.0&lt;/str&gt;
 &lt;long name="_version_"&gt;1453219034252181504&lt;/long&gt;
&lt;/doc&gt;
&lt;doc&gt;
 &lt;str name="id"&gt;VDBDB1A16&lt;/str&gt;
 &lt;str name="name"&gt;A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM&lt;/str&gt;
 &lt;str name="manu"&gt;A-DATA Technology Inc.&lt;/str&gt;
 &lt;str name="manu_id_s"&gt;corsair&lt;/str&gt;
 &lt;arr name="cat"&gt;
  &lt;str&gt;electronics&lt;/str&gt;
  &lt;str&gt;memory&lt;/str&gt;
 &lt;/arr&gt;
 &lt;arr name="features"&gt;
  &lt;str&gt;CAS latency 3,     2.7v&lt;/str&gt;
 &lt;/arr&gt;
 &lt;int name="popularity"&gt;0&lt;/int&gt;
 &lt;bool name="inStock"&gt;true&lt;/bool&gt;
 &lt;str name="store"&gt;45.18414,-93.88141&lt;/str&gt;
 &lt;date name="manufacturedate_dt"&gt;2006-02-13T15:26:37Z&lt;/date&gt;
 &lt;str name="payloads"&gt;electronics|0.9 memory|0.1&lt;/str&gt;
 &lt;long name="_version_"&gt;1453219034255327232&lt;/long&gt;
&lt;/doc&gt;
&lt;doc&gt;
 &lt;str name="id"&gt;3007WFP&lt;/str&gt;
 &lt;str name="name"&gt;Dell Widescreen UltraSharp 3007WFP&lt;/str&gt;
 &lt;str name="manu"&gt;Dell, Inc.&lt;/str&gt;
 &lt;str name="manu_id_s"&gt;dell&lt;/str&gt;
 &lt;arr name="cat"&gt;
  &lt;str&gt;electronics&lt;/str&gt;
  &lt;str&gt;monitor&lt;/str&gt;
 &lt;/arr&gt;
 &lt;arr name="features"&gt;
  &lt;str&gt;30" TFT active matrix LCD, 2560 x 1600, .25mm dot pitch, 700:1 contrast&lt;/str&gt;
 &lt;/arr&gt;
 &lt;str name="includes"&gt;USB cable&lt;/str&gt;
 &lt;float name="weight"&gt;401.6&lt;/float&gt;
 &lt;float name="price"&gt;2199.0&lt;/float&gt;
 &lt;str name="price_c"&gt;2199,USD&lt;/str&gt;
 &lt;int name="popularity"&gt;6&lt;/int&gt;
 &lt;bool name="inStock"&gt;true&lt;/bool&gt;
 &lt;str name="store"&gt;43.17614,-90.57341&lt;/str&gt;
 &lt;long name="_version_"&gt;1453219041357332480&lt;/long&gt;
&lt;/doc&gt;
&lt;doc&gt;
 &lt;str name="id"&gt;VA902B&lt;/str&gt;
 &lt;str name="name"&gt;ViewSonic VA902B - flat panel display - TFT - 19"&lt;/str&gt;
 &lt;str name="manu"&gt;ViewSonic Corp.&lt;/str&gt;
 &lt;str name="manu_id_s"&gt;viewsonic&lt;/str&gt;
 &lt;arr name="cat"&gt;
  &lt;str&gt;electronics&lt;/str&gt;
  &lt;str&gt;monitor&lt;/str&gt;
 &lt;/arr&gt;
 &lt;arr name="features"&gt;
  &lt;str&gt;19" TFT active matrix LCD, 8ms response time, 1280 x 1024 native resolution&lt;/str&gt;
 &lt;/arr&gt;
 &lt;float name="weight"&gt;190.4&lt;/float&gt;
 &lt;float name="price"&gt;279.95&lt;/float&gt;
 &lt;str name="price_c"&gt;279.95,USD&lt;/str&gt;
 &lt;int name="popularity"&gt;6&lt;/int&gt;
 &lt;bool name="inStock"&gt;true&lt;/bool&gt;
 &lt;str name="store"&gt;45.18814,-93.88541&lt;/str&gt;
 &lt;long name="_version_"&gt;1453219045997281280&lt;/long&gt;&lt;/doc&gt;
&lt;/result&gt;
&lt;/response&gt;</pre>
<p>As we can see, Solr responded correctly. This is because Solr already has the clusterstate.json file cached, and it doesn&#8217;t need to update that file in order to search &#8211; so searching works, as we could see.</p>
<h3>Indexing with failed ZooKeeper</h3>
<p>Without turning on our ZooKeeper instance we try to run the following command:
</p>
<pre class="brush:bash">curl 'localhost:8983/solr/collection1/update?commit=true' --data-binary @hd.xml -H 'Content-type:application/xml'</pre>
<p>The above command should result in indexing the contents of the <em>hd.xml</em> file. After a longer period of time Solr responds with the following information:
</p>
<pre class="brush:xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;response&gt;
&lt;lst name="responseHeader"&gt;&lt;int name="status"&gt;503&lt;/int&gt;&lt;int name="QTime"&gt;15096&lt;/int&gt;&lt;/lst&gt;&lt;lst name="error"&gt;&lt;str name="msg"&gt;Cannot talk to ZooKeeper - Updates are disabled.&lt;/str&gt;&lt;int name="code"&gt;503&lt;/int&gt;&lt;/lst&gt;
&lt;/response&gt;</pre>
<p>So, as you can see, we are not able to index data without a working ZooKeeper <em>ensemble</em>.</p>
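<p>In practice, a client can simply retry failed updates while the ensemble recovers. A hedged sketch of such a loop &#8211; here <em>false</em> stands in for the real curl update command, which is an assumption for illustration, not part of the original test:</p>

```shell
# Retry an update a few times before giving up; "false" stands in for the
# real curl update call and always fails, so the loop exhausts its attempts.
attempts=0
max_attempts=3
until false || [ "$attempts" -ge "$max_attempts" ]; do
  attempts=$((attempts + 1))
  # a real client would back off here, e.g. sleep "$attempts"
done
echo "gave up after $attempts attempts"
```

Once the ZooKeeper connection is re-established, the same command starts succeeding again, so a bounded retry like this is usually enough for short outages.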
<h3>Starting ZooKeeper again</h3>
<p>So let&#8217;s see what happens when we start our ZooKeeper instance again without restarting the Solr nodes. After starting ZooKeeper, we run the same indexing command once again:
</p>
<pre class="brush:bash">curl 'localhost:8983/solr/collection1/update?commit=true' --data-binary @hd.xml -H 'Content-type:application/xml'</pre>
<p>And this time the response is different:
</p>
<pre class="brush:xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;response&gt;
&lt;lst name="responseHeader"&gt;&lt;int name="status"&gt;0&lt;/int&gt;&lt;int name="QTime"&gt;118&lt;/int&gt;&lt;/lst&gt;
&lt;/response&gt;</pre>
<p>As we can see, the indexing request was successful this time, which allows us to assume that Solr re-established the connection to ZooKeeper. We can confirm that in the Solr and ZooKeeper logs.</p>
<h3>Short summary</h3>
<p>As you can see, our short test allowed us to see what happens when the ZooKeeper <em>ensemble</em> fails and what we can expect from Solr in such rare cases. I hope this blog entry will clear up some doubts about SolrCloud and its usefulness.</p>
<p>Please also remember that during the test the cluster state did not change &#8211; all shards were accessible and working. We will see what happens when shards or replicas fail while ZooKeeper is down in the next blog entry about SolrCloud.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2013/12/02/solrcloud-what-happens-when-zookeeper-fails-2/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>SolrCloud HOWTO</title>
		<link>https://solr.pl/en/2013/03/11/solrcloud-howto/</link>
					<comments>https://solr.pl/en/2013/03/11/solrcloud-howto/#respond</comments>
		
		<dc:creator><![CDATA[Marek Rogoziński]]></dc:creator>
		<pubDate>Mon, 11 Mar 2013 12:52:53 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[solrcloud]]></category>
		<guid isPermaLink="false">http://sematext.solr.pl/?p=538</guid>

					<description><![CDATA[What is the most important change in 4.x version of Apache Solr? I think there are many of them but Solr Cloud is definitely something that changed a lot in Solr architecture. Until now, bigger installations suffered from single point]]></description>
										<content:encoded><![CDATA[<p>What is the most important change in the 4.x version of Apache Solr? I think there are many of them, but Solr Cloud is definitely something that changed a lot in the Solr architecture. Until now, bigger installations suffered from a single point of failure (SPOF) – there was only one master server, and when this server went down, the whole cluster lost the ability to receive new data. Of course you could go for multiple masters, where a single master was responsible for indexing some part of the data, but still, there was a SPOF present in your deployment. Even when everything worked, due to the commit interval and the fact that slave instances checked for the presence of new data periodically, the solution was far from ideal – the new data appeared in the cluster minutes after the commit.</p>
<p><span id="more-538"></span></p>
<p>Solr Cloud changed this behavior. In this article we will set up a new SolrCloud cluster from scratch and see how it works.</p>
<h2>Our example cluster</h2>
<p>In our example we will use three Solr servers. Every server in the cluster is capable of handling both index and query requests. This is the main difference from the old-fashioned Solr architecture with a single master and multiple slave servers. In the new architecture one additional element is present: Zookeeper, which is responsible for holding the configuration of the cluster and for synchronizing its work. It is crucial to understand that Solr relies on the information stored in Zookeeper – if Zookeeper fails, the whole cluster is useless. It is therefore very important to have a fault tolerant Zookeeper ensemble, which is why we use three independent instances of Zookeeper that will form the ensemble.</p>
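<p>The fault tolerance of the ensemble follows from simple arithmetic – a quorum needs a strict majority of the nodes, so an ensemble of three tolerates the loss of one instance. A small sketch of that calculation:</p>

```shell
# An ensemble of N Zookeeper nodes needs floor(N/2)+1 of them alive to
# keep a quorum; with N=3 (as in this article) one node may fail.
N=3
QUORUM=$(( N / 2 + 1 ))
TOLERATED=$(( N - QUORUM ))
echo "ensemble=$N quorum=$QUORUM tolerated_failures=$TOLERATED"
# prints: ensemble=3 quorum=2 tolerated_failures=1
```

This is also why ensembles are run with an odd number of nodes: going from three to four raises the quorum from 2 to 3 without tolerating any additional failures.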
<h2>Zookeeper installation</h2>
<p>As we said previously, Zookeeper is a vital part of a SolrCloud cluster. Although we can use the embedded Zookeeper, that is only handy for testing. For production you definitely want Zookeeper installed independently from Solr and running in a different Java virtual machine process, to avoid the two interrupting each other and influencing each other&#8217;s work.</p>
<p>The installation of Apache Zookeeper is straightforward and may be described by the following steps:</p>
<ol>
<li>Download Zookeeper archive from: <a href="http://www.apache.org/dyn/closer.cgi/zookeeper/" target="_blank" rel="noopener noreferrer">http://www.apache.org/dyn/closer.cgi/zookeeper/</a></li>
<li>Unpack downloaded archive and copy <em>conf/zoo_sample.cfg</em> to <em>conf/zoo.cfg</em></li>
<li>Modify <em>zoo.cfg</em>:
<ol>
<li>Change <em>dataDir</em> to directory where you want to hold all cluster configuration data</li>
<li>Add information about all Zookeeper servers (see below)</li>
</ol>
</li>
</ol>
<p>After the mentioned changes my <em>zoo.cfg</em> looks like this:
</p>
<pre class="brush:bash">tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zookeeper/data
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888</pre>
<ol>
<li>Copy this archive to all the servers where the Zookeeper service should run</li>
<li>Create the file <i>/var/zookeeper/data/myid</i> containing the server identifier. This identifier is different for each instance (for example, on <i>zk2</i> this file should contain the number <em>2</em>)</li>
<li>Start all instances using “bin/zkServer.sh start-foreground” and verify the validity of the installation</li>
<li>Add “bin/zkServer.sh start” to the startup scripts and make sure that the operating system monitors that the Zookeeper service is available.</li>
</ol>
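<p>The <em>dataDir</em> and <em>myid</em> steps above can be sketched as a couple of shell commands; a <em>/tmp</em> path is used here instead of <em>/var/zookeeper</em> only to keep the sketch runnable without root:</p>

```shell
# Create the data directory and the per-server myid file; on zk2 the file
# must contain exactly "2" (the path is shortened for this sketch).
DATA_DIR=/tmp/zk-demo/data
SERVER_ID=2
mkdir -p "$DATA_DIR"
echo "$SERVER_ID" > "$DATA_DIR/myid"
cat "$DATA_DIR/myid"
# prints: 2
```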
<h2>Solr installation</h2>
<p>The installation of Solr looks as follows:</p>
<ol>
<li>Download Solr archive from: <a href="http://www.apache.org/dyn/closer.cgi/lucene/solr/4.1.0" target="_blank" rel="noopener noreferrer">http://www.apache.org/dyn/closer.cgi/lucene/solr/4.1.0</a></li>
<li>Unpack downloaded archive</li>
<li>In this tutorial we will use the ready Solr installation from the <em>example</em> directory and all changes are made to this example installation</li>
<li>Copy archive to all servers which are the part of the cluster</li>
<li>Upload to Zookeeper the configuration data which will be used by the Solr cluster. To do this, run the first instance with:
<pre class="brush:bash">java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=solr1 -DzkHost=zk1:2181 -DnumShards=2 -jar start.jar</pre>
</li>
</ol>
<p>This should be run only once. Subsequent runs will use the configuration from the Zookeeper cluster, and the local configuration files are not needed.</p>
<ol>
<li>Run all instances using
<pre class="brush:bash">java –DzkHost=zk1:2181 –jar start.jar</pre>
</li>
</ol>
<h2>Verify the installation</h2>
<p>Go to the administration panel on any Solr instance. For our deployment the URL should be something like <i>http://solr1:8983/solr</i>. When you click on the Cloud tab and then on Graph, you should see something similar to the following screenshot:
</p>
<p style="text-align: center;"><a href="http://solr.pl/wp-content/uploads/2013/03/cloud.png"><img decoding="async" class="size-medium wp-image-2909 aligncenter" alt="cloud" src="http://solr.pl/wp-content/uploads/2013/03/cloud-300x138.png" width="300" height="138"></a></p>
<h2>Collection</h2>
<p>Our first collection &#8211; <em>collection1</em> &#8211; is divided into two shards (<em>shard1</em> and <em>shard2</em>). Each of those shards is placed on two Solr instances (OK, in the picture you see that every Solr instance is placed on the same host – I currently have only one physical server available for tests – any volunteers for a donation? ;)). The type of the dot tells us whether it is a primary shard or a replica.</p>
<h2>Summary</h2>
<p>I hope this is only the first note about SolrCloud. I know it is very short and skips details and information about shards, replicas and the architecture of this solution. Treat it as a simple checklist for a basic (but real) configuration of your cloud.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2013/03/11/solrcloud-howto/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
