Solr 4.1: SolrCloud – multiple shards on the same Solr node
We would like to discuss another new feature that will be part of the upcoming Solr 4.1: the ability to place more than one shard of a given collection on a single Solr instance. As you may know, this is currently not possible. So let's look at how this works by comparing Solr 4.1 to 4.0.
To illustrate how this feature works, I decided to look at what the process of creating a new collection looks like in both versions. We will use a single Solr instance and our collection will be built of two shards.
Solr 4.0
Solr.xml
What we need to do first is clean the solr.xml file, so it doesn’t have any information about cores. Our solr.xml file should look like this:
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
<cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
</cores>
</solr>
Starting Solr
Now we need to run a single Solr instance with embedded ZooKeeper. We do that by running the following command:
java -DzkRun -jar start.jar
Preparing configuration
Before creating our collection we need to send all the needed configuration files to ZooKeeper. Assuming that we have Solr installed in the /home/solrpl/solr/ directory and that our configuration files are stored in the /home/solrpl/configs/collection1/conf directory, I run the following script, which is distributed with Solr 4.0:
/home/solrpl/solr/cloud-scripts/zkcli.sh -cmd upconfig -zkhost localhost:9983 -confdir /home/solrpl/configs/collection1/conf/ -confname collection1
Creating the collection
Our configuration files should now be stored in ZooKeeper, so we can use the collections API to create our collection. To do that, we send a request to the /solr/admin/collections endpoint with the action=CREATE parameter, which tells Solr that we want to create a new collection. We also need to provide the name of the collection by adding the name=collection1 parameter. In addition, we inform Solr that we want our collection divided into two shards (numShards=2) and that we don’t want any replicas (replicationFactor=0). The full request looks like this:
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=0'
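The same request can also be assembled in a script. Below is a minimal Python sketch, assuming Solr listens on localhost:8983 as in the example above. The helper name build_create_url is hypothetical and not part of Solr; it only builds the URL and does not send it.

```python
from urllib.parse import urlencode

# Collections API endpoint used in this post (assumes the default port 8983).
COLLECTIONS_API = "http://localhost:8983/solr/admin/collections"


def build_create_url(name, num_shards, replication_factor, **extra):
    """Return the CREATE request URL for the collections API.

    Hypothetical helper: it only formats the query string. It does not
    contact Solr and does not validate the values.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    params.update(extra)  # additional parameters are appended as given
    return COLLECTIONS_API + "?" + urlencode(params)


if __name__ == "__main__":
    # Same values as the curl command above.
    print(build_create_url("collection1", 2, 0))
```

Run as a script, this should print the same URL that is passed to curl above.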
Solr administration panel view
If you repeat the above steps and look at the cloud view in the Solr administration panel, you will see something like this:
[Screenshot: cloud view in the Solr 4.0 administration panel]
Comments
As you can see Solr 4.0 didn’t place both shards on a single machine. The shard named shard1 was placed on the xubuntu-virtual node, but the one called shard2 was not assigned. Of course that would change if we had more nodes forming the cluster, but that’s not the point of this entry.
Solr 4.1
Solr.xml
Similar to what we did with Solr 4.0 we start with cleaning the solr.xml file, which should look like this:
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
<cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
</cores>
</solr>
Starting Solr
We do exactly the same when starting Solr 4.1, so we run the following command:
java -DzkRun -jar start.jar
Preparing configuration
Similar to what we did with Solr 4.0, we need to send our configuration to ZooKeeper. We do that by running exactly the same command as we did before:
/home/solrpl/solr/cloud-scripts/zkcli.sh -cmd upconfig -zkhost localhost:9983 -confdir /home/solrpl/configs/collection1/conf/ -confname collection1
Creating the collection
Creating our collection will be a bit different this time. We send the same values for the action, name and numShards parameters. However, we add a new parameter, maxShardsPerNode, which specifies the maximum number of shards that can be placed on a single Solr instance (by default this value is set to 1). In our case we want to have two shards on a single Solr node, so we set this parameter to 2. In addition, Solr forces us to have at least a single replica, so we need to set the replicationFactor parameter to 1. The whole request looks like this:
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=1&maxShardsPerNode=2'
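The error message that Solr 4.1 logs when a request does not fit (quoted in the comments below) suggests a simple rule: numShards multiplied by replicationFactor must not exceed maxShardsPerNode multiplied by the number of live nodes. The Python sketch below restates that arithmetic. It is an illustration inferred from the log message, not Solr's actual code, and the function name fits_on_cluster is hypothetical.

```python
def fits_on_cluster(num_shards, replication_factor, max_shards_per_node, live_nodes):
    """Return True if the requested collection fits on the live nodes.

    Assumption (inferred from the Solr 4.1 log message, not from its source):
    required = numShards * replicationFactor
    allowed  = maxShardsPerNode * number of live nodes
    """
    if replication_factor <= 0:
        # Solr 4.1 rejects such a request with "replicationFactor must be > 0".
        raise ValueError("replicationFactor must be > 0")
    required = num_shards * replication_factor
    allowed = max_shards_per_node * live_nodes
    return required <= allowed


if __name__ == "__main__":
    # The request from this post: two shards, one node, maxShardsPerNode=2.
    print(fits_on_cluster(2, 1, max_shards_per_node=2, live_nodes=1))
    # The same request with the default maxShardsPerNode=1.
    print(fits_on_cluster(2, 1, max_shards_per_node=1, live_nodes=1))
```

With the values used in this post the first check passes, while the default maxShardsPerNode=1 allows only one shard on our single node, so the second check fails.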
Solr administration panel view
After all the above steps, the cloud view in the Solr administration panel looks like this:
[Screenshot: cloud view in the Solr 4.1 administration panel]
Comments
As you can see, with Solr 4.1 we were able to create a collection built of two shards and place both of them on a single Solr node. So if you need this kind of functionality, you can wait for Solr 4.1 and be sure that it will work.

January 24th, 2013 at 21:50
Hi,
I followed your example precisely. But after running “Preparing Configuration” section of Solr4.1 I got the “Missing required parameter: name” error.
Do you know what causes it?
Thanks
January 24th, 2013 at 22:43
Sorry, there was a mistake in the examples (somehow the script that formats the code snippets didn’t want to show one parameter) and they are updated now. Just add the name=collection1 parameter to your curl command and it should work without any problems.
January 25th, 2013 at 03:39
There is actually a typo collection=collection1 should be name=collection1.
January 25th, 2013 at 03:40
Thanks gr0, and I realized that as well.
January 25th, 2013 at 08:15
You are right. Thanks.
January 25th, 2013 at 16:14
This morning, I tried to restart the server. But I got “java.net.BindException: Address already in use” error while binding to port 0.0.0.0/0.0.0.0:9983.
Any advice?
Thanks in advance.
January 25th, 2013 at 19:15
That should only happen when some application is already listening on that port. Please check that; maybe your Solr was already running?
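A quick way to check is to try connecting to the port. Here is a minimal Python sketch, assuming the ports used in this post (8983 for Solr and 9983 for the embedded ZooKeeper); port_in_use is a hypothetical helper name.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Ports assumed from the examples in this post.
    for port in (8983, 9983):
        print(port, "in use" if port_in_use(port) else "free")
```

If 9983 is reported as in use before you start Solr, an earlier Solr or ZooKeeper process is probably still running.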
January 28th, 2013 at 08:11
The “maxShardsPerNode” parameter doesn’t seem to work.
I use the default maxShardsPerNode = 1, but it still creates two shards in one instance…
Is this a bug?
please help!! email address chenlm20042004@163.com
January 28th, 2013 at 10:03
I think you are doing something wrong. When trying to create two shards on a single node with the following request:
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=1&maxShardsPerNode=1'
I get the following exception in Solr logs, which is expected:
SEVERE: Cannot create collection collection1. Value of maxShardsPerNode is 1, and the number of live nodes is 1. This allows a maximum of 1 to be created. Value of numShards is 2 and value of replicationFactor is 1. This requires 2 shards to be created (higher than the allowed number)
When trying to set the replicationFactor parameter to 0, it also says it’s not allowed:
SEVERE: replicationFactor must be > 0
Please check if you are doing everything right.
February 26th, 2013 at 07:29
Is it possible to manually balance the documents across the shards when I have a single instance and multiple shards? For example, by document size or document count?
April 17th, 2013 at 15:19
Hi gr0, nice writeup… Just a question. What are the benefits of having multiple shards per node? Does it improve performance?
Also, if I had a 3 node cluster, would the create command with 6 shards automatically put two on each node? How would replication be handled in this scenario?
April 17th, 2013 at 16:44
One of the benefits of creating more shards than the actual number of nodes is that in the future you’ll be able to move those shards to new nodes. Imagine a situation where we know that we will need more servers in the future, because the X servers that we now have will not be enough, and re-indexing everything is not an option. In such a case we can create more shards per node now and move them to new servers later.