{"id":550,"date":"2013-05-06T13:57:20","date_gmt":"2013-05-06T11:57:20","guid":{"rendered":"http:\/\/sematext.solr.pl\/?p=550"},"modified":"2020-11-12T13:58:00","modified_gmt":"2020-11-12T12:58:00","slug":"solr-4-3-shard-splitting-quick-look","status":"publish","type":"post","link":"https:\/\/solr.pl\/en\/2013\/05\/06\/solr-4-3-shard-splitting-quick-look\/","title":{"rendered":"Solr 4.3: shard splitting quick look"},"content":{"rendered":"<p>With the release of Solr 4.3 we&#8217;ve got a long-awaited feature &#8211; we can now split shards of collections that have already been created and contain data (in a SolrCloud deployment). In this entry we would like to try that feature and see how it works. So let&#8217;s do it.<\/p>\n\n\n<!--more-->\n\n\n<h3>A few words before we try<\/h3>\n<p>Choosing the right number of shards for a collection is one of those decisions that has to be made before the final deployment. Until now, once a collection was created we couldn&#8217;t change the number of shards; we were only able to add more replicas. Of course that came with consequences &#8211; if we chose the number of shards wrong, we could end up with too few shards, and the only way out was to create a new collection with the proper number of shards and re-index our data. With the release of Apache Solr 4.3 we are now able to split shards of our collections.<\/p>\n<h3>Small cluster<\/h3>\n<p>In order to test the new shard split functionality I decided to run a small and simple cluster containing a single Solr instance with the embedded ZooKeeper and use the example collection provided with Solr. 
To achieve that I ran the following command:\n<\/p>\n<pre class=\"brush:bash\">java -Dbootstrap_confdir=.\/solr\/collection1\/conf -Dcollection.configName=collection1 -DzkRun -DnumShards=1 -DmaxShardsPerNode=2 -DreplicationFactor=1 -jar start.jar<\/pre>\n<p>After launching the mini cluster, its view was as follows:<\/p>\n<p><a href=\"http:\/\/solr.pl\/wp-content\/uploads\/2013\/05\/after_start1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3080\" alt=\"after_start\" src=\"http:\/\/solr.pl\/wp-content\/uploads\/2013\/05\/after_start1.png\" width=\"524\" height=\"27\"><\/a><\/p>\n<h3>Test data<\/h3>\n<p>As usual we need some data for tests, and I decided to use the example data provided with Solr. To index it I ran the following command in the <em>exampledocs<\/em> directory:\n<\/p>\n<pre class=\"brush:bash\">java -jar post.jar *.xml<\/pre>\n<p>The number of indexed documents was checked with the following command:\n<\/p>\n<pre class=\"brush:bash\">curl 'http:\/\/localhost:8983\/solr\/collection1\/select?q=*:*&amp;rows=0'<\/pre>\n<p>The response returned by Solr was as follows:\n<\/p>\n<pre class=\"brush:xml\">&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;\n&lt;response&gt;\n&lt;lst name=\"responseHeader\"&gt;\n  &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n  &lt;int name=\"QTime\"&gt;5&lt;\/int&gt;\n  &lt;lst name=\"params\"&gt;\n    &lt;str name=\"q\"&gt;*:*&lt;\/str&gt;\n    &lt;str name=\"rows\"&gt;0&lt;\/str&gt;\n  &lt;\/lst&gt;\n&lt;\/lst&gt;\n&lt;result name=\"response\" numFound=\"32\" start=\"0\"&gt;\n&lt;\/result&gt;\n&lt;\/response&gt;<\/pre>\n<p>As you can see we&#8217;ve got 32 documents in our collection.<\/p>\n<h3>Shard split<\/h3>\n<p>So now let&#8217;s try to divide the single shard our collection is built of. To do that we will use the Collections API and its new SPLITSHARD action. 
In its simplest form it takes two parameters &#8211; <em>collection<\/em>, which is the name of the collection we want to divide, and <em>shard<\/em>, which is the name of the shard we want to split. So in our case, the command that will split the shard looks like this:\n<\/p>\n<pre class=\"brush:bash\">curl 'http:\/\/localhost:8983\/solr\/admin\/collections?action=SPLITSHARD&amp;collection=collection1&amp;shard=shard1'<\/pre>\n<p>If everything ran without a problem, after a few seconds we will get a response from Solr that indicates the end of the process. The response will look more or less like this:\n<\/p>\n<pre class=\"brush:xml\">&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;\n&lt;response&gt;\n&lt;lst name=\"responseHeader\"&gt;\n  &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n  &lt;int name=\"QTime\"&gt;9220&lt;\/int&gt;\n&lt;\/lst&gt;\n&lt;lst name=\"success\"&gt;\n  &lt;lst&gt;\n    &lt;lst name=\"responseHeader\"&gt;\n      &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n      &lt;int name=\"QTime\"&gt;6963&lt;\/int&gt;\n    &lt;\/lst&gt;\n    &lt;str name=\"core\"&gt;collection1_shard1_1_replica1&lt;\/str&gt;\n    &lt;str name=\"saved\"&gt;\/home\/solr\/4.3\/solr\/solr.xml&lt;\/str&gt;\n  &lt;\/lst&gt;\n  &lt;lst&gt;\n    &lt;lst name=\"responseHeader\"&gt;\n      &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n      &lt;int name=\"QTime\"&gt;6977&lt;\/int&gt;\n    &lt;\/lst&gt;\n    &lt;str name=\"core\"&gt;collection1_shard1_0_replica1&lt;\/str&gt;\n    &lt;str name=\"saved\"&gt;\/home\/solr\/4.3\/solr\/solr.xml&lt;\/str&gt;\n  &lt;\/lst&gt;\n  &lt;lst&gt;\n    &lt;lst name=\"responseHeader\"&gt;\n      &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n      &lt;int name=\"QTime\"&gt;9005&lt;\/int&gt;\n    &lt;\/lst&gt;\n  &lt;\/lst&gt;\n  &lt;lst&gt;\n    &lt;lst name=\"responseHeader\"&gt;\n      &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n      &lt;int name=\"QTime\"&gt;9006&lt;\/int&gt;\n    &lt;\/lst&gt;\n  &lt;\/lst&gt;\n  &lt;lst&gt;\n    &lt;lst 
name=\"responseHeader\"&gt;\n      &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n      &lt;int name=\"QTime\"&gt;103&lt;\/int&gt;\n    &lt;\/lst&gt;\n  &lt;\/lst&gt;\n  &lt;lst&gt;\n    &lt;lst name=\"responseHeader\"&gt;\n      &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n      &lt;int name=\"QTime\"&gt;1&lt;\/int&gt;\n    &lt;\/lst&gt;\n    &lt;str name=\"core\"&gt;collection1_shard1_1_replica1&lt;\/str&gt;\n    &lt;str name=\"status\"&gt;EMPTY_BUFFER&lt;\/str&gt;\n  &lt;\/lst&gt;\n  &lt;lst&gt;\n    &lt;lst name=\"responseHeader\"&gt;\n      &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n      &lt;int name=\"QTime\"&gt;1&lt;\/int&gt;\n    &lt;\/lst&gt;\n    &lt;str name=\"core\"&gt;collection1_shard1_0_replica1&lt;\/str&gt;\n    &lt;str name=\"status\"&gt;EMPTY_BUFFER&lt;\/str&gt;\n  &lt;\/lst&gt;\n&lt;\/lst&gt;\n&lt;\/response&gt;<\/pre>\n<h3>Cluster after the split<\/h3>\n<p>After the split our cluster view will look like this:<\/p>\n<p><a href=\"http:\/\/solr.pl\/wp-content\/uploads\/2013\/05\/after_split.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3082\" alt=\"after_split\" src=\"http:\/\/solr.pl\/wp-content\/uploads\/2013\/05\/after_split.png\" width=\"520\" height=\"80\"><\/a><\/p>\n<p>As we can see, we have two new shards. In theory each of the new shards should contain a portion of the documents from the original <em>shard1<\/em> &#8211; some of the documents should be placed in <em>shard1_1<\/em> and some in <em>shard1_0<\/em>. 
Again, using the Solr administration panel we can check each of the cores (which are the actual shards):<\/p>\n<h4>Shard1_1<\/h4>\n<p>Statistics for the shard named <em>shard1_1<\/em> are as follows:<\/p>\n<p><a href=\"http:\/\/solr.pl\/wp-content\/uploads\/2013\/05\/shard_1_1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3083\" alt=\"shard_1_1\" src=\"http:\/\/solr.pl\/wp-content\/uploads\/2013\/05\/shard_1_1.png\" width=\"415\" height=\"210\"><\/a><\/p>\n<h4>Shard1_0<\/h4>\n<p>And the statistics for the shard named <em>shard1_0<\/em> are as follows:<\/p>\n<p><a href=\"http:\/\/solr.pl\/wp-content\/uploads\/2013\/05\/shard_1_0.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3084\" alt=\"shard_1_0\" src=\"http:\/\/solr.pl\/wp-content\/uploads\/2013\/05\/shard_1_0.png\" width=\"352\" height=\"207\"><\/a><\/p>\n<p>As you can see, we have 32 documents in total, the same number as in the original collection.<\/p>\n<h3>Cleaning up<\/h3>\n<p>I&#8217;ve left the cleaning up for the end. First of all, in order to see the data in the new shards we need to run a commit against our collection. For example, this can be done by using the following command:\n<\/p>\n<pre class=\"brush:bash\">curl 'http:\/\/localhost:8983\/solr\/collection1\/update' --data-binary '&lt;commit\/&gt;' -H 'Content-type:application\/xml'<\/pre>\n<p>In addition to that we can also remove the original shard, for example by using the Solr administration panel or the CoreAdmin API.<\/p>\n<h3>Final test<\/h3>\n<p>To wrap up, I decided to check whether the documents are available in the shards created by the SPLITSHARD action. 
In order to do that I&#8217;ve used the following command:\n<\/p>\n<pre class=\"brush:bash\">curl 'http:\/\/localhost:8983\/solr\/collection1\/select?q=*:*&amp;rows=100&amp;fl=id,[shard]&amp;indent=true'<\/pre>\n<p>And Solr responded in the following way:\n<\/p>\n<pre class=\"brush:xml\">&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;\n&lt;response&gt;\n&lt;lst name=\"responseHeader\"&gt;\n  &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n  &lt;int name=\"QTime\"&gt;7&lt;\/int&gt;\n  &lt;lst name=\"params\"&gt;\n    &lt;str name=\"fl\"&gt;id,[shard]&lt;\/str&gt;\n    &lt;str name=\"q\"&gt;*:*&lt;\/str&gt;\n    &lt;str name=\"rows\"&gt;100&lt;\/str&gt;\n  &lt;\/lst&gt;\n&lt;\/lst&gt;\n&lt;result name=\"response\" numFound=\"32\" start=\"0\" maxScore=\"1.0\"&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;GB18030TEST&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;IW-02&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;MA147LL\/A&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;adata&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;asus&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;belkin&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;maxtor&lt;\/str&gt;\n    &lt;str 
name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;TWINX2048-3200PRO&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;VS1GB400C3&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;VDBDB1A16&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;USD&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;GBP&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;3007WFP&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;EN7800GTX\/2DHTV\/256M&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_0_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;SP2514N&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;6H500F0&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;F8V7067-APL-KIT&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;apple&lt;\/str&gt;\n    
&lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;ati&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;canon&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;corsair&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;dell&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;samsung&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;viewsonic&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;EUR&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;NOK&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;VA902B&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;0579B002&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;9885A004&lt;\/str&gt;\n    &lt;str 
name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;SOLR1000&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;UTF8TEST&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"id\"&gt;100-435805&lt;\/str&gt;\n    &lt;str name=\"[shard]\"&gt;192.168.56.1:8983\/solr\/collection1_shard1_1_replica1\/&lt;\/str&gt;&lt;\/doc&gt;\n&lt;\/result&gt;\n&lt;\/response&gt;<\/pre>\n<p>As you can see, the documents came from both shards, which is again what we expected. Please remember that this is only a sample usage; we will certainly get back to the topic of shard splitting.<\/p>","protected":false},"excerpt":{"rendered":"<p>With the release of Solr 4.3 we&#8217;ve got a long-awaited feature &#8211; we can now split shards of collections that have already been created and contain data (in a SolrCloud deployment). 
In this entry we would like to try that<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[27],"tags":[511,510,164,512],"class_list":["post-550","post","type-post","status-publish","format-standard","hentry","category-solr-en","tag-shard-2","tag-shard-spliting-2","tag-solr-2","tag-split-2"],"_links":{"self":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/550","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/comments?post=550"}],"version-history":[{"count":1,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/550\/revisions"}],"predecessor-version":[{"id":551,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/550\/revisions\/551"}],"wp:attachment":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/media?parent=550"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/categories?post=550"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/tags?post=550"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}