{"id":980,"date":"2019-01-14T15:27:53","date_gmt":"2019-01-14T14:27:53","guid":{"rendered":"http:\/\/sematext.solr.pl\/?p=980"},"modified":"2020-11-14T15:28:22","modified_gmt":"2020-11-14T14:28:22","slug":"solrcloud-and-query-execution-control","status":"publish","type":"post","link":"https:\/\/solr.pl\/en\/2019\/01\/14\/solrcloud-and-query-execution-control\/","title":{"rendered":"SolrCloud and query execution control"},"content":{"rendered":"\n<p>With the release of Solr 7.0 and the introduction of new replica types in addition to the default NRT type, a question appeared &#8211; can we control where queries are executed? Can we tell Solr to execute the queries only on the PULL replicas or give TLOG replicas a priority? Let&#8217;s check that out.<\/p>\n\n\n\n<!--more-->\n\n\n\n<h2 class=\"wp-block-heading\">Shards parameter<\/h2>\n\n\n\n<p>The first control option that we have in SolrCloud is the <em>shards<\/em> parameter. Using it we can directly control which shards should be used for querying. For example, we can provide a logical shard name, several logical shard names, or the address of a specific node in our query:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">http:\/\/localhost:8983\/solr\/test\/select?q=*:*&amp;shards=shard1<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">http:\/\/localhost:8983\/solr\/test\/select?q=*:*&amp;shards=shard1,shard2,shard3<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">http:\/\/localhost:8983\/solr\/test\/select?q=*:*&amp;shards=localhost:6683\/solr\/test<\/code><\/pre>\n\n\n\n<p>The first of the above queries will be executed only on those shards that are grouped under the logical <em>shard1<\/em> name. 
The second query will be executed on logical <em>shard1<\/em>, <em>shard2<\/em> and <em>shard3<\/em>, while the third query will be executed on the shards that are deployed on the <em>localhost:6683<\/em> node in the <em>test<\/em> collection.<\/p>\n\n\n\n<p>It is also possible to load balance across instances by separating them with the <em>|<\/em> character, for example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">http:\/\/localhost:8983\/solr\/test\/select?q=*:*&amp;shards=localhost:6683\/solr\/test|localhost:7783\/solr\/test<\/code><\/pre>\n\n\n\n<p>The above query will be executed on the instance running on port <em>6683<\/em> or on the one running on port <em>7783<\/em>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Shards.preference parameter<\/h2>\n\n\n\n<p>While the <em>shards<\/em> parameter gives us some degree of control over where the query is executed, it is not exactly what we would like to have. To target a certain type of replica with it we would have to know the physical layout of the shards, and this is not something that we would like to do. Because of that the <em>shards.preference<\/em> parameter has been introduced to Solr. It allows us to tell Solr what type of replicas should have priority when executing a query. 
<\/p>\n\n\n\n<p>For example, to tell Solr that PULL type replicas should have priority when the query is executed, one should add the <em>shards.preference<\/em> parameter to the query and set it to <em>replica.type:PULL<\/em>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">http:\/\/localhost:8983\/solr\/test\/select?q=*:*&amp;shards.preference=replica.type:PULL<\/code><\/pre>\n\n\n\n<p>The nice thing is that we can tell Solr that PULL replicas should be used first and then, if they are not available, TLOG replicas should be used:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">http:\/\/localhost:8983\/solr\/test\/select?q=*:*&amp;shards.preference=replica.type:PULL,replica.type:TLOG<\/code><\/pre>\n\n\n\n<p>We can also define that PULL type replicas should be used first and, if they are not available, local shards should have the priority:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">http:\/\/localhost:8983\/solr\/test\/select?q=*:*&amp;shards.preference=replica.type:PULL,replica.location:local<\/code><\/pre>\n\n\n\n<p>In addition to the above example, we can also define priority based on the location of the replicas. For example, if our <em>192.168.1.1<\/em> Solr node is much more powerful than the others and we would like to prioritize PULL replicas first and then the mentioned Solr node, we would run the following query:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">http:\/\/localhost:8983\/solr\/test\/select?q=*:*&amp;shards.preference=replica.type:PULL,replica.location:http:\/\/192.168.1.1<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p>The discussed parameters, and <em>shards.preference<\/em> with its <em>replica.type<\/em> value in particular, can be very useful when we are using SolrCloud with different types of replicas. 
By telling Solr that we prefer PULL or TLOG replicas we can lower the query pressure on the NRT replicas and thus improve the performance of the whole cluster. What&#8217;s more &#8211; dividing the replicas can help us achieve query performance close to what the Solr master &#8211; slave architecture provides, without sacrificing all the goodies that come with SolrCloud itself.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>With the release of Solr 7.0 and introduction of new replica types, in addition to the default NRT type the question appeared &#8211; can we control the queries and where they are executed? Can we tell Solr to execute the<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[27],"tags":[613,614,165,164,509],"class_list":["post-980","post","type-post","status-publish","format-standard","hentry","category-solr-en","tag-control","tag-execution","tag-query-2","tag-solr-2","tag-solrcloud-2"],"_links":{"self":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/980","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/comments?post=980"}],"version-history":[{"count":1,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/980\/revisions"}],"predecessor-version":[{"id":981,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/980\/revisions\/981"}],"wp:attachment":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/media?parent=980"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/categories?post=980"},{"taxonom
y":"post_tag","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/tags?post=980"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}