{"id":194,"date":"2011-02-07T09:08:27","date_gmt":"2011-02-07T08:08:27","guid":{"rendered":"http:\/\/sematext.solr.pl\/?p=194"},"modified":"2020-11-11T09:09:03","modified_gmt":"2020-11-11T08:09:03","slug":"optimization-filter-cache","status":"publish","type":"post","link":"https:\/\/solr.pl\/en\/2011\/02\/07\/optimization-filter-cache\/","title":{"rendered":"Optimization &#8211; filter cache"},"content":{"rendered":"<p>Today&#8217;s entry is dedicated to one type of cache in Solr &#8211; the filter cache. I will try to explain what it does, how to configure it and how to use it efficiently.<\/p>\n\n\n<!--more-->\n\n\n<h3>What does it store?<\/h3>\n<p>Let&#8217;s start from the inside. The <em>filterCache<\/em> stores unordered collections of document identifiers. Note that these are not the IDs defined as the unique key in the <em>schema.xml <\/em>file &#8211; Solr stores the internal document IDs used by Lucene and Solr, which is worth remembering.<\/p>\n<h3>What is it used for?<\/h3>\n<p>The main task of the <em>filterCache <\/em>is to keep the results of applying filters, although that is not its only use. In addition, the cache can support the faceting mechanism (when the <em>TermEnum<\/em> method is used) and sorting, when the <em>&lt;useFilterForSortedQuery\/&gt;<\/em> option is set to <em>true<\/em> in the <em>solrconfig.xml<\/em> file.<\/p>\n<h3>Definition<\/h3>\n<p>The standard <em>filterCache<\/em> definition is as follows:\n<\/p>\n<pre class=\"brush:xml\">&lt;filterCache\n      class=\"solr.FastLRUCache\"\n      size=\"16384\"\n      initialSize=\"4096\"\n      autowarmCount=\"4096\" \/&gt;<\/pre>\n<p>The following configuration options are available:<\/p>\n<ul>\n<li><em>class <\/em>&#8211; the class responsible for the implementation. 
For the <em>filterCache <\/em>we recommend using<em> solr.FastLRUCache<\/em>, which is more efficient when the number of GET operations is larger than the number of PUT operations.<\/li>\n<li><em>size <\/em>&#8211; the maximum number of entries that the cache can hold.<\/li>\n<li><em>initialSize <\/em>&#8211; the initial size of the cache.<\/li>\n<li><em>autowarmCount <\/em>&#8211; the number of entries that will be copied from the old cache to the new one during warm-up.<\/li>\n<li><em>minSize <\/em>&#8211; the number of entries to which Solr will try to reduce the cache once it is full.<\/li>\n<li><em>acceptableSize <\/em>&#8211; if Solr is not able to reduce the number of entries to the value specified by the <em>minSize<\/em> parameter, <em>acceptableSize <\/em>is the value it will try to reach instead.<\/li>\n<li><em>cleanupThread <\/em>&#8211; the default value is <em>false<\/em>. If set to <em>true<\/em>, a separate thread will be used to clean the cache.<\/li>\n<\/ul>\n<p>In most cases, setting the <em>size<\/em>, <em>initialSize <\/em>and <em>autowarmCount <\/em>parameters is quite sufficient.<\/p>\n<h3>How to configure it?<\/h3>\n<p>The size of the cache should be determined on the basis of the queries that are sent to Solr. The maximum size of the <em>filterCache<\/em> should be at least as large as the number of distinct filters (with values) that we use. This means that if your application uses, for example, 2000 distinct filters (<em>fq<\/em> parameters with values) in a given period of time, the <em>size<\/em> parameter should be set to at least 2000.<\/p>\n<h3>Efficient use<\/h3>\n<p>However, configuring the cache is not enough &#8211; we also need to build queries that are able to use it. Take the following query for example:\n<\/p>\n<pre class=\"brush:xml\">q=name:solr+AND+category:ksiazka+AND+section:ksiazki<\/pre>\n<p>At first glance, the query is correct. 
However, there is a problem &#8211; it does not use the <em>filterCache<\/em>. The entire request will be handled by the <em>queryResultCache <\/em>and will create a single entry in it. Let&#8217;s modify it a bit and send the following query:\n<\/p>\n<pre class=\"brush:xml\">q=name:solr&amp;fq=category:ksiazka&amp;fq=section:ksiazki<\/pre>\n<p>What happens now? As in the previous case, an entry will be created in the <em>queryResultCache<\/em>. Additionally, two entries will be created in the <em>filterCache<\/em>, one for each <em>fq<\/em> parameter. Now let&#8217;s look at the next query:\n<\/p>\n<pre class=\"brush:xml\">q=name:lucene&amp;fq=category:ksiazka&amp;fq=section:ksiazki<\/pre>\n<p>This query would create another entry in the <em>queryResultCache <\/em>and would reuse the two already existing entries in the <em>filterCache<\/em>. Thus the execution time of the query would be reduced and the query would be less demanding on I\/O.<\/p>\n<p>However, let&#8217;s look at the query in the following form:\n<\/p>\n<pre class=\"brush:xml\">q=name:lucene+AND+category:ksiazka+AND+section:ksiazki<\/pre>\n<p>Solr would not be able to use any information from the cache and would have to collect all the information needed for the results from the Lucene index.<\/p>\n<h3>Last few words<\/h3>\n<p>As you can see, configuring the cache correctly does not guarantee that Solr will be able to use it. The efficiency of the final deployment depends on how the queries are sent to Solr. This is worth remembering when planning an implementation.<\/p>","protected":false},"excerpt":{"rendered":"<p>Today&#8217;s entry is dedicated to one type of cache in Solr &#8211; the filter cache. 
I will try to explain what it does, how to configure it and how to use it in an efficient way.<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[27],"tags":[281,292,181,289,291,290,164],"class_list":["post-194","post","type-post","status-publish","format-standard","hentry","category-solr-en","tag-cache-2","tag-caching","tag-filter","tag-filter-cache-2","tag-filtercache-2","tag-filtering","tag-solr-2"],"_links":{"self":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/194","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/comments?post=194"}],"version-history":[{"count":1,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/194\/revisions"}],"predecessor-version":[{"id":195,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/194\/revisions\/195"}],"wp:attachment":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/media?parent=194"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/categories?post=194"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/tags?post=194"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}