{"id":156,"date":"2010-11-29T23:38:55","date_gmt":"2010-11-29T22:38:55","guid":{"rendered":"http:\/\/sematext.solr.pl\/?p=156"},"modified":"2020-11-10T23:39:44","modified_gmt":"2020-11-10T22:39:44","slug":"solr-and-autocomplete-part-3","status":"publish","type":"post","link":"https:\/\/solr.pl\/en\/2010\/11\/29\/solr-and-autocomplete-part-3\/","title":{"rendered":"Solr and autocomplete (part 3)"},"content":{"rendered":"<p>In the previous parts (<a href=\"http:\/\/solr.pl\/en\/2010\/10\/18\/solr-and-autocomplete-part-1\/\">part 1<\/a>, <a href=\"http:\/\/solr.pl\/en\/2010\/11\/15\/solr-and-autocomplete-part-2\/\">part 2<\/a>) of the series, we learned how to configure and query Solr to get the autocomplete functionality. In today&#8217;s entry I will show you how to add a dictionary to the Suggester and thus influence the generated suggestions.<\/p>\n\n\n<!--more-->\n\n\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-483\" title=\"Google Autocomplete\" src=\"http:\/\/solr.pl\/wp-content\/uploads\/2010\/10\/google_autocomplete2.png\" alt=\"\" width=\"641\" height=\"159\"><\/p>\n<h3>Component configuration<\/h3>\n<p>To use a dictionary, add the following parameter to the component presented in the previous part of the series:\n<\/p>\n<pre class=\"brush:xml\">&lt;str name=\"sourceLocation\"&gt;dict.txt&lt;\/str&gt;<\/pre>\n<p>Our configuration should then look like this:\n<\/p>\n<pre class=\"brush:xml\">&lt;searchComponent name=\"suggest\" class=\"solr.SpellCheckComponent\"&gt;\n &lt;lst name=\"spellchecker\"&gt;\n  &lt;str name=\"name\"&gt;suggest&lt;\/str&gt;\n  &lt;str name=\"classname\"&gt;org.apache.solr.spelling.suggest.Suggester&lt;\/str&gt;\n  &lt;str name=\"lookupImpl\"&gt;org.apache.solr.spelling.suggest.tst.TSTLookup&lt;\/str&gt;\n  &lt;str name=\"field\"&gt;name_autocomplete&lt;\/str&gt;\n  &lt;str name=\"sourceLocation\"&gt;dict.txt&lt;\/str&gt;\n &lt;\/lst&gt;\n&lt;\/searchComponent&gt;<\/pre>\n<p>With this parameter we informed 
the component to use the dictionary named <em>dict.txt<\/em>, which should be placed in the Solr configuration directory.<\/p>\n<h3>Handler configuration<\/h3>\n<p>The handler configuration also gets one additional parameter:\n<\/p>\n<pre class=\"brush:xml\">&lt;str name=\"spellcheck.onlyMorePopular\"&gt;true&lt;\/str&gt;<\/pre>\n<p>So our configuration should be as follows:\n<\/p>\n<pre class=\"brush:xml\">&lt;requestHandler name=\"\/suggest\" class=\"org.apache.solr.handler.component.SearchHandler\"&gt;\n &lt;lst name=\"defaults\"&gt;\n  &lt;str name=\"spellcheck\"&gt;true&lt;\/str&gt;\n  &lt;str name=\"spellcheck.dictionary\"&gt;suggest&lt;\/str&gt;\n  &lt;str name=\"spellcheck.count\"&gt;10&lt;\/str&gt;\n  &lt;str name=\"spellcheck.onlyMorePopular\"&gt;true&lt;\/str&gt;\n &lt;\/lst&gt;\n &lt;arr name=\"components\"&gt;\n  &lt;str&gt;suggest&lt;\/str&gt;\n &lt;\/arr&gt;\n&lt;\/requestHandler&gt;<\/pre>\n<p>This parameter tells Solr to return only those suggestions for which the number of results is greater than the number of results for the current query.<\/p>\n<h3>Dictionary<\/h3>\n<p>We told Solr to use a dictionary, but what should this dictionary look like? For the purpose of this post I defined the following dictionary:\n<\/p>\n<pre class=\"brush:plain\"># sample dict\nHard disk hitachi\nHard disk wd    2.0\nHard disk jjdd    3.0<\/pre>\n<p>How is the dictionary structured? Each phrase (or single word) is placed on a separate line. A line may end with the weight of the phrase, separated from the phrase by a TAB character; the weight is used together with the parameter <em>spellcheck.onlyMorePopular=true<\/em> (the higher the weight, the higher the suggestion is ranked). The default weight value is 1.0. The dictionary should be saved in UTF-8 encoding. 
Lines beginning with the # character are skipped.<\/p>\n<h3>Data<\/h3>\n<p>In this case we don&#8217;t need data &#8211; we will only use the defined dictionary.<\/p>\n<h3>Let&#8217;s check how it works<\/h3>\n<p>To check how the mechanism behaves, I sent the following query to Solr (after rebuilding the Suggester index, of course):<\/p>\n<p><code>\/suggest?q=Har<\/code><\/p>\n<p>As a result we get the following:\n<\/p>\n<pre class=\"brush:xml\">&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;\n&lt;response&gt;\n&lt;lst name=\"responseHeader\"&gt;\n  &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n  &lt;int name=\"QTime\"&gt;0&lt;\/int&gt;\n&lt;\/lst&gt;\n&lt;lst name=\"spellcheck\"&gt;\n  &lt;lst name=\"suggestions\"&gt;\n    &lt;lst name=\"Har\"&gt;\n      &lt;int name=\"numFound\"&gt;3&lt;\/int&gt;\n      &lt;int name=\"startOffset\"&gt;0&lt;\/int&gt;\n      &lt;int name=\"endOffset\"&gt;3&lt;\/int&gt;\n      &lt;arr name=\"suggestion\"&gt;\n        &lt;str&gt;Hard disk jjdd&lt;\/str&gt;\n        &lt;str&gt;Hard disk hitachi&lt;\/str&gt;\n        &lt;str&gt;Hard disk wd&lt;\/str&gt;\n     &lt;\/arr&gt;\n    &lt;\/lst&gt;\n  &lt;\/lst&gt;\n&lt;\/lst&gt;\n&lt;\/response&gt;\n<\/pre>\n<h3>A few words at the end<\/h3>\n<p>As you can see, the suggestions are sorted by weight, as expected. It is worth noting that the query started with a capital letter, which matters &#8211; a lowercased query will return an empty suggestion list.<\/p>\n<p>What can be said about this method? If you have very good dictionaries with weights generated from, for example, customer behavior, this is the method for you and your customers will love it. 
I would not recommend it if you don&#8217;t have good dictionaries &#8211; there is a very high chance that your suggestions will be of poor quality.<\/p>\n<h3>What will be next?<\/h3>\n<p>My workload this week didn&#8217;t let me finish the performance tests, so in the next part of the series I&#8217;ll try to show you how each method behaves with various index structures and sizes.<\/p>","protected":false},"excerpt":{"rendered":"<p>In the previous parts (part 1, part 2) of the series, we learned how to configure and query Solr to get the autocomplete functionality. In today&#8217;s entry I will show you how to add a dictionary to the Suggester, and<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[27],"tags":[229,249,164,250,248,252],"class_list":["post-156","post","type-post","status-publish","format-standard","hentry","category-solr-en","tag-autocomplete","tag-component-2","tag-solr-2","tag-suggest-2","tag-suggest-component-2","tag-suggester-2"],"_links":{"self":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/156","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/comments?post=156"}],"version-history":[{"count":1,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/156\/revisions"}],"predecessor-version":[{"id":157,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/156\/revisions\/157"}],"wp:attachment":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/media?parent=156"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp
-json\/wp\/v2\/categories?post=156"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/tags?post=156"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}