Solr and autocomplete (part 3)

In the previous parts of this series (part 1, part 2), we learned how to configure and query Solr to get autocomplete functionality. In today’s entry I will show you how to add a dictionary to the Suggester and thereby influence the generated suggestions.

Component configuration

To configure the component presented in the previous part of the series, add the following parameter:

Thus our configuration should look like this:

This parameter tells the component to use the dictionary named dict.txt, which should be placed in the Solr configuration directory.

Handler configuration

The handler configuration also gets one additional parameter which is:

So our configuration should be as follows:

This parameter tells Solr to return only those suggestions for which the number of results is greater than the number of results for the current query.

Dictionary

We told Solr to use the dictionary, but what should this dictionary look like? For the purpose of this post I defined the following dictionary:

What is the structure of a dictionary? Each phrase (or single word) is placed on a separate line. Each line ends with the weight of the phrase, separated from the phrase by a TAB character. The weight is used together with the spellcheck.onlyMorePopular=true parameter: the higher the weight, the higher the suggestion is ranked. The default weight is 1.0. The dictionary should be saved in UTF-8 encoding. Lines beginning with the # character are skipped.
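The file layout described above can be sketched as a small parser. This only illustrates the format and is not Solr's own loader: the function name parse_dictionary and the sample entries are hypothetical, and giving the default weight to a line without a TAB is an assumption based on the default value mentioned above.

```python
from typing import List, Tuple

DEFAULT_WEIGHT = 1.0  # default weight mentioned in the post


def parse_dictionary(text: str) -> List[Tuple[str, float]]:
    """Parse dictionary text into (phrase, weight) pairs."""
    entries = []
    for line in text.splitlines():
        # Lines starting with '#' are comments; skipping blank lines
        # as well is an assumption made for this sketch.
        if line.startswith("#") or not line.strip():
            continue
        # The weight, if present, follows the last TAB on the line.
        phrase, sep, weight = line.rpartition("\t")
        if not sep:
            # No TAB found: the whole line is the phrase.
            entries.append((line.strip(), DEFAULT_WEIGHT))
        else:
            entries.append((phrase.strip(), float(weight)))
    return entries


# Hypothetical sample, not the dictionary used in the post.
SAMPLE = (
    "# sample dictionary\n"
    "Hard drive\t100\n"
    "Hard drive samsung\t25\n"
    "Harley Davidson\n"
)

if __name__ == "__main__":
    for phrase, weight in parse_dictionary(SAMPLE):
        print(phrase, weight)
```

When reading a real file, open it with encoding="utf-8" so that it matches the encoding requirement above.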

Data

In this case we don’t need data – we will only use the defined dictionary.

Let’s check how it works

To check how our mechanism behaves, I sent the following query to Solr (after rebuilding the Suggester index, of course):

/suggest?q=Har

As a result we get the following:

A few words at the end

As you can see, the suggestions are sorted by weight, as expected. It is worth noting that the query began with a capital letter, which matters: a lowercased query returns an empty suggestion list.
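Both observations (ordering by weight and case sensitivity) can be mimicked with a plain in-memory lookup. This is a sketch of the observable behaviour only, not of how the Suggester works internally; the suggest function and the entries below are hypothetical.

```python
from typing import List, Tuple

# Hypothetical (phrase, weight) pairs, not the dictionary from the post.
ENTRIES: List[Tuple[str, float]] = [
    ("Hard drive samsung", 25.0),
    ("Harley Davidson", 1.0),
    ("Hard drive", 100.0),
]


def suggest(prefix: str, entries: List[Tuple[str, float]], limit: int = 5) -> List[str]:
    """Return phrases starting with prefix, highest weight first."""
    # str.startswith is case-sensitive, so "har" does not match "Hard drive".
    matches = [(phrase, weight) for phrase, weight in entries if phrase.startswith(prefix)]
    # Sort by weight, descending, to imitate the ordering seen in the response.
    matches.sort(key=lambda entry: entry[1], reverse=True)
    return [phrase for phrase, _ in matches[:limit]]


if __name__ == "__main__":
    print(suggest("Har", ENTRIES))
    print(suggest("har", ENTRIES))
```

With these entries, suggest("Har", ENTRIES) puts "Hard drive" first because it has the highest weight, while suggest("har", ENTRIES) returns an empty list.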

What can be said about this method? If you have very good dictionaries, with weights based on something like customer behavior, this is the method for you and your customers will love it. I would not recommend it if you don’t have good dictionaries, because there is a very high chance that your suggestions will be of poor quality.

What will be next?

The number of tasks this week didn’t let me finish the performance tests. That’s why, in the next part of the series, I’ll try to show you how each method behaves with various index structures and sizes.
