dismax – Solr.pl

What can we use Dismax tie parameter for?

Rafał Kuć — Mon, 06 Feb 2012 22:40:53 +0000

Dismax query parser have been with Solr for a long time. Most of the time we use parameters like qf, pf or mm forgetting about a very useful parameter which allows us to control how the lower scoring fields are treated – about the tie parameter.

Tie

The tie parameter allows one to control how the lower scoring fields affects score for a given word. If we set the tie parameter to a 0.0 value, during the score calculation, only the fields that were scored highest will matter. However if we set it to 0.99 the fields scoring lower will have almost the same impact on the score as the highest scoring field. So let’s check if that actually works.

Data structure and data example

To test how the tie parameter works I’ve chosen a simple index structure which would describe products in e-commerce shop, of course in a simple mode:

The text_ws type was defined as follows:

The example documents look like this:


 
  1
  First test book
  This is a description of the first test book by Joe and Jane Blow
  Joe Blow
  Jane Blow
 
 
  2
  Second test book
  This is a description of the second test book by Joe Blow
  Joe Blow

Tie == 0.01 result

Let’s start the test. The first query was the following one:

defType=dismax&qf=title^1000 description author^10&tie=0.01&fl=id,score&debugQuery=on&indent=true&q=joe blow book

The above resulted in the following Solr results (visualization – http://explain.solr.pl/explains/cf0wnkpj):




  0
  8
  
    id,score
    on
    true
    0.01
    joe blow book
    title^1000 description author^10
    dismax
  


  
    0.07342677
    2
  
  
    0.073365316
    1
  


  joe blow book
  joe blow book
  +((DisjunctionMaxQuery((author:joe^10.0 | title:joe^1000.0 | description:joe)~0.01) DisjunctionMaxQuery((author:blow^10.0 | title:blow^1000.0 | description:blow)~0.01) DisjunctionMaxQuery((author:book^10.0 | title:book^1000.0 | description:book)~0.01))~3) ()
  +(((author:joe^10.0 | title:joe^1000.0 | description:joe)~0.01 (author:blow^10.0 | title:blow^1000.0 | description:blow)~0.01 (author:book^10.0 | title:book^1000.0 | description:book)~0.01)~3) ()
  
    
0.07342677 = (MATCH) sum of:
  0.07342677 = (MATCH) sum of:
    8.957935E-4 = (MATCH) max plus 0.01 times others of:
      8.9543534E-4 = (MATCH) weight(author:joe^10.0 in 1), product of:
        0.0024097771 = queryWeight(author:joe^10.0), product of:
          10.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.3715843 = (MATCH) fieldWeight(author:joe in 1), product of:
          1.0 = tf(termFreq(author:joe)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.625 = fieldNorm(field=author, doc=1)
      3.5817415E-5 = (MATCH) weight(description:joe in 1), product of:
        2.4097772E-4 = queryWeight(description:joe), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:joe in 1), product of:
          1.0 = tf(termFreq(description:joe)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=1)
    8.957935E-4 = (MATCH) max plus 0.01 times others of:
      8.9543534E-4 = (MATCH) weight(author:blow^10.0 in 1), product of:
        0.0024097771 = queryWeight(author:blow^10.0), product of:
          10.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.3715843 = (MATCH) fieldWeight(author:blow in 1), product of:
          1.0 = tf(termFreq(author:blow)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.625 = fieldNorm(field=author, doc=1)
      3.5817415E-5 = (MATCH) weight(description:blow in 1), product of:
        2.4097772E-4 = queryWeight(description:blow), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:blow in 1), product of:
          1.0 = tf(termFreq(description:blow)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=1)
    0.07163518 = (MATCH) max plus 0.01 times others of:
      0.07163482 = (MATCH) weight(title:book^1000.0 in 1), product of:
        0.2409777 = queryWeight(title:book^1000.0), product of:
          1000.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.29726744 = (MATCH) fieldWeight(title:book in 1), product of:
          1.0 = tf(termFreq(title:book)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.5 = fieldNorm(field=title, doc=1)
      3.5817415E-5 = (MATCH) weight(description:book in 1), product of:
        2.4097772E-4 = queryWeight(description:book), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:book in 1), product of:
          1.0 = tf(termFreq(description:book)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=1)

    
0.073365316 = (MATCH) sum of:
  0.073365316 = (MATCH) sum of:
    7.1670645E-4 = (MATCH) max plus 0.01 times others of:
      7.163483E-4 = (MATCH) weight(author:joe^10.0 in 0), product of:
        0.0024097771 = queryWeight(author:joe^10.0), product of:
          10.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.29726744 = (MATCH) fieldWeight(author:joe in 0), product of:
          1.0 = tf(termFreq(author:joe)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.5 = fieldNorm(field=author, doc=0)
      3.5817415E-5 = (MATCH) weight(description:joe in 0), product of:
        2.4097772E-4 = queryWeight(description:joe), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:joe in 0), product of:
          1.0 = tf(termFreq(description:joe)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=0)
    0.0010134276 = (MATCH) max plus 0.01 times others of:
      0.0010130694 = (MATCH) weight(author:blow^10.0 in 0), product of:
        0.0024097771 = queryWeight(author:blow^10.0), product of:
          10.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.42039964 = (MATCH) fieldWeight(author:blow in 0), product of:
          1.4142135 = tf(termFreq(author:blow)=2)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.5 = fieldNorm(field=author, doc=0)
      3.5817415E-5 = (MATCH) weight(description:blow in 0), product of:
        2.4097772E-4 = queryWeight(description:blow), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:blow in 0), product of:
          1.0 = tf(termFreq(description:blow)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=0)
    0.07163518 = (MATCH) max plus 0.01 times others of:
      0.07163482 = (MATCH) weight(title:book^1000.0 in 0), product of:
        0.2409777 = queryWeight(title:book^1000.0), product of:
          1000.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.29726744 = (MATCH) fieldWeight(title:book in 0), product of:
          1.0 = tf(termFreq(title:book)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.5 = fieldNorm(field=title, doc=0)
      3.5817415E-5 = (MATCH) weight(description:book in 0), product of:
        2.4097772E-4 = queryWeight(description:book), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0532142E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:book in 0), product of:
          1.0 = tf(termFreq(description:book)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=0)

First document

Second document

What can we say about that ?

As you can see, when we passed the 0.01 value to the tie parameter, only those fields that have the highest score for the given query word are most influential. Good example of that behavior is the book word in the first document on the results list. Score for that word is 0.07163518, which was calculated as the sum of the highest scored field (the title field) and the sum of the rest of the fields multiplied by tie.

Tie == 0.99 result

The second query sent to Solr looked as follows:

defType=dismax&qf=title^1000 description author^10&tie=0.99&fl=id,score&debugQuery=on&indent=true&q=joe blow book

Which resulted in the following Solr results: (visualization – http://explain.solr.pl/explains/1w7b06lv):




  0
  15
  
    id,score
    on
    true
    0.99
    joe blow book
    title^1000 description author^10
    dismax
  


  
    0.07352995
    2
  
  
    0.0734685
    1
  


  joe blow book
  joe blow book
  +((DisjunctionMaxQuery((author:joe^10.0 | title:joe^1000.0 | description:joe)~0.99) DisjunctionMaxQuery((author:blow^10.0 | title:blow^1000.0 | description:blow)~0.99) DisjunctionMaxQuery((author:book^10.0 | title:book^1000.0 | description:book)~0.99))~3) ()
  +(((author:joe^10.0 | title:joe^1000.0 | description:joe)~0.99 (author:blow^10.0 | title:blow^1000.0 | description:blow)~0.99 (author:book^10.0 | title:book^1000.0 | description:book)~0.99)~3) ()
  
    
0.07352995 = (MATCH) sum of:
  0.07352995 = (MATCH) sum of:
    9.308678E-4 = (MATCH) max plus 0.99 times others of:
      8.9540955E-4 = (MATCH) weight(author:joe^10.0 in 1), product of:
        0.0024097078 = queryWeight(author:joe^10.0), product of:
          10.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.3715843 = (MATCH) fieldWeight(author:joe in 1), product of:
          1.0 = tf(termFreq(author:joe)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.625 = fieldNorm(field=author, doc=1)
      3.581638E-5 = (MATCH) weight(description:joe in 1), product of:
        2.4097077E-4 = queryWeight(description:joe), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:joe in 1), product of:
          1.0 = tf(termFreq(description:joe)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=1)
    9.308678E-4 = (MATCH) max plus 0.99 times others of:
      8.9540955E-4 = (MATCH) weight(author:blow^10.0 in 1), product of:
        0.0024097078 = queryWeight(author:blow^10.0), product of:
          10.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.3715843 = (MATCH) fieldWeight(author:blow in 1), product of:
          1.0 = tf(termFreq(author:blow)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.625 = fieldNorm(field=author, doc=1)
      3.581638E-5 = (MATCH) weight(description:blow in 1), product of:
        2.4097077E-4 = queryWeight(description:blow), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:blow in 1), product of:
          1.0 = tf(termFreq(description:blow)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=1)
    0.071668215 = (MATCH) max plus 0.99 times others of:
      0.07163276 = (MATCH) weight(title:book^1000.0 in 1), product of:
        0.24097076 = queryWeight(title:book^1000.0), product of:
          1000.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.29726744 = (MATCH) fieldWeight(title:book in 1), product of:
          1.0 = tf(termFreq(title:book)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.5 = fieldNorm(field=title, doc=1)
      3.581638E-5 = (MATCH) weight(description:book in 1), product of:
        2.4097077E-4 = queryWeight(description:book), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:book in 1), product of:
          1.0 = tf(termFreq(description:book)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=1)

    
0.0734685 = (MATCH) sum of:
  0.0734685 = (MATCH) sum of:
    7.517859E-4 = (MATCH) max plus 0.99 times others of:
      7.1632763E-4 = (MATCH) weight(author:joe^10.0 in 0), product of:
        0.0024097078 = queryWeight(author:joe^10.0), product of:
          10.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.29726744 = (MATCH) fieldWeight(author:joe in 0), product of:
          1.0 = tf(termFreq(author:joe)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.5 = fieldNorm(field=author, doc=0)
      3.581638E-5 = (MATCH) weight(description:joe in 0), product of:
        2.4097077E-4 = queryWeight(description:joe), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:joe in 0), product of:
          1.0 = tf(termFreq(description:joe)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=0)
    0.0010484984 = (MATCH) max plus 0.99 times others of:
      0.0010130403 = (MATCH) weight(author:blow^10.0 in 0), product of:
        0.0024097078 = queryWeight(author:blow^10.0), product of:
          10.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.42039964 = (MATCH) fieldWeight(author:blow in 0), product of:
          1.4142135 = tf(termFreq(author:blow)=2)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.5 = fieldNorm(field=author, doc=0)
      3.581638E-5 = (MATCH) weight(description:blow in 0), product of:
        2.4097077E-4 = queryWeight(description:blow), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:blow in 0), product of:
          1.0 = tf(termFreq(description:blow)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=0)
    0.071668215 = (MATCH) max plus 0.99 times others of:
      0.07163276 = (MATCH) weight(title:book^1000.0 in 0), product of:
        0.24097076 = queryWeight(title:book^1000.0), product of:
          1000.0 = boost
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.29726744 = (MATCH) fieldWeight(title:book in 0), product of:
          1.0 = tf(termFreq(title:book)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.5 = fieldNorm(field=title, doc=0)
      3.581638E-5 = (MATCH) weight(description:book in 0), product of:
        2.4097077E-4 = queryWeight(description:book), product of:
          0.5945349 = idf(docFreq=2, maxDocs=2)
          4.0530972E-4 = queryNorm
        0.14863372 = (MATCH) fieldWeight(description:book in 0), product of:
          1.0 = tf(termFreq(description:book)=1)
          0.5945349 = idf(docFreq=2, maxDocs=2)
          0.25 = fieldNorm(field=description, doc=0)

First document

Second document

What can we say about that ?

As you can see the score of the result documents changed. Let’s have a look at the same document and the same book word. In the case, we sent 0.99 as the value of the tie parameter and the score value of that word increased comparing to the score when using tie of 0.01. Of course, the change is not only because the tie parameter but also because of normalization, but let’s forget about it for things to be simple So, in the second case, we see the score of 0.071668215, which is the score of the title field summed with the score of the other fields multiplied by 0.99 (tie parameter value).

To sum up

As you can see, the tie parameter allows us, to control how the score is calculated for the DisjunctionMaxQuery. In extreme cases, when we only want the highest scored fields to contribute to the total score we can set the tie parameter to 0.0. Tie lets us control, how we want the low scoring fields to be treated, when score of the documents is calculated and thus where they are on the results list we get from Solr when using Dismax query parser.

In case you are wodering

In case you are wondering what we used to show you the diagrams – please go to http://explain.solr.pl/help and see, maybe http://explain.solr.pl/ may be helpful in Your case.

Solr and PhraseQuery – phrase bonus in query stage

Rafał Kuć — Wed, 14 Jul 2010 09:19:38 +0000

In the majority of system implementations I dealt with, sooner or later, there was a problem – search results tunning. One of the simplest ways to improve the search results quality was phrase boosting. Having the three most popular query parsers in Solr and the variety of parameters to control them I though it will be a good idea to check how they behave and how they affect performance.

In the current trunk of Solr we have three query parsers:

Standard Solr Query Parser – default parser for Solr based on Lucene query parser
DisMax Query Parser
Extended DisMax Query Parser

Each of the mentioned query parsers have it`s own capabilities in case of phrase boosting on query stage. I won`t mention index time term proximity in this post – I`ll get back to it some other time. So, about the parsers now.

Standard Solr Query Parser

Parser based on Standard Lucene Query Parser and enhancing it`s parent capabilities. When it comes to phrase boosting, we don`t have much choice. Lets say, that our system is a search system for large Internet library, where users can rate books, leave comments and discuss books in the library forums. Our goal is to index all the data generated by the users and our suppliers and then represent this data in our search results. When user search for “Java design patterns” we want to show him the books that have those words in a document. No problem, lets make a Solr query like this:

q=java+design+patterns

So we get the results and we can say that our search engine is behaving well and we don`t want to improve search quality. But I would add another part to the query – part that would favor document which have a phrase (words given to the query are next to each other in the document) in the search-able fields. It`s an easy step, our modified query would look like this:
q=java+design+patterns+OR+"java+design+patterns"^30

By adding that additional query part (+OR+”java+design+patterns”^30) we modified our search results – by adding that part, on the first position in our result we now have books which have the exact phrase in the search fields. Lucene query generated by the parser look like that:

name:java name:design name:patterns PhraseQuery(name:"java design patterns"^30.0)
name:java name:design name:patterns name:"java design patterns"^30.0

Search results for above query as follows:




   0
   0
   
      java design patterns OR "java design patterns"^30
      score,id,name
   


   
      1.2399161
      1
      Java design patterns
   
   
      0.010219089
      2
      Design patterns java
   
   
      0.010219089
      3
      Design java patterns
   
   
      0.010219089
      4
      Patterns design java
   
   
      0.010219089
      5
      Patterns java design

DisMax Query Parser

In addition to constructing queries in such a manner as described above, we can use the parameter pf and modify its behavior by using the ps parameter. Pf parameter provide information about the fields in which phrases will be identified. Pf parameter is often used in a manner analogous to the parameter qf specifying a list of search-able fields. In addition to that, we must specify the boost parameter for the phrase otherwise the default boost will be taken into consideration. The query using DisMax would look like that:

q=java+design+patterns&defType=dismax&qf=name&pf=name^30&ps=0

While the query passed to Lucene looks as follows:

+((DisjunctionMaxQuery((name:java)) DisjunctionMaxQuery((name:design)) DisjunctionMaxQuery((name:patterns)))~3) DisjunctionMaxQuery((name:"java design patterns"^30.0))
+(((name:java) (name:design) (name:patterns))~3) (name:"java design patterns"^30.0)

The results for the query thus constructed are as follows:




   0
   0
   
      name^30
      id,name,score
      java design patterns
      name
      dismax
      0
   


   
      1.2399161
      1
      Java design patterns
   
   
      0.013625451
      2
      Design patterns java
   
   
      0.013625451
      3
      Design java patterns
   
   
      0.013625451
      4
      Patterns design java
   
   
      0.013625451
      5
      Patterns java design

It is noteworthy that the order of results for both methods is the same. This follows from the fact, that the phrase has been identified only in the document with the id of 1.Look that there is no difference in the value of score for the first document in both methods. Of course the other documents, located on positions from 2 to 5, are in both cases on the same positions, but have different score values because of the difference in query passed to Lucene.

But, I used the ps parameter (set to 0) and didn`t mention why I did it. When You use the pf (and pf2, but more on that later) parameter, the ps parameter mean Phrase Slop – a maximum distance of words from each other to form a phrase. For instance, ps=2 will mean that the words can be a maximum of two places from each other to form a phrase. Note, however, that despite the fact that both the “Java sample design patterns” and “Java design patterns” will create a phrase, but the document entitled “Java design patterns” will have a bigger score value, despite the settings ps=2, because of terms located closer together.

Extended DisMax Query Parser

Unfortunately without the use of trunk You can not use eDisMax. But, anyway, the query using eDisMax Enhanced Term Proximity Boosting would look like that:

q=java+design+patterns&defType=edismax&qf=name&pf2=name^30&ps=0

The above query creates the following query to Lucene:

+(DisjunctionMaxQuery((name:java)) DisjunctionMaxQuery((name:design)) DisjunctionMaxQuery((name:patterns))) (DisjunctionMaxQuery((name:"java design"^30.0)) DisjunctionMaxQuery((name:"design patterns"^30.0)))
+((name:java) (name:design) (name:patterns)) ((name:"java design"^30.0) (name:"design patterns"^30.0))

As seen, in addition to the standard DisjunctionMaxQuery produced by DisMax (and this its expanded version), extended DisMax parser also produced two additional queries – the ones responsible for enhanced term proximity boosting. The additional queries boosts pair of word created from the terms in the user query. In the presented case the created test pairs were “java design” and “design patterns”. As you can guess the most significant documents in the results list, documents will be generated by having both pairs, the next document will have one of the pair, and another will not have any. As proof I present the result of the above query send to Solr:




   0
   0
   
      id,name,score
      java design patterns
      name
      name^30
      edismax
      0
   


   
      1.1705827
      1
      Java design patterns
   
   
      0.3034844
      2
      Design patterns java
   
   
      0.3034844
      5
      Patterns java design
   
   
      0.014451639
      3
      Design java patterns
   
   
      0.014451639
      4
      Patterns design java

As you can see the first document has not changed its position. The second and third place are the documents that have one of the pairs generated by the parser. As a result documents with id 2 and 5 have the same coefficient score value. The result list is closed by the documents with only terms present in the search-able fields.

Performance

In any case, it must be taken into account that individual features will affect the performance of applications based on Solr. I thought I`ll do a simple performance test. The assumptions of the test are quite simple – index data from wikipedia and for each phrase boost method create five queries – each of the queries assembled from two to six tokens. Solr cache disabled, restart of Solr after each query. The result is the arithmetic mean of 10 repetitions of each test. Before the test results, a few words about the index:

Number of documents in the index: 1,177,239
Number of segments: 1
Number of terms: 18.506.646
Number of term/document pairs: 230.297.212
Number of tokens: 418.135.268
The size of the index: 4.6GB (optimized)
Lucene version used to build the index: 4.0-dev 964000

Phrases that were selected for each iteration of the test:

Iteration I: “Great Peter”
Iteration II: “World War Two”
Iteration III: “World War Two Germany”
Iteration IV: “Move Time Eastern Poland Reformation”
Iteration V: “Change Winter Cloths To Summer Cloths Now”

The results were as follows:

[table “1” not found /]

Please note that the reported results concern only the issue of performance and are not suggesting a method of phrase boosting. The choice of method is a matter of requirements and implementation. As for the results, you can see that the DisMax method is the quickest one.