Solr 7.0 – Point Type Fields

With Solr 7.0 just around the corner (RC2 voting was done this week), we decided to start sharing the important changes that are going to be released with the next major Solr version. The first thing we will cover is Point field types.

New field type

Software evolves: algorithms change, new features are added and data structures change as well. One of the changes introduced between certain versions of Lucene is the Point field types. Those field types were introduced in Lucene 6.x, but Solr had to wait until version 7 to get support for them.

Point type fields

The Point field types use so-called block K-D trees, which during indexing and searching use recursion to divide the value space into smaller, rectangular spaces to allow for faster searching. From Lucene's point of view the values are stored as byte arrays. This seems to be a very simple approach, but it is very efficient, as can be seen in various tests done by the committers and the companies behind them.

From Solr's point of view the configuration is very simple and looks as follows:

<fieldType name="pint" class="solr.IntPointField" docValues="true"/>
<fieldType name="pfloat" class="solr.FloatPointField" docValues="true"/>
<fieldType name="plong" class="solr.LongPointField" docValues="true"/>
<fieldType name="pdouble" class="solr.DoublePointField" docValues="true"/>

<fieldType name="pints" class="solr.IntPointField" docValues="true" multiValued="true"/>
<fieldType name="pfloats" class="solr.FloatPointField" docValues="true" multiValued="true"/>
<fieldType name="plongs" class="solr.LongPointField" docValues="true" multiValued="true"/>
<fieldType name="pdoubles" class="solr.DoublePointField" docValues="true" multiValued="true"/>

One thing that you may be wondering about in the above configuration is the doc values definition. Point based types need doc values if you would like to use functionality that would otherwise rely on FieldCache. The Point types do not support FieldCache, so we are forced to use doc values, which is not a bad thing in itself. We just have to remember that when using functions, faceting or sorting on Point based fields we need to turn on doc values.
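To make this more concrete, here is a minimal sketch of fields built on the types defined above. The field names (price, popularity, ratings) are made up for this example and do not come from any default schema. Because docValues is enabled at the field type level, fields like these should be usable for sorting, faceting and function queries.

```xml
<!-- Hypothetical fields; the names are illustrative only -->
<field name="price" type="pfloat" indexed="true" stored="true"/>
<field name="popularity" type="pint" indexed="true" stored="true"/>

<!-- docValues can also be stated explicitly on the field itself -->
<field name="ratings" type="pints" indexed="true" stored="true" docValues="true"/>
```

With a definition like this, a request such as q=*:*&sort=price asc&facet=true&facet.field=popularity is expected to work on doc values alone, without FieldCache.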

Trie type fields

In that case, what happens with the Trie based types? Exactly the same thing that happened to the legacy field types that were once part of Lucene and Solr: they were first deprecated and later removed. In Solr 7.0 the Trie based fields are marked as deprecated, which means that we may expect future Solr versions to remove them (probably in Solr 8.0). Of course, you don’t have to make any sudden moves, but if you plan on re-indexing in the future, consider using Point fields, which should give you a performance boost and prepare you for future Solr versions.
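As a rough sketch of what such a migration could look like in the schema, the fragment below shows a Trie based definition next to its Point based replacement. The tint type is written the way it commonly appears in pre-7.0 schemas, and the quantity field is a hypothetical name used only for illustration; your own schema will differ. Keep in mind that changing the type of an existing field requires a full re-index.

```xml
<!-- Before: deprecated Trie based type, as commonly seen in pre-7.0 schemas -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0" docValues="true"/>
<field name="quantity" type="tint" indexed="true" stored="true"/>

<!-- After: Point based type; precisionStep does not apply to Point fields -->
<fieldType name="pint" class="solr.IntPointField" docValues="true"/>
<field name="quantity" type="pint" indexed="true" stored="true"/>
```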

This entry was posted on Monday, September 4th, 2017 at 06:56 and is filed under About Lucene, About Solr, Lucene.