Hierarchical faceting – Pivot facets in trunk

In many of the implementations I have taken part in, the same question arises sooner or later: how can we get faceting results as a tree structure? There are some tricks for that, but they required modifying the data and post-processing the results on the application side, which was neither particularly functional nor especially comfortable. A few days ago, however, Solr 4.0 was enhanced with the code tracked as SOLR-792 in JIRA. Let’s see how to use it to get faceting results as a tree.

Important note: at this point this functionality is only available in Solr 4.0, which is the development version. To use it, you need to download the code from the trunk of the Lucene/Solr SVN repository.

A few words at the beginning

In many of the projects I have worked on, there was a need for hierarchical faceting. One of the simplest examples is the requirement to show cities grouped by province, together with the number of documents in each province and in each city. Until recently, it was impossible to achieve this without changing the structure of the data. Now it is possible 😉

Indexing

To avoid complicating things unnecessarily, I decided to use the sample XML documents available in the /exampledocs directory of the example deployment. I also didn’t modify schema.xml or solrconfig.xml, so the configuration is standard. That’s all when it comes to configuration, so we can start the indexing process (I ran the command from the $SOLR_HOME/exampledocs/ directory):

After several screens of information, our data is indexed.

The mechanism

Hierarchical faceting is not difficult to use. The Solr creators gave us two parameters in addition to the ones we already know:

  • facet.pivot – a comma-separated list of fields that specifies which fields to use, and in what order, to calculate the structure,
  • facet.pivot.mincount – the minimum number of documents a value must match to be included in the faceting results. The default value is 1.
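As a rough illustration of how the two parameters fit into a request, here is a minimal Python sketch that builds a select URL. The host, port and handler path are assumptions based on a default example deployment, and build_pivot_url is a hypothetical helper name, not part of Solr.

```python
from urllib.parse import urlencode

# Assumption: a default example deployment listening on localhost:8983
# with the standard /select handler. Adjust to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_pivot_url(fields, mincount=1, query="*:*", base_url=SOLR_SELECT_URL):
    """Return a select URL requesting one pivot over the given fields.

    build_pivot_url is a hypothetical helper written for this post.
    """
    params = [
        ("q", query),
        ("facet", "true"),                        # faceting has to be switched on
        ("facet.pivot", ",".join(fields)),        # field order = order of the levels
        ("facet.pivot.mincount", str(mincount)),  # 1 is also Solr's default
    ]
    # urlencode percent-encodes the values, including the comma and "*:*"
    return base_url + "?" + urlencode(params)


url = build_pivot_url(["cat", "inStock"])
print(url)
```

The order of the names passed in matters: the first field becomes the top level of the tree, and each following field becomes a level nested below it.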

So let’s try it.

Queries

Let’s start with two fields. I query for all the documents in the index and add the parameter facet.pivot=cat,inStock to tell Solr that I want hierarchical faceting results where the first level of the hierarchy is the cat field and the second level is the inStock field. The query looks as follows:

To shorten the listing, I omitted the search results and the response header.

The presentation of faceting results is different in this case. Each entry at the top level has elements defining the field (the tag with the attribute name="field"), the value (the tag with the attribute name="value") and the number of documents (the tag with the attribute name="count"). Next comes the second level of the hierarchy (the tag with the attribute name="pivot"). The second level contains the same elements as the first: field, value and the number of documents with the given value.
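To make the structure described above more concrete, here is a minimal Python sketch that turns such a response fragment into a nested tree. The XML in SAMPLE is a hypothetical fragment with invented values and counts, not actual output from the example data, and its element names (lst, arr, str, int, bool) are assumed from Solr's usual XML response format. The parser itself relies only on the name attributes mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: values and counts are invented for illustration.
SAMPLE = """
<lst name="facet_pivot">
  <arr name="cat,inStock">
    <lst>
      <str name="field">cat</str>
      <str name="value">electronics</str>
      <int name="count">3</int>
      <arr name="pivot">
        <lst>
          <str name="field">inStock</str>
          <bool name="value">true</bool>
          <int name="count">2</int>
        </lst>
        <lst>
          <str name="field">inStock</str>
          <bool name="value">false</bool>
          <int name="count">1</int>
        </lst>
      </arr>
    </lst>
  </arr>
</lst>
"""


def read_pivot(entry):
    """Turn one pivot entry into a dict, recursing into name="pivot"."""
    node = {"children": []}
    for child in entry:
        name = child.get("name")
        if name == "pivot":
            # The nested level repeats the same field/value/count layout.
            node["children"] = [read_pivot(sub) for sub in child]
        elif name == "count":
            node["count"] = int(child.text)
        elif name in ("field", "value"):
            node[name] = child.text
    return node


def print_tree(nodes, depth=0):
    """Print the tree with two spaces of indentation per level."""
    for node in nodes:
        print("  " * depth + f'{node["field"]}={node["value"]} ({node["count"]})')
        print_tree(node["children"], depth + 1)


root = ET.fromstring(SAMPLE)
top_level = root.find("arr[@name='cat,inStock']")
tree = [read_pivot(entry) for entry in top_level]
print_tree(tree)
```

Because read_pivot calls itself for every name="pivot" element, the same code should handle deeper hierarchies without changes.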

Let’s see how this mechanism deals with more levels of depth. To check that, I ran the following query:

I omitted the response header and the search results, leaving only the faceting results. In addition, because the faceting results are long, I show only one first-level entry:

As the example shows, Solr had no problems calculating the hierarchy correctly in this case either. In terms of the data, this example is almost the same as the previous one; it only adds one more level of depth.
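To show what the counts at each level mean, here is a small Python sketch that computes the same kind of nested counts over an in-memory list of documents. It is only a conceptual illustration and not how Solr computes pivots internally. The pivot_counts function, the province and city field names and the sample documents are all hypothetical, and the sketch assumes single-valued fields.

```python
from collections import Counter


def pivot_counts(docs, fields, mincount=1):
    """Count documents per value of fields[0], then recurse into the rest.

    Conceptual sketch only; assumes every field holds a single value.
    """
    if not fields:
        return []
    field, rest = fields[0], fields[1:]
    counts = Counter(doc[field] for doc in docs if field in doc)
    result = []
    for value, count in counts.most_common():
        if count < mincount:
            continue  # same role as facet.pivot.mincount
        # Only the documents matching this value feed the next level.
        subset = [doc for doc in docs if doc.get(field) == value]
        result.append({
            "field": field,
            "value": value,
            "count": count,
            "pivot": pivot_counts(subset, rest, mincount),
        })
    return result


# Hypothetical documents, following the provinces and cities example.
DOCS = [
    {"province": "Mazowieckie", "city": "Warszawa"},
    {"province": "Mazowieckie", "city": "Warszawa"},
    {"province": "Mazowieckie", "city": "Radom"},
    {"province": "Podlaskie", "city": "Bialystok"},
]

tree = pivot_counts(DOCS, ["province", "city"])
for province in tree:
    print(province["value"], province["count"])
    for city in province["pivot"]:
        print("  ", city["value"], city["count"])
```

Each nested count is computed only over the documents that matched the parent value, so adding a third field to the list simply adds one more level of recursion.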

A few words at the end

In my opinion this is one of the more useful features for the “ordinary” user. Unfortunately, so far it is only available in the development version of Solr. I have not found any information about whether there are plans to port this functionality to Solr 1.5, which lives in the branch_3x branch in SVN. What matters, however, is that this functionality has been committed, and sooner or later Solr users will be able to use it.

11 thoughts on “Hierarchical faceting – Pivot facets in trunk”

  • 23 November 2010 at 11:37

    Is this code for SOLR-792 already available in the trunk of Solr 4.0?

    • 23 November 2010 at 11:57

      Yes, the code is already committed to trunk; I wrote this post using the trunk version of Solr. There have also been some changes to the functionality since the post was published.

  • 16 December 2010 at 16:10

    I have Solr 1.4.1 and I am trying pivots as given above without changing any configuration, but I don’t see any results for the pivot. Any clue?

  • 18 December 2010 at 13:16

    Vijay, as I wrote at the beginning of the post, pivot facets are not available in Solr 1.4.1; you need to use the trunk version of Solr from the SVN repository.

  • 15 February 2011 at 04:11

    I ran a normal facet query with q=*:* and facet=on&facet.field=stock&facet.filed=place&facet.field=quantity&facet.mincout=1

    The results I got are:

    10
    10
    10
    10

    10
    10

    10
    10

    Now I am running this facet.pivot query with the same q parameter (q=*:*) and the same data set.
    query: facet.pivot=stock,place,quality&facet.mincout=1

    The result I get is like this:

    The point is: why am I not getting the result hierarchy for “wheat” when it shows up in the flat faceting above?

  • 15 February 2011 at 04:14

    Sorry, the above post didn’t include the tags of my Solr results. Don’t know why!
    Each 10 had tags attached.
    Let me repost.

  • 15 February 2011 at 14:58

    What Solr version are you using? Remember that this feature requires Solr 4.0, so you need to get that version from the SVN repository.

  • 19 February 2011 at 05:46

    In your opinion, do you think it would be safe to base any software on this at this point? I have a project I’d love to bring to solr that’s being redone in the next 3 months. This feature would be critical to making it possible (without a ton of unnecessary extra facet queries, which would be performance death). I’m really tempted but if it’s only in the dev branch right now… it’s so hard to get a sense of when that means it will be production-ready! I can deal with it right now but is it worth my time when I need it to be released before May…?

  • 19 February 2011 at 15:10

    In my opinion, version 4.0 won’t be released before May. I personally haven’t used 4.0 in a production environment. But in my experience, we didn’t have any problems going into production on so-called “dev” branches; in fact, almost all the projects I’ve dealt with ran on development branches. It’s up to you, but if you are sure you can test the functionality before going live, and pivot facets are crucial for you, I would consider taking the 4.0 version. Just remember that upgrading between versions of Solr 4.0 may require a full reindex.
