Optimization – document cache

A few months ago we looked at filterCache. This time I’d like to continue the optimization topic and take a look at the documentCache.

What does it contain?

Let’s start with what the documentCache holds: Lucene documents that were fetched from the index. So little and so much.

What is it used for?

Every object (Lucene document) stored in documentCache contains a list of references to the fields that are stored with the document. Thanks to this, once a document has been fetched and put into the cache, it doesn’t have to be fetched from the index again while processing another query. This is why the number of I/O operations is reduced when rendering the query results list.

What to remember when using documentCache?

When using documentCache you have to remember two important things:

  1. documentCache can’t be autowarmed, because it operates on internal Lucene document identifiers, and those can change after a commit operation.
  2. If you use lazy field loading (enableLazyFieldLoading=true), documentCache functionality is somewhat limited. The document stored in the documentCache will contain only those fields that were passed in the fl parameter. If a later query asks for additional fields of a document that is already in the cache, those additional fields will be fetched from the index.
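For orientation, the lazy loading switch from point 2 is set in the query section of solrconfig.xml. A minimal fragment, shown only to locate the option (the value true is just the case discussed above, not a recommendation):

```xml
<!-- solrconfig.xml (fragment): lazy field loading is configured inside <query> -->
<query>
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
</query>
```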


The standard documentCache definition looks like this:
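A minimal sketch of such a definition, assuming the values used in the example solrconfig.xml shipped with Solr (512 entries is an illustrative default, not a sizing recommendation):

```xml
<!-- solrconfig.xml (fragment), placed inside the <query> section -->
<!-- Values assumed from the Solr example configuration; tune them for your own index -->
<documentCache
      class="solr.LRUCache"
      size="512"
      initialSize="512"/>
```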


Let’s recall those parameters:

  • class – class implementing the cache,
  • size – the maximum cache size,
  • initialSize – initial size of the cache.

How to configure it?

The usual question about a cache is: what size should I set? According to the Solr wiki (http://wiki.apache.org/solr/SolrCaching#documentCache), the maximum size shouldn’t be less than the product of the number of concurrent queries and the maximum number of documents fetched by a single query. This simple relation should ensure that Solr won’t have to fetch the same document from the index again while processing a query.
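As a worked example of that rule, assume (hypothetically) a peak of 20 concurrent queries, each returning at most 50 documents. The product is 20 × 50 = 1000, so a definition along these lines would meet the minimum. Both numbers are made up for illustration, and setting initialSize equal to size is also just an assumption:

```xml
<!-- Hypothetical load: 20 concurrent queries x 50 documents per query = 1000 entries -->
<documentCache
      class="solr.LRUCache"
      size="1000"
      initialSize="1000"/>
```

In practice you would plug in your own peak concurrency and the largest rows value your application sends.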

Last few words

In the case of documentCache we don’t have to worry about how we construct our queries in order to use this cache properly. But please remember that documentCache requires memory: the more stored fields your documents have, the more memory the cache will need.

This entry was posted on Monday, August 29th, 2011 at 08:13 and is filed under About Solr.

2 Responses to “Optimization – document cache”

  1. Mohit Says:

    A list of threshold value for different caching options will be great. Can you please provide some list here?

  2. gr0 Says:

    I’m afraid it’s not that easy to provide general guidance on what the document cache size should be set to. I would recommend observing how your Solr instance behaves: are there any evictions in this cache or not (this can be checked even on a staging environment, during or after performance testing). You can see the actual size of the cache and its overall metrics in the Solr administration pages.