Automatically Generate Document Identifiers – Solr 4.x

A few days ago I got a question about automatically generating document identifiers in Solr 4.0, because the method used in Solr 3 was deprecated. Because of that, we decided to write a quick post on how to have Solr generate unique document identifiers in Solr 4.x.

Data structure

Our simple data structure (fields section of the schema.xml file) looks as follows:
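A minimal sketch of such a fields section, assuming the string and text_general field types from the example schema shipped with Solr 4.x. The name field is only an illustrative assumption:

```xml
<!-- Sketch only: every field name other than "id" is an assumption -->
<fields>
  <!-- Holds the generated identifier; a plain string field can store a UUID -->
  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
  <!-- Illustrative content field -->
  <field name="name" type="text_general" indexed="true" stored="true" />
</fields>
```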

In addition to that, we've added the information about which field should contain the unique identifiers. This was also done in the schema.xml file:
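Assuming the identifier field is called id, as it is in the rest of this post, the declaration is a single element placed outside the fields section:

```xml
<!-- Tells Solr which field uniquely identifies a document -->
<uniqueKey>id</uniqueKey>
```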

Solr configuration

In addition to changes in the schema.xml file, we need to modify the solrconfig.xml file and introduce a proper UpdateRequestProcessorChain, like the following one:
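A sketch of such a chain, built on the UUIDUpdateProcessorFactory available since Solr 4.0. The chain name uuid is an assumption chosen for illustration:

```xml
<!-- Sketch only: the chain name "uuid" is an assumption -->
<updateRequestProcessorChain name="uuid">
  <!-- Fills the id field with a generated UUID when the document does not provide one -->
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <!-- Logs the update and then actually executes it; RunUpdateProcessorFactory must come last -->
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

A chain defined this way is not used until it is selected, for example by passing update.chain=uuid with the update request or by marking the chain with default="true".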

By doing this we inform Solr that we want the id field contents to be automatically generated.

A simple test

Let’s test what we did. In order to do that we will index a simple document by using the following command:
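As a sketch, assuming the XML update format, the hypothetical name field, and a chain named uuid, the document posted to the update handler (with update.chain=uuid and commit=true as request parameters) could be as small as this:

```xml
<add>
  <doc>
    <!-- No id field here: the update chain is expected to fill it in -->
    <field name="name">Test document</field>
  </doc>
</add>
```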

If everything went well, the above document was indexed. To check what happened, we will send a simple query and look at the results, using the following command:

The result returned by Solr for the above command is as follows:

As we can see, the unique identifier was automatically generated. Now, if we send the same indexing command once again:

And run the same query again:

We would get two documents in the results, just like the following:

As you can see, the two documents above have different unique identifiers, so each indexing request received its own generated identifier and the functionality works.

