{"id":560,"date":"2013-07-08T14:00:43","date_gmt":"2013-07-08T12:00:43","guid":{"rendered":"http:\/\/sematext.solr.pl\/?p=560"},"modified":"2020-11-12T14:01:13","modified_gmt":"2020-11-12T13:01:13","slug":"automatically-generate-document-identifiers-solr-4-x","status":"publish","type":"post","link":"https:\/\/solr.pl\/en\/2013\/07\/08\/automatically-generate-document-identifiers-solr-4-x\/","title":{"rendered":"Automatically Generate Document Identifiers &#8211; Solr 4.x"},"content":{"rendered":"<p>A few days ago I got a question about automatically generating document identifiers in Solr 4.0, because the method used in Solr 3 was deprecated. Because of that we decided to write a quick post about how to have Solr 4.x generate unique document identifiers.<\/p>\n\n\n<!--more-->\n\n\n<h3>Data structure<\/h3>\n<p>Our simple data structure (the <em>fields<\/em> section of the <em>schema.xml<\/em> file) looks as follows:\n<\/p>\n<pre class=\"brush:xml\">&lt;fields&gt;\n &lt;field name=\"id\" type=\"string\" indexed=\"true\" stored=\"true\" required=\"true\" multiValued=\"false\" \/&gt;\n &lt;field name=\"name\" type=\"text_general\" indexed=\"true\" stored=\"true\"\/&gt;\n &lt;field name=\"_version_\" type=\"long\" indexed=\"true\" stored=\"true\"\/&gt;\n&lt;\/fields&gt;<\/pre>\n<p>In addition to that, we&#8217;ve specified which field should contain the unique identifier. 
This was also done in the <em>schema.xml<\/em> file:\n<\/p>\n<pre class=\"brush:xml\">&lt;uniqueKey&gt;id&lt;\/uniqueKey&gt;<\/pre>\n<h3>Solr configuration<\/h3>\n<p>In addition to the changes in the <em>schema.xml<\/em> file, we need to modify the <em>solrconfig.xml<\/em> file and introduce a proper <em>UpdateRequestProcessorChain<\/em>, like the following one:\n<\/p>\n<pre class=\"brush:xml\">&lt;updateRequestProcessorChain&gt;\n &lt;processor class=\"solr.UUIDUpdateProcessorFactory\"&gt;\n  &lt;str name=\"fieldName\"&gt;id&lt;\/str&gt;\n &lt;\/processor&gt;\n &lt;processor class=\"solr.LogUpdateProcessorFactory\" \/&gt;\n &lt;processor class=\"solr.RunUpdateProcessorFactory\" \/&gt;\n&lt;\/updateRequestProcessorChain&gt;<\/pre>\n<p>By doing this we inform Solr that we want the contents of the <em>id<\/em> field to be automatically generated.<\/p>\n<h3>A simple test<\/h3>\n<p>Let&#8217;s test what we did. To do that, we will index a simple document using the following command:\n<\/p>\n<pre class=\"brush:bash\">curl -XPOST 'localhost:8983\/solr\/update?commit=true' --data-binary '&lt;add&gt;&lt;doc&gt;&lt;field name=\"name\"&gt;Test&lt;\/field&gt;&lt;\/doc&gt;&lt;\/add&gt;' -H 'Content-type:application\/xml'<\/pre>\n<p>If everything went well, the above document was indexed. To check what happened, we will send a simple query and look at the results. 
We use the following command:\n<\/p>\n<pre class=\"brush:bash\">curl -XGET 'localhost:8983\/solr\/select?q=*:*&amp;indent=true'<\/pre>\n<p>The result returned by Solr for the above command is as follows:\n<\/p>\n<pre class=\"brush:xml\">&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;\n&lt;response&gt;\n &lt;lst name=\"responseHeader\"&gt;\n  &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n  &lt;int name=\"QTime\"&gt;0&lt;\/int&gt;\n  &lt;lst name=\"params\"&gt;\n   &lt;str name=\"indent\"&gt;true&lt;\/str&gt;\n   &lt;str name=\"q\"&gt;*:*&lt;\/str&gt;\n  &lt;\/lst&gt;\n &lt;\/lst&gt;\n &lt;result name=\"response\" numFound=\"1\" start=\"0\"&gt;\n  &lt;doc&gt;\n   &lt;str name=\"name\"&gt;Test&lt;\/str&gt;\n   &lt;str name=\"id\"&gt;1cdee8b4-c42d-4101-8301-4dc350a4d522&lt;\/str&gt;\n   &lt;long name=\"_version_\"&gt;1439726523307261952&lt;\/long&gt;\n  &lt;\/doc&gt;\n &lt;\/result&gt;\n&lt;\/response&gt;<\/pre>\n<p>As we can see, the unique identifier was automatically generated. 
Now, if we send the same indexing command once again:\n<\/p>\n<pre class=\"brush:bash\">curl -XPOST 'localhost:8983\/solr\/update?commit=true' --data-binary '&lt;add&gt;&lt;doc&gt;&lt;field name=\"name\"&gt;Test&lt;\/field&gt;&lt;\/doc&gt;&lt;\/add&gt;' -H 'Content-type:application\/xml'<\/pre>\n<p>And run the same query again:\n<\/p>\n<pre class=\"brush:bash\">curl -XGET 'localhost:8983\/solr\/select?q=*:*&amp;indent=true'<\/pre>\n<p>We will get two documents in the results, just like the following:\n<\/p>\n<pre class=\"brush:xml\">&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;\n&lt;response&gt;\n &lt;lst name=\"responseHeader\"&gt;\n  &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n  &lt;int name=\"QTime\"&gt;1&lt;\/int&gt;\n  &lt;lst name=\"params\"&gt;\n   &lt;str name=\"indent\"&gt;true&lt;\/str&gt;\n   &lt;str name=\"q\"&gt;*:*&lt;\/str&gt;\n  &lt;\/lst&gt;\n &lt;\/lst&gt;\n &lt;result name=\"response\" numFound=\"2\" start=\"0\"&gt;\n  &lt;doc&gt;\n   &lt;str name=\"name\"&gt;Test&lt;\/str&gt;\n   &lt;str name=\"id\"&gt;1cdee8b4-c42d-4101-8301-4dc350a4d522&lt;\/str&gt;\n   &lt;long name=\"_version_\"&gt;1439726523307261952&lt;\/long&gt;\n  &lt;\/doc&gt;\n  &lt;doc&gt;\n   &lt;str name=\"name\"&gt;Test&lt;\/str&gt;\n   &lt;str name=\"id\"&gt;9bedcb5f-1b71-4ab7-80a9-9882a6bf319e&lt;\/str&gt;\n   &lt;long name=\"_version_\"&gt;1439726693819351040&lt;\/long&gt;\n  &lt;\/doc&gt;\n &lt;\/result&gt;\n&lt;\/response&gt;<\/pre>\n<p>As you can see, the two documents above have different unique identifiers, so the functionality works.<\/p>","protected":false},"excerpt":{"rendered":"<p>A few days ago I got a question regarding the automatic identifiers of documents in Solr 4.0, because the method from Solr 3 was deprecated. 
Because of that we decided to write a quick post about how to use Solr<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[27],"tags":[521,520,164],"class_list":["post-560","post","type-post","status-publish","format-standard","hentry","category-solr-en","tag-automatic","tag-identifier","tag-solr-2"],"_links":{"self":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/560","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/comments?post=560"}],"version-history":[{"count":1,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/560\/revisions"}],"predecessor-version":[{"id":561,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/560\/revisions\/561"}],"wp:attachment":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/media?parent=560"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/categories?post=560"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/tags?post=560"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}