{"id":215,"date":"2011-02-21T09:17:08","date_gmt":"2011-02-21T08:17:08","guid":{"rendered":"http:\/\/sematext.solr.pl\/?p=215"},"modified":"2020-11-11T09:17:41","modified_gmt":"2020-11-11T08:17:41","slug":"waiting-for-4-0-solr-2272-solr-and-join-functionality","status":"publish","type":"post","link":"https:\/\/solr.pl\/en\/2011\/02\/21\/waiting-for-4-0-solr-2272-solr-and-join-functionality\/","title":{"rendered":"Waiting for 4.0: SOLR-2272 &#8211; Solr and JOIN functionality"},"content":{"rendered":"<p>Recently,  my attention caught the functionality described in the <a href=\"https:\/\/issues.apache.org\/jira\/browse\/SOLR-2272\" target=\"_blank\" rel=\"noopener noreferrer\">SOLR-2272<\/a> Jira  ticket &#8211; functionality of SQL JOIN implemented in Solr. In today&#8217;s entry I take a look at this functionality.<\/p>\n\n\n<!--more-->\n\n\n<h3>Preparation<\/h3>\n<p>The  first thing we need to do is download the code from the main branch  (<em>trunk<\/em>) of Lucene\/Solr SVN repository and add the  the patch attached to  the SOLR-2272. After building Solr we can proceed to testing.<\/p>\n<h3>Index structure<\/h3>\n<p>For testing purposes, I decided to define the following index structure:\n<\/p>\n<pre class=\"brush:xml\">&lt;fields&gt;\n  &lt;field name=\"id\" type=\"string\" indexed=\"true\" stored=\"true\" required=\"true\" \/&gt;\n  &lt;field name=\"name\" type=\"text\" indexed=\"true\" stored=\"true\"\/&gt;\n  &lt;field name=\"color\" type=\"string\" indexed=\"true\" stored=\"true\"\/&gt;\n  &lt;field name=\"parent\" type=\"string\" indexed=\"true\" stored=\"true\"\/&gt;\n&lt;\/fields&gt;<\/pre>\n<p>The structure is rather simple. The <em>id<\/em> field is responsible for storing a unique identifier of the document. The  <em>name<\/em> field holds the name of the document, the <em>color<\/em> is the color of  the product, and the <em>parent<\/em> field is the identifier for the parent  document.<\/p>\n<h3>Test data<\/h3>\n<p>File with the sample data is as follows:\n<\/p>\n<pre class=\"brush:xml\">&lt;add&gt;\n &lt;doc&gt;\n  &lt;field name=\"id\"&gt;1&lt;\/field&gt;\n  &lt;field name=\"name\"&gt;First document&lt;\/field&gt;\n  &lt;field name=\"color\"&gt;Red&lt;\/field&gt;\n &lt;\/doc&gt;\n &lt;doc&gt;\n  &lt;field name=\"id\"&gt;2&lt;\/field&gt;\n  &lt;field name=\"name\"&gt;Second document&lt;\/field&gt;\n  &lt;field name=\"color\"&gt;Yellow&lt;\/field&gt;\n &lt;\/doc&gt;\n &lt;doc&gt;\n  &lt;field name=\"id\"&gt;3&lt;\/field&gt;\n  &lt;field name=\"name\"&gt;Third document&lt;\/field&gt;\n  &lt;field name=\"color\"&gt;Red&lt;\/field&gt;\n  &lt;field name=\"parent\"&gt;1&lt;\/field&gt;\n &lt;\/doc&gt;\n &lt;doc&gt;\n  &lt;field name=\"id\"&gt;4&lt;\/field&gt;\n  &lt;field name=\"name\"&gt;Fourth document&lt;\/field&gt;\n  &lt;field name=\"color\"&gt;Yellow&lt;\/field&gt;\n  &lt;field name=\"parent\"&gt;1&lt;\/field&gt;\n &lt;\/doc&gt;\n&lt;\/add&gt;<\/pre>\n<h3>Query example<\/h3>\n<p>For the short tests I conducted I&#8217;ve chosen a fairly simple case. I wanted to get documents that are the parents of documents with the value <em>Yellow <\/em>in the <em>color<\/em> field. Nothing  fancy, at least in the meaning of SQL. To execute this query in the  standard Solr we would have to make at least two queries &#8211; first to get  all documents with the appropriate value in the <em>color<\/em> field, and then their  parent documents. And how it looks when talking about patched Solr 4.0 ? Take a look at the following query:\n<\/p>\n<pre class=\"brush:xml\">http:\/\/localhost:8983\/solr\/select?q={!join from=parent to=id}color:Yellow<\/pre>\n<p>Response for the above query looks like this:\n<\/p>\n<pre class=\"brush:xml\">&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;\n&lt;response&gt;\n&lt;lst name=\"responseHeader\"&gt;\n  &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n  &lt;int name=\"QTime\"&gt;1&lt;\/int&gt;\n  &lt;lst name=\"params\"&gt;\n    &lt;str name=\"q\"&gt;{!join from=parent to=id}color:Yellow&lt;\/str&gt;\n  &lt;\/lst&gt;\n&lt;\/lst&gt;\n&lt;result name=\"response\" numFound=\"1\" start=\"0\"&gt;\n  &lt;doc&gt;\n    &lt;str name=\"color\"&gt;Red&lt;\/str&gt;\n    &lt;str name=\"id\"&gt;1&lt;\/str&gt;\n    &lt;str name=\"name\"&gt;First document&lt;\/str&gt;\n  &lt;\/doc&gt;\n&lt;\/result&gt;\n&lt;\/response&gt;<\/pre>\n<p>As you can see we got the results we wanted. But how does that query work ? As you can see in the example, apart from the standard query for a document with the given value in the field (<em>color:Yellow<\/em>) I&#8217;ve added a strange looking fragment <em>{!join from=parent to=id}<\/em>. This fragment tells Solr to show not the documents with the given field value, but those, which have the value in the <em>id<\/em> field the same as the <em>parent<\/em> field of the document that match the given query.<\/p>\n<h3>Verification<\/h3>\n<p>Let&#8217;s see if the result we got are the right ones. First let&#8217;s fetch the documents that have <em>Yellow<\/em> value in the <em>color<\/em> field. To do that, we send the following query to Solr:\n<\/p>\n<pre class=\"brush:xml\">http:\/\/localhost:8983\/solr\/select?q=color:Yellow<\/pre>\n<p>The response was as follows:\n<\/p>\n<pre class=\"brush:xml\">&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;\n&lt;response&gt;\n&lt;lst name=\"responseHeader\"&gt;\n  &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n  &lt;int name=\"QTime\"&gt;2&lt;\/int&gt;\n  &lt;lst name=\"params\"&gt;\n    &lt;str name=\"q\"&gt;color:Yellow&lt;\/str&gt;\n  &lt;\/lst&gt;\n&lt;\/lst&gt;\n&lt;result name=\"response\" numFound=\"2\" start=\"0\"&gt;\n  &lt;doc&gt;\n    &lt;str name=\"color\"&gt;Yellow&lt;\/str&gt;\n    &lt;str name=\"id\"&gt;2&lt;\/str&gt;\n    &lt;str name=\"name\"&gt;Second document&lt;\/str&gt;\n  &lt;\/doc&gt;\n  &lt;doc&gt;\n    &lt;str name=\"color\"&gt;Yellow&lt;\/str&gt;\n    &lt;str name=\"id\"&gt;4&lt;\/str&gt;\n    &lt;str name=\"name\"&gt;Fourth document&lt;\/str&gt;\n    &lt;str name=\"parent\"&gt;1&lt;\/str&gt;\n  &lt;\/doc&gt;\n&lt;\/result&gt;\n&lt;\/response&gt;<\/pre>\n<p>So we got one document which have a value in the <em>parent<\/em> field &#8211; this value is <em>1<\/em>. Document with <em>id <\/em>field equals to <em>1<\/em> is the following one:<\/p>\n<p><em> <\/em><em> <\/em>\n<\/p>\n<pre class=\"brush:xml\">&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;\n&lt;response&gt;\n&lt;lst name=\"responseHeader\"&gt;\n  &lt;int name=\"status\"&gt;0&lt;\/int&gt;\n  &lt;int name=\"QTime\"&gt;4&lt;\/int&gt;\n  &lt;lst name=\"params\"&gt;\n    &lt;str name=\"indent\"&gt;true&lt;\/str&gt;\n    &lt;str name=\"q\"&gt;id:1&lt;\/str&gt;\n  &lt;\/lst&gt;\n&lt;\/lst&gt;\n&lt;result name=\"response\" numFound=\"1\" start=\"0\"&gt;\n  &lt;doc&gt;\n    &lt;str name=\"color\"&gt;Red&lt;\/str&gt;\n    &lt;str name=\"id\"&gt;1&lt;\/str&gt;\n    &lt;str name=\"name\"&gt;First document&lt;\/str&gt;\n  &lt;\/doc&gt;\n&lt;\/result&gt;\n&lt;\/response&gt;<\/pre>\n<p>As you can see, the JOIN query worked correctly \ud83d\ude09<\/p>\n<h3>A few words at the end<\/h3>\n<p>Keep in mind that at the moment the functionality is under development and may change. The important this is that with Solr version 4.0 we will be able to enjoy the functionality similar to SQL JOIN.<\/p>","protected":false},"excerpt":{"rendered":"<p>Recently, my attention caught the functionality described in the SOLR-2272 Jira ticket &#8211; functionality of SQL JOIN implemented in Solr. In today&#8217;s entry I take a look at this functionality.<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[27],"tags":[305,306,164],"class_list":["post-215","post","type-post","status-publish","format-standard","hentry","category-solr-en","tag-2272-2","tag-join-2","tag-solr-2"],"_links":{"self":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/215","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/comments?post=215"}],"version-history":[{"count":1,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/215\/revisions"}],"predecessor-version":[{"id":216,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/posts\/215\/revisions\/216"}],"wp:attachment":[{"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/media?parent=215"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/categories?post=215"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/solr.pl\/en\/wp-json\/wp\/v2\/tags?post=215"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}