<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>handler &#8211; Solr.pl</title>
	<atom:link href="https://solr.pl/en/tag/handler/feed/" rel="self" type="application/rss+xml" />
	<link>https://solr.pl/en/</link>
	<description>All things to be found - Blog related to Apache Solr &#38; Lucene projects - https://solr.apache.org</description>
	<lastBuildDate>Wed, 11 Nov 2020 22:51:42 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9</generator>
	<item>
		<title>Backing Up Your Index</title>
		<link>https://solr.pl/en/2012/08/13/backing-up-your-index/</link>
					<comments>https://solr.pl/en/2012/08/13/backing-up-your-index/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Mon, 13 Aug 2012 21:51:12 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[backup]]></category>
		<category><![CDATA[handler]]></category>
		<category><![CDATA[index]]></category>
		<category><![CDATA[replication]]></category>
		<guid isPermaLink="false">http://sematext.solr.pl/?p=472</guid>

					<description><![CDATA[Did you ever wonder if you can create a backup of your index with the tools available in Solr? For example after every commit or optimize operation? Or maybe you would like to create backups with the HTTP]]></description>
										<content:encoded><![CDATA[<p>Did you ever wonder if you can create a backup of your index with the tools available in Solr? For example after every <em>commit</em> or <em>optimize</em> operation? Or maybe you would like to create backups with an HTTP API call? Let&#8217;s see what possibilities Solr has to offer.</p>
<p><span id="more-472"></span></p>
<h3>The Beginning</h3>
<p>We decided to write about index backups even though this functionality is fairly simple. We noticed that many people tend to forget about it, not only when it comes to Apache Solr. We hope that this blog entry will help you remember the backup functionality when you need it. But now, let&#8217;s start from the beginning &#8211; before we started the tests, we looked at the directory where Solr keeps its indices and this is what we saw:
</p>
<pre class="brush:bash">drwxrwxr-x 2 gr0 gr0 4096 2012-08-12 20:17 index
drwxrwxr-x 2 gr0 gr0 4096 2012-08-12 20:16 spellchecker</pre>
<h3>Manual Backup</h3>
<p>In order to create a backup of your index with the HTTP API, you need to have the replication handler configured. If you do, send the <em>command</em> parameter with the <em>backup</em> value to the master server&#8217;s replication handler, for example like this:
</p>
<pre class="brush:bash">curl 'http://localhost:8983/solr/replication?command=backup'</pre>
<p>The above will tell Solr to create a new backup of the current index. Let&#8217;s now see what the directory where the indices live looks like after running the above command:
</p>
<pre class="brush:bash">drwxrwxr-x 2 gr0 gr0 4096 2012-08-12 20:18 index
drwxrwxr-x 2 gr0 gr0 4096 2012-08-12 20:19 snapshot.20120812201917
drwxrwxr-x 2 gr0 gr0 4096 2012-08-12 20:16 spellchecker</pre>
<p>As you can see, there is a new directory &#8211; <em>snapshot.20120812201917</em>. We can assume that we got what we wanted 🙂</p>
<h3>Automatic Backup</h3>
<p>In addition to manual backup creation, you can also configure Solr to create backups after each <em>commit</em> or <em>optimize</em> operation. Please remember, though, that if your index is changing rapidly it is usually a bad idea to create a backup after every commit. But let&#8217;s get back to automatic backups. In order to configure Solr to create backups for us, you need to add the following line to the replication handler configuration:
</p>
<pre class="brush:xml">&lt;str name="backupAfter"&gt;commit&lt;/str&gt;</pre>
<p>So, the full replication handler configuration (on the <em>master</em> server) would look like this:
</p>
<pre class="brush:xml">&lt;requestHandler name="/replication" &gt;
 &lt;lst name="master"&gt;
  &lt;str name="replicateAfter"&gt;commit&lt;/str&gt;
  &lt;str name="replicateAfter"&gt;startup&lt;/str&gt;
  &lt;str name="confFiles"&gt;schema.xml,stopwords.txt&lt;/str&gt;
  &lt;str name="backupAfter"&gt;commit&lt;/str&gt;
 &lt;/lst&gt;
&lt;/requestHandler&gt;</pre>
<p>After sending two <em>commit</em> operations, our directory with indices looks like this:
</p>
<pre class="brush:bash">drwxrwxr-x 2 gr0 gr0 4096 2012-08-12 21:12 index
drwxrwxr-x 2 gr0 gr0 4096 2012-08-12 21:12 snapshot.20120812211203
drwxrwxr-x 2 gr0 gr0 4096 2012-08-12 21:12 snapshot.20120812211216
drwxrwxr-x 2 gr0 gr0 4096 2012-08-12 20:16 spellchecker</pre>
<p>As you can see, Solr did what we asked for.</p>
<h3>Keeping Order</h3>
<p>It is possible to control the maximum number of backups that should be stored on disk. In order to configure that number, you need to add the following line to your replication handler configuration:
</p>
<pre class="brush:xml">&lt;str name="maxNumberOfBackups"&gt;10&lt;/str&gt;</pre>
<p>The above configuration value tells Solr to keep a maximum of ten backups of your index. Of course, you can also delete created backups manually if you don&#8217;t need them anymore.</p>
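If you prune old backups by hand instead, note that the snapshot.YYYYMMDDHHMMSS naming sorts chronologically, so keeping only the newest N snapshots is a one-liner. A minimal sketch, using a scratch directory as a stand-in for your real index directory (the `head -n -N` form assumes GNU coreutils):

```shell
# Create a scratch "index" directory with a few fake snapshots,
# standing in for the real Solr data directory shown above.
DATA_DIR=$(mktemp -d)
KEEP=3
for ts in 20120812201917 20120812211203 20120812211216 20120813090000 20120813100000; do
  mkdir "$DATA_DIR/snapshot.$ts"
done

# snapshot.YYYYMMDDHHMMSS names sort chronologically, so a plain sort
# lists the oldest first; remove everything except the newest $KEEP.
cd "$DATA_DIR" || exit 1
ls -d snapshot.* | sort | head -n -"$KEEP" | xargs -r rm -rf

ls -d snapshot.*   # only the three newest snapshots remain
```

The same pattern works from cron if you want retention independent of Solr's own maxNumberOfBackups setting.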
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2012/08/13/backing-up-your-index/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Solr 3.1: JSON Update Handler</title>
		<link>https://solr.pl/en/2011/04/18/solr-3-1-json-update-handler/</link>
					<comments>https://solr.pl/en/2011/04/18/solr-3-1-json-update-handler/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Mon, 18 Apr 2011 18:42:09 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[file format]]></category>
		<category><![CDATA[handler]]></category>
		<category><![CDATA[json]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[update]]></category>
		<category><![CDATA[update handler]]></category>
		<guid isPermaLink="false">http://sematext.solr.pl/?p=260</guid>

					<description><![CDATA[After the release of Solr 3.1 I decided to look into the extended list of formats through which we can update the indexes. Until now we had a choice of three kinds of formats with which we were able to]]></description>
										<content:encoded><![CDATA[<p>After the release of <a href="http://solr.pl/en/2011/03/31/lucene-and-solr-3-1/" target="_blank" rel="noopener noreferrer">Solr 3.1</a> I decided to look into the extended list of formats through which we can update the indexes. Until now we had a choice of three formats with which we were able to provide data &#8211; XML, CSV, and the so-called JavaBin. The release of Solr 3.1 introduces a fourth format &#8211; JSON.</p>
<p><span id="more-260"></span></p>
<h3>Let&#8217;s start</h3>
<p>The new handler (<em>JsonUpdateRequestHandler</em>) allows us to transfer data in the JSON format, which in theory should translate into a smaller amount of data sent over the network and a speedup of indexing, as a JSON parser is theoretically faster than an XML parser. But let&#8217;s leave performance aside for now.</p>
<h3>Configuration</h3>
<p>Let&#8217;s start by defining a handler. To do that, add the following definition to the <em>solrconfig.xml</em> file (if you use the default solrconfig.xml file provided with Solr 3.1, then this handler is already defined):
</p>
<pre class="brush:xml">&lt;requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler" startup="lazy" /&gt;</pre>
<p>The entry above defines a new handler that will be initialized when it is used for the first time (<em>startup="lazy"</em>).</p>
<h3>Indexing</h3>
<p>The next step is to prepare the data &#8211; of course in JSON format. Here&#8217;s an example showing two documents in one file called <em>data.json</em>:
</p>
<pre class="brush:plain">{
"add": {
  "doc": {
    "id": "123456788",
    "region": ["abc","def"],
    "name": "ABCDEF"
  }
},
"add": {
  "doc": {
    "id": "123456789",
    "region": ["abc","def"],
    "name": "XYZMN"
  }
}
}</pre>
<p>A file prepared this way can be sent to the <em>/update/json</em> address and thus be indexed. Remember to send a commit command to the appropriate address (the standard <em>/update</em>) in order to tell Solr to open a new index searcher.</p>
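The upload itself is just an HTTP POST. A sketch with curl, assuming a Solr instance at localhost:8983; the curl calls are commented out so the snippet can be tried without a running server, and the local check only verifies that the payload parses as JSON:

```shell
# Write the two example documents to data.json - Solr's update format
# repeats the "add" key once per document.
cat > data.json <<'EOF'
{
"add": {"doc": {"id": "123456788", "region": ["abc","def"], "name": "ABCDEF"}},
"add": {"doc": {"id": "123456789", "region": ["abc","def"], "name": "XYZMN"}}
}
EOF

# Sanity-check the payload locally before sending it anywhere.
python3 -c 'import json; json.load(open("data.json"))' && echo "data.json parses"

# POST the file to the JSON update handler, then commit through the
# standard /update handler so a new searcher is opened.
# (Uncomment with a Solr instance running on localhost:8983.)
# curl 'http://localhost:8983/solr/update/json' \
#      --data-binary @data.json -H 'Content-type: application/json'
# curl 'http://localhost:8983/solr/update?commit=true'
```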
<h3>Performance</h3>
<p>For the end I saved what I&#8217;m really most interested in &#8211; the performance of the new handler. According to the information stored in the JIRA system, we can expect <em>JsonUpdateRequestHandler</em> to be faster than its counterpart that processes the XML format. To examine this, I prepared files of 10,000, 100,000 and 1 million documents. Every document contained an identifier (string field), two regions (string field, multivalued) and a name (text field). One file was saved in the JSON format, the second one in XML, and the third one in CSV. All files were then indexed separately. Here is the outcome of this simple test:</p>
<p><em>[Table 10 with the test results is not available in this version of the post.]</em></p>
<p>The conclusions suggest themselves. First, XML data is noticeably larger than the same data written in the JSON format (the difference is about 35%). However, a file stored in the JSON format is, as might be expected, larger than one written in CSV. If you send data over anything slower than the local network, size is relevant &#8211; the difference in file size is significant enough that it is worth thinking about changing XML to one of the formats that require less space.</p>
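The size difference is easy to check for yourself. A toy illustration with one document written in each of the three formats (the field layout mirrors the test documents described above; real ratios will of course vary with your data):

```shell
# The same single document in the three formats Solr accepts.
cat > doc.xml <<'EOF'
<add><doc>
  <field name="id">123456788</field>
  <field name="region">abc</field>
  <field name="region">def</field>
  <field name="name">ABCDEF</field>
</doc></add>
EOF

cat > doc.json <<'EOF'
{"add":{"doc":{"id":"123456788","region":["abc","def"],"name":"ABCDEF"}}}
EOF

cat > doc.csv <<'EOF'
id,region,name
123456788,"abc,def",ABCDEF
EOF

# Compare byte counts - XML is the largest, CSV the smallest.
wc -c doc.xml doc.json doc.csv
```

The markup overhead of repeating field names in XML is what drives the gap; CSV pays the field names only once, in the header row.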
<h3>Indexation time</h3>
<p>Another thing is the indexing time. Based on the results of this simple test, we can conclude that <em>JsonUpdateRequestHandler</em> is slightly (about 7&#8211;9%) faster than <em>XmlUpdateRequestHandler</em>. The difference between <em>JsonUpdateRequestHandler</em> and <em>CSVRequestHandler</em> is similar &#8211; the handler that operates on files in the CSV format is faster than its counterpart operating on JSON by about 7 to 9%. Let&#8217;s hope that when the <a href="http://labs.apache.org/labs.html" target="_blank" rel="noopener noreferrer">noggit</a> library comes out of Apache Labs, its performance will be even better, and thus we will see an even faster <em>JsonUpdateRequestHandler</em>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2011/04/18/solr-3-1-json-update-handler/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
