<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>commit &#8211; Solr.pl</title>
	<atom:link href="https://solr.pl/en/tag/commit-2/feed/" rel="self" type="application/rss+xml" />
	<link>https://solr.pl/en/</link>
	<description>All things to be found - Blog related to Apache Solr &#38; Lucene projects - https://solr.apache.org</description>
	<lastBuildDate>Wed, 11 Nov 2020 19:49:41 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9</generator>
	<item>
		<title>When to commit?</title>
		<link>https://solr.pl/en/2011/06/27/when-to-commit/</link>
					<comments>https://solr.pl/en/2011/06/27/when-to-commit/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Mon, 27 Jun 2011 18:48:59 +0000</pubDate>
				<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[commit]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">http://sematext.solr.pl/?p=276</guid>

					<description><![CDATA[The question I asked myself recently seems to be one of those that should have a quick and painless answer. So, when should you send the commit command to Solr (or Lucene)? Despite the simplicity of the question, the]]></description>
										<content:encoded><![CDATA[<p>The question I asked myself recently seems to be one of those that should have a quick and painless answer. So, when should you send the commit command to Solr (or Lucene)? Despite the simplicity of the question, the answer is not clear, at least in my opinion.</p>
<p><span id="more-276"></span></p>
<p>To answer the question of when to send the commit command, you must look at how the data is indexed and how quickly you want it to be available on the slave servers. Looking at the typical implementations I have had the pleasure of working with, we can distinguish the following categories:</p>
<h3>Data can be made available only after a total index update</h3>
<p>This is the simplest situation, both in theory and in practice. We send the <em>commit</em> command only once there are no more documents to index.</p>
<h3>The data may be available in batches, without waiting for a full update of the index</h3>
<p>Here we have three possibilities:</p>
<ol>
<li>If it does not matter whether the data is made available in batches or not, we can send the <em>commit</em> command after sending the last document.</li>
<li>If we want to publish data in batches, the indexing application can send a <em>commit</em> command from time to time.</li>
<li>If we do not want to send <em>commit</em> commands from the indexing application, we can tell Solr to do it for us by setting up the autocommit mechanism.</li>
</ol>
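<p>The second option can be sketched as follows. This is a minimal illustration, not Solr client code: <code>plan_commits</code> and the batch size are hypothetical names and values, and the function only plans the order of "add" and "commit" actions that an indexing application would then send to Solr.</p>

```python
from typing import Iterable, Iterator, List, Tuple


def plan_commits(docs: Iterable[dict], batch_size: int) -> Iterator[Tuple[str, List[dict]]]:
    """Yield ("add", batch) for every batch, each followed by ("commit", []).

    Hypothetical helper: it sends nothing to Solr, it only decides
    where the commit commands would go.
    """
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            yield ("add", batch)
            yield ("commit", [])
            batch = []
    if batch:
        # Flush the final, partially filled batch and commit it as well.
        yield ("add", batch)
        yield ("commit", [])


docs = [{"id": str(i)} for i in range(5)]
actions = [kind for kind, _ in plan_commits(docs, batch_size=2)]
print(actions)
```

<p>A larger batch size means fewer commits and faster indexing, but the data becomes visible less often.</p>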
<h3>Data must be indexed as fast as possible</h3>
<p>If your data should be indexed as fast as possible, the <em>commit</em> command should be sent only after all the data has been sent. A commit is quite expensive in terms of performance and therefore, in this case, should be used only at the end of the indexing process.</p>
<h3>It is important that the data should be published as soon as possible</h3>
<p>This is probably the most difficult of the described cases. It all depends on how quickly we want the data to be available on the slave servers. For example, in the case of a CMS, when the user saves an edited page, we want its updated content to be available right away. In that case we need to commit after every document and replicate quickly. When you add items to an online store, on the other hand, you can usually tolerate some delay in commit and replication. Many more examples like these could be given. But remember to set up your warming queries properly to prepare Solr for the usual query load.<br />
Those interested in very frequent index updates should watch what is happening in Lucene and Solr around NRT (near real time) search.</p>
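<p>One way to express "make this visible soon" without sending an explicit commit after every document is Solr's <code>commitWithin</code> update parameter, which asks Solr to commit the added documents within the given number of milliseconds. The sketch below only builds the request URL. It assumes a Solr version that accepts <code>commitWithin</code> as a request parameter, and the base URL and the <code>update_url</code> helper are made up for the example.</p>

```python
from urllib.parse import urlencode


def update_url(base_url: str, commit_within_ms: int) -> str:
    """Build an update URL that carries a commitWithin deadline.

    Hypothetical helper: base_url is an assumed location of the
    Solr update handler, not something taken from a real setup.
    """
    query = urlencode({"commitWithin": commit_within_ms})
    return f"{base_url}?{query}"


# CMS-like case: the edited page should be searchable within a second.
cms_url = update_url("http://localhost:8983/solr/update", 1000)
# Online-store-like case: a one-minute delay is acceptable.
store_url = update_url("http://localhost:8983/solr/update", 60000)
print(cms_url)
print(store_url)
```

<p>The two calls differ only in the deadline, which is exactly the trade-off described above: a shorter deadline publishes data sooner but makes Solr commit more often.</p>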
<h3>Optimization</h3>
<p>It is also worth remembering to <em>optimize</em> the index. If we send the <em>commit</em> command only once, at the end of indexing, it is worth considering whether to send <em>optimize</em> instead. The slaves will then get an optimized version of the index along with the newest data. Note, however, that optimizing the index takes longer than a <em>commit</em>.</p>
<h3>Dangers</h3>
<p>It is also worth remembering that postponing the <em>commit</em> indefinitely risks losing data that has not yet been physically written to the index files. Nothing happens to the data if Solr is shut down properly, but if the machine fails we can lose the documents indexed since the last <em>commit</em>.</p>
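<p>A rough way to reason about this risk is to estimate how many documents sit between two commits. The sketch below is a back-of-the-envelope calculation that assumes a steady indexing rate. The function name and the figures are invented for illustration.</p>

```python
def docs_at_risk(docs_per_second: float, seconds_between_commits: float) -> int:
    """Worst-case number of documents indexed since the last commit.

    Assumes a constant indexing rate; real workloads are rarely that even.
    """
    return int(docs_per_second * seconds_between_commits)


# Illustrative figures only: 200 documents per second, a commit every 10 minutes.
at_risk = docs_at_risk(200, 600)
print(at_risk)
```

<p>If that number is larger than what you are willing to re-index after a failure, commit more often.</p>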
<h3>To sum up</h3>
<p>As you can see, there is no clear answer to when to send the <em>commit</em> command, because it depends on the situation and individual needs. Note, however, that the actions Lucene/Solr performs after receiving the <em>commit</em> command are costly in terms of system resources. Do not send this command too frequently, or Lucene/Solr may spend most of its time processing commits instead of indexing data.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2011/06/27/when-to-commit/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
