<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>word &#8211; Solr.pl</title>
	<atom:link href="https://solr.pl/en/tag/word-2/feed/" rel="self" type="application/rss+xml" />
	<link>https://solr.pl/en/</link>
	<description>All things to be found - Blog related to Apache Solr &#38; Lucene projects - https://solr.apache.org</description>
	<lastBuildDate>Wed, 11 Nov 2020 19:45:14 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9</generator>
	<item>
		<title>Solr filters: KeepWordFilter</title>
		<link>https://solr.pl/en/2011/05/02/solr-filters-keepwordfilter/</link>
					<comments>https://solr.pl/en/2011/05/02/solr-filters-keepwordfilter/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Mon, 02 May 2011 18:44:43 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[filter]]></category>
		<category><![CDATA[keep]]></category>
		<category><![CDATA[keepwordfilter]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[word]]></category>
		<guid isPermaLink="false">http://sematext.solr.pl/?p=264</guid>

					<description><![CDATA[This time I decided to look at one of the unusual filters available in the standard distribution of Solr. The first one in my hands is a filter called KeepWordFilter. Let&#8217;s start First, a few words about what this filter]]></description>
										<content:encoded><![CDATA[<p>This time I decided to look at one of the more unusual filters available in the standard distribution of Solr. The first one I picked up is a filter called <em>KeepWordFilter</em>.</p>
<p><span id="more-264"></span></p>
<h3>Let&#8217;s start</h3>
<p>First, a few words about what this filter does. As the name suggests, its main purpose is to keep words: it does the opposite of the <em>StopFilter</em>. Instead of removing the tokens found in a word list, it passes only those tokens through and discards everything else. Before looking at how it works, let&#8217;s start with the definition of the field type in the <em>schema.xml</em> file:
</p>
<pre class="brush:xml">&lt;fieldtype name="keepwords" class="solr.TextField"&gt;
   &lt;analyzer&gt;
      &lt;tokenizer class="solr.WhitespaceTokenizerFactory"/&gt;
      &lt;filter class="solr.KeepWordFilterFactory" words="words.txt" ignoreCase="true"/&gt;
   &lt;/analyzer&gt;
&lt;/fieldtype&gt;</pre>
<p>As the definition above shows, in addition to the standard <em>class</em> attribute the filter takes two more attributes:</p>
<ul>
<li><em>words</em> &#8211; the list of words to keep</li>
<li><em>ignoreCase</em> &#8211; <em>true</em> or <em>false</em>; when <em>true</em>, tokens are matched against the word list without regard to case.</li>
</ul>
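<p>The type alone does nothing until a field refers to it by name. A minimal sketch, assuming a hypothetical field called <em>keep_test</em> (the field name and the <em>indexed</em>/<em>stored</em> settings are illustrative, not taken from the original configuration):</p>
<pre class="brush:xml">&lt;!-- hypothetical field; type must match the name of the fieldtype defined earlier --&gt;
&lt;field name="keep_test" type="keepwords" indexed="true" stored="true"/&gt;</pre>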
<h3>File contents</h3>
<p>Let&#8217;s assume that the <em>words.txt</em> file contains the following words:
</p>
<pre>ala
ma
kota</pre>
<p>If you index the phrase &#8220;Ala ma kota, a kot ma Alę&#8221;, the following tokens will be written into the index: &#8220;ala&#8221;, &#8220;ma&#8221;, &#8220;kota&#8221;, &#8220;ma&#8221;, because only those terms are defined in the <em>words.txt</em> file. This is clearly visible in the Solr administration panel:</p>
<p><a href="http://solr.pl/wp-content/uploads/2011/04/keepwords.png"><img fetchpriority="high" decoding="async" class="alignnone size-full wp-image-1198" title="keepwords" src="http://solr.pl/wp-content/uploads/2011/04/keepwords.png" alt="" width="626" height="493"></a></p>
<h3>A few words at the end</h3>
<p>Although I have never used this filter myself, it seems like a good choice when you need to store the values of enumerated types, or more generally when you are interested in a finite, ideally small and known in advance, list of values. Categories are one example, in cases where filtering the information at the application level is impossible or very difficult.</p>
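<p>For the enumerated values case, one possible setup keeps the whole field value as a single token and lets the filter decide whether it survives. This is only a sketch: the type name <em>category_enum</em> and the file <em>categories.txt</em> are hypothetical, and it assumes that values missing from the file should simply produce no indexed token:</p>
<pre class="brush:xml">&lt;fieldtype name="category_enum" class="solr.TextField"&gt;
   &lt;analyzer&gt;
      &lt;!-- KeywordTokenizer emits the entire field value as one token --&gt;
      &lt;tokenizer class="solr.KeywordTokenizerFactory"/&gt;
      &lt;!-- only values listed in categories.txt (one per line) are kept --&gt;
      &lt;filter class="solr.KeepWordFilterFactory" words="categories.txt" ignoreCase="true"/&gt;
   &lt;/analyzer&gt;
&lt;/fieldtype&gt;</pre>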
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2011/05/02/solr-filters-keepwordfilter/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
