<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>solr &#8211; Solr.pl</title>
	<atom:link href="https://solr.pl/en/tag/solr-2/feed/" rel="self" type="application/rss+xml" />
	<link>https://solr.pl/en/</link>
	<description>All things to be found - Blog related to Apache Solr &#38; Lucene projects - https://solr.apache.org</description>
	<lastBuildDate>Sun, 09 Nov 2025 07:55:54 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9</generator>
	<item>
		<title>Apache Solr 9.10.0</title>
		<link>https://solr.pl/en/2025/11/09/apache-solr-9-10-0/</link>
					<comments>https://solr.pl/en/2025/11/09/apache-solr-9-10-0/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Sun, 09 Nov 2025 07:55:50 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1454</guid>

					<description><![CDATA[It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered 9.10.0. Some of the changes introduced in Solr]]></description>
										<content:encoded><![CDATA[
<p>It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered <strong>9.10.0</strong>.</p>



<span id="more-1454"></span>



<p>Some of the changes introduced in Solr <strong>9.10.0</strong>:</p>



<ul class="wp-block-list">
<li>Support for Java 24 and newer with Security Manager turned off.</li>



<li>Support for an external Apache Tika server &#8211; documents no longer need to be processed inside the Solr process.</li>



<li>Apache Lucene upgrade &#8211; Solr now uses 9.12.3.</li>



<li>The <em>shards.preference=replica.location</em> parameter now supports the <em>host</em> option.</li>



<li>A few features have been marked as deprecated and are scheduled for removal in Solr 10:
<ul class="wp-block-list">
<li>XLSXResponseWriter,</li>



<li>Apache Tika based language detection,</li>



<li>In-process use of Apache Tika in Apache Solr.</li>
</ul>
</li>
</ul>



<p>We encourage you to read the whole list of changes at: <a href="https://solr.apache.org/docs/9_10_0/changes/Changes.html">https://solr.apache.org/docs/9_10_0/changes/Changes.html</a>.</p>



<p>Apache Solr<strong> 9.10.0</strong> can be downloaded from <a href="https://dlcdn.apache.org/solr/">https://dlcdn.apache.org/solr/</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2025/11/09/apache-solr-9-10-0/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Apache Solr 9.9.0</title>
		<link>https://solr.pl/en/2025/07/28/apache-solr-9-9-0/</link>
					<comments>https://solr.pl/en/2025/07/28/apache-solr-9-9-0/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Mon, 28 Jul 2025 21:04:28 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1432</guid>

					<description><![CDATA[It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered 9.9.0. Some of the changes introduced in Solr]]></description>
										<content:encoded><![CDATA[
<p>It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered <strong>9.9.0</strong>.</p>



<span id="more-1432"></span>



<p>Some of the changes introduced in Solr <strong>9.9.0</strong>:</p>



<ul class="wp-block-list">
<li>Starting with 9.9.0, Solr supports encoding text to vectors during indexing via external LLM services.</li>



<li>Function queries can use a numeric subset of JavaScript and, in addition, access fields and the score value, all thanks to the Expressions module of Apache Lucene.</li>



<li>Jetty&#8217;s Graceful Shutdown is now supported when <strong><em>SOLR_JETTY_GRACEFUL</em></strong> is set to <strong><em>true</em></strong>, preventing Solr from disrupting queries during shutdown.</li>



<li>Various optimizations and bug fixes in SolrJ clients.</li>



<li>PKI Authentication supports caching and is more performant when working in high-throughput clusters.</li>
</ul>



<p>We encourage you to read the whole list of changes at: <a href="https://solr.apache.org/docs/9_9_0/changes/Changes.html">https://solr.apache.org/docs/9_9_0/changes/Changes.html</a>.</p>



<p>Apache Solr<strong> 9.9.0</strong> can be downloaded from <a href="https://dlcdn.apache.org/solr/">https://dlcdn.apache.org/solr/</a>.</p>



<p></p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2025/07/28/apache-solr-9-9-0/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Apache Solr 9.8.1</title>
		<link>https://solr.pl/en/2025/04/10/apache-solr-9-8-1/</link>
					<comments>https://solr.pl/en/2025/04/10/apache-solr-9-8-1/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Thu, 10 Apr 2025 18:36:22 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1405</guid>

					<description><![CDATA[It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered 9.8.1. Version 9.8.1 is a bug fix release.]]></description>
										<content:encoded><![CDATA[
<p>It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered <strong>9.8.1</strong>.</p>



<span id="more-1405"></span>



<p>Version 9.8.1 is a bug fix release. The full list of changes can be found <a href="https://solr.apache.org/docs/9_8_1/changes/Changes.html">here</a>.</p>



<p>You can download Solr 9.8.1 <a href="https://lucene.apache.org/solr/downloads.html">here</a>.</p>



<p></p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2025/04/10/apache-solr-9-8-1/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Apache Solr 9.8.0</title>
		<link>https://solr.pl/en/2025/01/23/apache-solr-9-8-0/</link>
					<comments>https://solr.pl/en/2025/01/23/apache-solr-9-8-0/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Thu, 23 Jan 2025 19:33:38 +0000</pubDate>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1398</guid>

					<description><![CDATA[It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered 9.8. Some of the changes introduced in Solr]]></description>
										<content:encoded><![CDATA[
<p>It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered <strong>9.8</strong>.</p>



<span id="more-1398"></span>



<p>Some of the changes introduced in Solr <strong>9.8</strong>:</p>



<ul class="wp-block-list">
<li>Solr cross data center feature graduated into a main Solr feature!</li>



<li>A given request can now be limited in the amount of memory it may use via the <em>memAllowed</em> parameter.</li>



<li>The <em>lib</em> tags in <em>solrconfig.xml</em> are now silently ignored unless the <em>SOLR_CONFIG_LIB_ENABLED</em> environment variable is set to <em>true</em>.</li>



<li>A new parser called <em>knn_text_to_vector</em> was added, which allows text embeddings to be calculated using external LLMs.</li>
</ul>



<p>We encourage you to read the whole list of changes at: <a href="https://solr.apache.org/docs/9_8_0/changes/Changes.html">https://solr.apache.org/docs/9_8_0/changes/Changes.html</a>.</p>



<p>Apache Solr<strong> 9.8</strong> can be downloaded from <a href="https://dlcdn.apache.org/solr/">https://dlcdn.apache.org/solr/</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2025/01/23/apache-solr-9-8-0/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Apache Solr &#038; Embeddings &#8211; How to Start?</title>
		<link>https://solr.pl/en/2024/11/18/apache-solr-embeddings-how-to-start/</link>
					<comments>https://solr.pl/en/2024/11/18/apache-solr-embeddings-how-to-start/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Mon, 18 Nov 2024 07:44:02 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[embeddings]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1369</guid>

					<description><![CDATA[Semantic search and everything related to machine learning has become a very popular topic. To be honest, it’s not only semantic search itself but, due to the massive popularity of so-called Large Language Models and more, many organizations using Solr]]></description>
										<content:encoded><![CDATA[
<p>Semantic search and everything related to machine learning have become very popular topics. To be honest, it’s not only semantic search itself: due to the massive popularity of so-called Large Language Models, many organizations using Solr are trying to implement query logic based on machine learning models, RAG techniques, or rescoring. Since it’s been relatively quiet on this front for a while, it’s time to delve into the topic. In this post, we’ll focus on search and explore what Solr has to offer for data retrieval based on vectors.</p>



<span id="more-1369"></span>



<h2 class="wp-block-heading">Short Problem Description</h2>



<p>Let’s assume that our documents consist of a series of fields used for full-text search, along with a field that we want to use for semantic search. This field contains tags that describe the document.</p>



<p>It’s important to highlight that this is not an ideal data example—for instance, we won’t be using context, the tags themselves are single words, and we also want to build a single vector for the document, which doesn’t make the best use of tags. Nonetheless, we’ll proceed with this approach and index the prepared data in Solr.</p>



<h2 class="wp-block-heading">Solr Preparation</h2>



<p>Before configuring Solr, we need to decide where the vectors will come from. We won’t create our own model but will instead use one available on Hugging Face, specifically <a href="https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2">https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2</a>. This is one of many models that work with sentences and paragraphs. It’s important to remember that calculating the vector involves not only analyzing the words or the sentence itself but also the context in which they occur. The resulting embedding encodes all the relevant information, allowing algorithms to find similar documents. We’ll take a shortcut and use this model to process the tags field, but as I mentioned earlier, the model isn’t the main focus today; we’re interested in Solr.</p>



<p>To do this, we’ll use Solr 9.7.0 and a field based on the <strong>solr.DenseVectorField</strong> class. We’ll start by creating a new field type that looks like this:</p>



<pre class="wp-block-code"><code class="">{
  "name":"knn_vector_384",
  "class":"solr.DenseVectorField",
  "vectorDimension":384,
  "similarityFunction":"cosine",
  "knnAlgorithm":"hnsw"
}</code></pre>



<p>Here, we have the name <strong>knn_vector_384</strong>, the class <strong>solr.DenseVectorField</strong>, a vector dimension of 384, the cosine similarity method for vector comparison, and the proximity calculation algorithm &#8211; hnsw &#8211; which is currently the only one available. The full documentation for the <strong>solr.DenseVectorField</strong> class can be found in the <a href="https://solr.apache.org/guide/solr/latest/query-guide/dense-vector-search.html">official documentation</a>. It’s worth mentioning that we’ll be using a model that generates 384-dimensional vectors, which is why we define our type in this way.</p>
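<p>To make the <strong>cosine</strong> setting more concrete, here is a minimal sketch in plain Python of how cosine similarity between two vectors is computed. The function name and the tiny three-dimensional vectors are made up for illustration; Solr and Lucene perform this calculation internally on the 384-dimensional vectors and may scale the result before reporting it as a score.</p>

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, not real embeddings.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(same_direction, orthogonal)
```

<p>Vectors pointing in the same direction give a value close to 1 regardless of their length, while unrelated (orthogonal) vectors give 0.</p>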



<p>Our documents will consist of five fields:</p>



<ul class="wp-block-list">
<li>identifier (<em>id</em> field),</li>



<li>name (<em>name</em> field),</li>



<li>embeddings (<em>vector</em> field),</li>



<li>product category (<em>category</em> field),</li>



<li>product tags (<em>tags</em> field).</li>
</ul>



<p>To create the test collection and prepare it for indexing the documents we can use the following commands:</p>



<pre class="wp-block-code"><code class="">$ bin/solr create -c test -s 1 -rf 1

$ curl -XPOST -H 'Content-type:application/json' 'http://localhost:8983/solr/test/schema' --data-binary '{
  "add-field-type" : {
    "name":"knn_vector_384",
    "class":"solr.DenseVectorField",
    "vectorDimension":384,
    "similarityFunction":"cosine",
    "knnAlgorithm":"hnsw"
  },
  "add-field" : [
      {
        "name":"vector",
        "type":"knn_vector_384",
        "indexed":true,
        "stored":false
      },
      {
        "name":"name",
        "type":"text_general",
        "multiValued":false,
        "indexed":true,
        "stored":true
      },
      {
        "name":"category",
        "type":"string",
        "multiValued":false,
        "indexed":true,
        "stored":true
      },
      {
        "name":"tags",
        "type":"string",
        "multiValued":true,
        "indexed":true,
        "stored":true
      }
    ]
}'</code></pre>
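<p>If you prefer to prepare the schema change from Python, for example because the vector dimension depends on the chosen model, the request could be built roughly as follows. This is only a sketch using the standard library: the helper names are hypothetical, the URL is the one from the curl command above, and for brevity only the <strong>vector</strong> field is included. The request is built but not sent.</p>

```python
import json
import urllib.request

def build_schema_payload(dimension):
    """Build the Schema API payload for a dense vector type and its field."""
    type_name = f"knn_vector_{dimension}"
    return {
        "add-field-type": {
            "name": type_name,
            "class": "solr.DenseVectorField",
            "vectorDimension": dimension,
            "similarityFunction": "cosine",
            "knnAlgorithm": "hnsw",
        },
        "add-field": [
            {"name": "vector", "type": type_name, "indexed": True, "stored": False},
        ],
    }

def build_schema_request(payload, url="http://localhost:8983/solr/test/schema"):
    """Wrap the payload in a POST request; nothing is sent until urlopen is called."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

payload = build_schema_payload(384)
request = build_schema_request(payload)
print(request.get_method(), request.full_url)
# To actually apply the change against a running Solr:
# urllib.request.urlopen(request)
```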



<h2 class="wp-block-heading">Data Preparation</h2>



<p>Preparing the data is a bit more complex. As mentioned above, instead of creating our own model, we’ll use one of the available models from Hugging Face, specifically <a href="https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2">sentence-transformers/all-MiniLM-L6-v2</a>. We’ll take a shortcut and use this model to process the field with tags; the model itself isn’t our main focus today, Solr is.</p>



<p>To index the documents we will use the following Python code that we store in a file called <strong>index.py</strong>:</p>



<pre class="wp-block-code"><code class="">import pysolr
import uuid
from typing import List, Dict
import torch
from transformers import AutoTokenizer, AutoModel
import numpy as np

class DocumentEmbedder:
    def __init__(self):
        self.model_name = "sentence-transformers/all-MiniLM-L6-v2"
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
        self.model = AutoModel.from_pretrained(self.model_name)
        
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model.to(self.device)

    def mean_pooling(self, model_output, attention_mask):
        token_embeddings = model_output[0]
        input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
        return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

    def get_embedding(self, text: str) -&gt; np.ndarray:
        encoded_input = self.tokenizer(
            text,
            padding=True,
            truncation=True,
            max_length=512,
            return_tensors='pt'
        )
        
        encoded_input = {k: v.to(self.device) for k, v in encoded_input.items()}

        with torch.no_grad():
            model_output = self.model(**encoded_input)

        sentence_embeddings = self.mean_pooling(model_output, encoded_input['attention_mask'])
        sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
        
        return sentence_embeddings.cpu().numpy()

class SolrIndexer:
    def __init__(self, solr_url: str = 'http://localhost:8983/solr/test'):
        self.solr = pysolr.Solr(solr_url, always_commit=True)
        self.embedder = DocumentEmbedder()

    def create_document(self, name: str, tags: str, category: str) -&gt; Dict:
        """Create a document with embeddings."""
        embedding = self.embedder.get_embedding(tags)
        
        doc = {
            'id': str(uuid.uuid4()),
            'name': name,
            'tags': tags.split(', '),  
            'category': category,
            'vector': embedding.flatten().tolist()
        }
        return doc

    def index_documents(self, documents: List[Dict]):
        """Index multiple documents to Solr."""
        try:
            self.solr.add(documents)
            print(f"Successfully indexed {len(documents)} documents")
        except Exception as e:
            print(f"Error indexing documents: {str(e)}")

def main():
    documents = [
        {"name": "Apple iPhone 13", "tags": "phone, smartphone, screen, iOS", "category": "phone"},
        {"name": "Apple iPhone 14", "tags": "phone, smartphone, screen, iOS", "category": "phone"},
        {"name": "Apple iPhone 15", "tags": "phone, smartphone, screen, iOS", "category": "phone"},
        {"name": "Samsung Galaxy S24", "tags": "phone, smartphone, screen, Android", "category": "phone"},
        {"name": "Apple iPod", "tags": "music, screen, iOS", "category": "music player"},
        {"name": "Samsung Microwave", "tags": "kitchen, cooking, electric", "category": "household"}
    ]
    
    indexer = SolrIndexer()
    
    solr_documents = []
    for doc in documents:
        solrdoc = indexer.create_document(doc['name'], doc['tags'], doc['category'])
        solr_documents.append(solrdoc)
        print(f"Created document: {solrdoc['name']}")
        print(f"Vector length: {len(solrdoc['vector'])}")
        
    indexer.index_documents(solr_documents)

    embedder = DocumentEmbedder()
    embeddings = embedder.get_embedding("song player")
    print(f"Embeddings: {str(embeddings)}")

if __name__ == "__main__":
    main()


</code></pre>



<p>A few comments about the code:</p>



<ul class="wp-block-list">
<li>The <strong>DocumentEmbedder</strong> class is responsible for creating the vector; if we want to switch models, we only need to change the value of <strong>self.model_name</strong>. However, it’s important to remember that we configured Solr to work with 384-dimensional vectors, so if the new model produces vectors with different characteristics, a different Solr configuration will be required.</li>



<li>The <strong>SolrIndexer</strong> class handles the indexing of documents.</li>



<li>The documents to be indexed are defined in the <strong>documents</strong> variable.</li>
</ul>
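<p>Because the vector size must match the Solr field type, it can be worth checking the embedding length before sending documents. A minimal sketch of such a check is shown below; the function and constant names are hypothetical and not part of <strong>index.py</strong>, and the lists of zeros merely stand in for real model output.</p>

```python
EXPECTED_DIMENSION = 384  # must match vectorDimension in the Solr field type

def check_vector(vector, expected=EXPECTED_DIMENSION):
    """Raise ValueError if the vector does not fit the Solr field type."""
    if len(vector) != expected:
        raise ValueError(
            f"expected {expected} dimensions, got {len(vector)}; "
            "the Solr field type would need a different vectorDimension"
        )
    return vector

# Hypothetical stand-ins for model output.
ok = check_vector([0.0] * 384)
try:
    check_vector([0.0] * 768)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(len(ok), mismatch_detected)
```

<p>A check like this could be called in <strong>create_document</strong> right after the embedding is flattened to a list, so that a model change is caught before indexing.</p>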



<p>To run the code, a few dependencies must be installed. We tested the code on macOS running on an ARM-based processor. The <strong>requirements.txt</strong> file looks as follows:</p>



<pre class="wp-block-code"><code class="">transformers==4.37.2
torch&gt;=2.2.0
numpy&gt;=1.24.3
pandas&gt;=2.1.4
sentencepiece==0.1.99 --extra-index-url https://download.pytorch.org/whl/cpu
pysolr==3.9.0
uuid==1.30
</code></pre>



<p>You can install the dependencies and run the code using the following commands:</p>



<pre class="wp-block-code"><code class="">$ pip install -r requirements.txt

$ python index.py</code></pre>



<h2 class="wp-block-heading">Querying</h2>



<p>When the documents are successfully indexed we can start running the queries to retrieve the documents. For the semantic search we can use one of the two available parsers:</p>



<ul class="wp-block-list">
<li><em>knn</em></li>



<li><em>vectorSimilarity</em></li>
</ul>



<p>When using the <strong>knn</strong> parser, we get the top K documents, whereas with the <strong>vectorSimilarity</strong> parser, we retrieve documents that exceed a specified vector similarity threshold. An example query looks like this:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/query?q={!knn f=vector topK=10}[...]</code></pre>



<p>As shown, the above query uses the <strong>knn</strong> parser, operates on data from the field named vector, retrieves 10 documents, and in the <strong>[&#8230;]</strong> section, our 384-dimensional vector should be placed. How can we generate it? We can modify the previous code and calculate the vector for our query in the <strong>main</strong> function:</p>



<pre class="wp-block-code"><code class="">.
.
.

def main():
    embedder = DocumentEmbedder()
    embeddings = embedder.get_embedding("song player").flatten().tolist()
    print(f"Embeddings: {str(embeddings)}")

if __name__ == "__main__":
    main()</code></pre>
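<p>Pasting 384 numbers into a URL by hand is tedious, so the query string can also be assembled in code. The sketch below uses hypothetical helper names and a three-value vector purely for illustration; it formats the <strong>knn</strong> query and percent-encodes it with the standard library.</p>

```python
from urllib.parse import urlencode

def build_knn_query(vector, field="vector", top_k=10):
    """Return the q parameter value for Solr's knn query parser."""
    vector_text = ",".join(repr(float(v)) for v in vector)
    return f"{{!knn f={field} topK={top_k}}}[{vector_text}]"

def build_query_url(vector, base_url="http://localhost:8983/solr/test/query"):
    """Return a full URL with the q parameter percent-encoded."""
    return base_url + "?" + urlencode({"q": build_knn_query(vector)})

# Hypothetical three-value vector; a real one would have 384 values.
q = build_knn_query([0.1, -0.2, 0.3])
url = build_query_url([0.1, -0.2, 0.3])
print(q)
print(url)
```

<p>With the real code, the list returned by <strong>get_embedding(...).flatten().tolist()</strong> would be passed in place of the toy vector.</p>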



<p>The generated vector can now be used in Solr. Let’s try querying the <strong>tags</strong> field with <strong>song player</strong>, like this:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/select?q=tags:"song player"</code></pre>



<p>The results, as expected, will be empty:</p>



<pre class="wp-block-code"><code class="">{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":1,
    "params":{
      "q":"tags:\"song player\""
    }
  },
  "response":{
    "numFound":0,
    "start":0,
    "numFoundExact":true,
    "docs":[ ]
  }
}</code></pre>



<p>This is the situation we might have expected &#8211; we don’t have a document with that exact tag. Our <strong>Apple iPod</strong> document has one of its <strong>tags</strong> field values set to <strong>music</strong>. Close, but not the same as <strong>song player</strong>, at least from the perspective of full-text search, and we aren’t using any synonyms.</p>



<p>So, let’s see how semantic search handles this. To do this, we need to generate a vector from our phrase. Using the code we’ve seen above, our <strong>song player</strong> query, after vector generation, will look like this:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/query?fl=name,tags,score,*&amp;q={!knn f=vector topK=10}[-2.00312417e-02,-8.43894295e-03,4.63341828e-03,-7.11187869e-02,-5.41122817e-02,8.86119753e-02,1.43207222e-01,6.48824051e-02,-2.42583733e-02,5.58690503e-02,1.17570441e-02,-8.10808595e-03,5.87443374e-02,-8.77943709e-02,-2.66626403e-02,9.57359672e-02,-3.80967893e-02,6.07994124e-02,-2.05643158e-02,-4.67214771e-02,-1.52253941e-01,-9.71766468e-03,-6.08050935e-02,2.88447887e-02,-2.94179078e-02,1.03829138e-01,-8.31280462e-03,1.05928726e-01,-5.67206480e-02,-7.29391426e-02,8.77882075e-03,3.76622789e-02,1.22778863e-01,7.63954269e-03,-1.07958838e-01,-2.39758939e-02,-4.40737829e-02,-1.58879627e-02,-8.71627405e-02,-1.86017044e-02,-1.60255414e-02,3.48703228e-02,-3.19783390e-02,-1.32465595e-03,-6.66615218e-02,-5.59806228e-02,-3.83943357e-02,-3.26939276e-03,3.50921564e-02,8.62423256e-02,-4.67025265e-02,3.27739231e-02,1.36479884e-02,2.14965921e-02,9.94029175e-03,8.50510597e-03,6.12449907e-02,1.26747936e-01,1.20850094e-02,8.25305879e-02,-3.12836142e-03,-7.23205507e-02,-1.25491228e-02,-4.12022136e-02,8.32084790e-02,-4.56760749e-02,1.24245733e-02,2.99928896e-03,-1.72226392e-02,1.40204923e-02,-5.27366474e-02,8.62174854e-02,-1.23909842e-02,-5.11718355e-03,-5.19047305e-03,-3.26551199e-02,3.38410065e-02,-2.57184170e-02,4.93465960e-02,2.16306932e-02,5.44718355e-02,-5.13004661e-02,-6.29040822e-02,-1.14791520e-01,2.42656711e-02,-5.12294583e-02,-1.41068548e-02,-1.69403590e-02,-3.93591821e-02,2.21420880e-02,-7.29445517e-02,3.97052318e-02,4.06186096e-02,2.38778368e-02,-2.60719825e-02,4.29015867e-02,8.55240896e-02,-1.29460618e-01,-6.10361844e-02,1.73135296e-01,5.50046675e-02,2.11313833e-02,8.09541941e-02,8.49610151e-05,1.86103079e-02,-7.11245313e-02,-2.20125029e-03,8.21124241e-02,-3.17892991e-02,-5.61453328e-02,3.08496933e-02,-2.24805139e-02,-6.99905232e-02,-1.38647379e-02,2.40420792e-02,1.32726450e-02,3.10667846e-02,1.00519679e-01,-3.77691817e-03,1.09965838e-02,-2.87625156e-02,-3.69284451e-02,-4.66967747e-02,1.17862746e-02,-6.05449453e-02,1.80788971e-02,-7.05301017e-03,-2.43471990e-33,4.00696788e-03,-8.39618146e-02,4.76996899e-02,4.31838222e-02,6.59582578e-03,-2.13763528e-02,1.40098054e-02,4.72629964e-02,-7.40025416e-02,-1.17481779e-02,4.74894270e-02,-2.08981428e-02,-4.10624146e-02,6.36240691e-02,1.19205341e-01,-6.63071573e-02,-5.37127964e-02,1.66804232e-02,-5.87310642e-03,-1.00764409e-02,2.95753870e-02,7.51189962e-02,3.69122699e-02,4.02592048e-02,3.01825050e-02,-3.49015556e-02,3.05062942e-02,-5.49735427e-02,7.76632428e-02,-1.31180901e-02,1.32043369e-03,-7.29682148e-02,4.55190837e-02,-3.90759930e-02,-2.00935621e-02,2.90088002e-02,-4.90487851e-02,-4.68467623e-02,-3.78382057e-02,-8.62234011e-02,-8.21669679e-03,-1.02636190e-02,-4.22154590e-02,-1.01420805e-02,-9.29822177e-02,-4.51174043e-02,1.08679747e-02,4.72158194e-02,1.13880262e-02,2.26492099e-02,1.08083403e-02,4.73796055e-02,-2.11352482e-02,2.94824075e-02,3.32471356e-02,-6.10656738e-02,6.39934689e-02,7.52783706e-03,2.01779045e-02,8.42025783e-03,7.92021900e-02,7.49795884e-02,3.60657759e-02,-5.19200675e-02,-2.61804424e-02,4.30282280e-02,9.71625224e-02,-5.95051460e-02,9.75757018e-02,2.11117323e-02,-3.37768793e-02,-2.30997894e-02,5.96088655e-02,3.78986225e-02,-8.59187767e-02,2.64272131e-02,-5.98726794e-02,-7.63988122e-02,-8.67564604e-03,-4.02248465e-02,-7.74084628e-02,-7.17599411e-03,-6.49854094e-02,5.07407561e-02,-4.55319695e-02,-2.79121399e-02,-3.54578048e-02,-8.02342817e-02,-1.16430726e-02,-3.30321328e-03,-2.42636316e-02,4.07439955e-02,8.91774427e-03,1.66885648e-02,-5.11291474e-02,2.29379760e-33,-2.25291476e-02,-3.75496261e-02,7.23349378e-02,2.80879121e-02,1.11529857e-01,9.52087902e-03,3.09992954e-02,3.44425067e-02,-1.44962538e-02,3.70973274e-02,-3.42840664e-02,-4.15693037e-03,4.41893972e-02,-5.78483054e-03,2.27168929e-02,2.73053981e-02,-7.61534553e-03,3.05881090e-02,2.22553536e-02,-2.11785771e-02,-8.38738605e-02,-3.54573503e-02,5.55671491e-02,-3.25255133e-02,-6.17663078e-02,-3.31615508e-02,1.28180429e-01,-1.09783243e-02,-4.22303416e-02,8.85861460e-03,1.03141055e-01,-1.19012371e-02,-5.68600930e-02,-9.76827294e-02,-3.66187817e-03,6.69727027e-02,5.01095466e-02,3.45795639e-02,-4.87284325e-02,-2.88722366e-02,3.69899273e-02,4.80774455e-02,9.51483287e-03,1.05140977e-01,9.12032463e-03,7.83540029e-03,3.00276410e-02,1.10446520e-01,2.02773716e-02,6.35717139e-02,-6.48668930e-02,-3.63331586e-02,5.26458211e-02,-8.43572319e-02,-6.71359748e-02,-3.95149644e-03,-3.19298953e-02,-3.83070633e-02,-1.30710788e-02,3.45022231e-02,3.50211672e-02,-1.17928414e-02,-2.96711233e-02,-1.07177338e-02,-3.55232134e-02,1.01115681e-01,3.25268582e-02,3.25036608e-02,-1.87526215e-02,-2.26221383e-02,1.12875113e-02,5.46618700e-02,-9.37254541e-03,5.56712523e-02,-4.75139245e-02,8.30354728e-03,-1.24001637e-01,7.75059313e-02,-2.27639657e-02,-7.71744251e-02,3.50985602e-02,-1.18714431e-02,2.85322741e-02,-1.23035219e-02,3.05858813e-03,-3.02043278e-02,7.86332041e-02,2.63012294e-02,-2.10798401e-02,-2.86972001e-02,4.53165434e-02,5.47545776e-02,-1.12766981e-01,4.01742896e-03,-7.09661469e-03,-1.23211663e-08,-8.58379975e-02,1.30043477e-02,2.08473746e-02,-4.64787520e-02,5.16857654e-02,1.19483555e-02,3.52647863e-02,-1.09199435e-01,3.83973494e-02,3.78849730e-02,3.93284820e-02,-7.54635110e-02,8.52666888e-03,7.53920805e-03,2.69943215e-02,-2.04220396e-02,-5.33770137e-02,9.14978534e-02,-5.69638461e-02,4.34034877e-02,1.58696324e-02,6.27059937e-02,1.16019472e-02,-3.03725917e-02,2.53463928e-02,-1.08486209e-02,-4.15410772e-02,6.96198866e-02,1.25364019e-02,8.21540307e-04,3.80332284e-02,7.23319054e-02,2.98110046e-03,-6.22513592e-02,6.56076372e-02,2.09525451e-02,2.19415948e-02,-1.34995428e-03,-9.06020775e-02,2.12142467e-02,-1.63224037e-03,9.72291678e-02,-3.10167447e-02,-3.09801828e-02,-1.90541670e-02,-3.96139771e-02,2.09502075e-02,-9.64867976e-03,1.07652992e-02,2.92998832e-02,-6.14691451e-02,2.42477246e-02,-4.87532094e-02,2.83491425e-02,8.26478079e-02,4.19540368e-02,-3.42742465e-02,5.80971912e-02,-2.19746046e-02,1.28769483e-02,-1.10125542e-02,3.03942598e-02,2.98435502e-02,5.43411188e-02]</code></pre>



<h2 class="wp-block-heading">The Results</h2>



<p>The query we just ran returns the following results:</p>



<pre class="wp-block-code"><code class="">{
  "responseHeader":{
    .
    .
    .
  },
  "response":{
    "numFound":6,
    "start":0,
    "maxScore":0.7036504,
    "numFoundExact":true,
    "docs":[{
      "id":"0030eb55-95ee-4042-b574-1f1f4d2d0dd4",
      "name":"Apple iPod",
      "tags":["music","screen","iOS"],
      "_version_":1815543355046625280,
      "_root_":"0030eb55-95ee-4042-b574-1f1f4d2d0dd4",
      "score":0.7036504
    },{
      "id":"f5122858-22bf-47fe-84b4-f0fa5e608d22",
      "name":"Samsung Galaxy S24",
      "tags":["phone","smartphone","screen","Android"],
      "_version_":1815543355045576704,
      "_root_":"f5122858-22bf-47fe-84b4-f0fa5e608d22",
      "score":0.5939434
    },{
      "id":"9b6aa3b5-f02a-4969-ae21-daaafc58ece4",
      "name":"Apple iPhone 13",
      "tags":["phone","smartphone","screen","iOS"],
      "_version_":1815543355005730816,
      "_root_":"9b6aa3b5-f02a-4969-ae21-daaafc58ece4",
      "score":0.5706577
    },{
      "id":"bbb744ae-ca90-4663-8a8e-fdbe8e00d935",
      "name":"Apple iPhone 14",
      "tags":["phone","smartphone","screen","iOS"],
      "_version_":1815543355042430976,
      "_root_":"bbb744ae-ca90-4663-8a8e-fdbe8e00d935",
      "score":0.5706577
    },{
      "id":"f750c960-229b-4fd2-a9f7-c891afbb19d3",
      "name":"Apple iPhone 15",
      "tags":["phone","smartphone","screen","iOS"],
      "_version_":1815543355044528128,
      "_root_":"f750c960-229b-4fd2-a9f7-c891afbb19d3",
      "score":0.5706577
    },{
      "id":"756e14fc-6320-4fee-9713-7d0c6123fa8a",
      "name":"Samsung Microwave",
      "tags":["kitchen","cooking","electric"],
      "_version_":1815543355047673856,
      "_root_":"756e14fc-6320-4fee-9713-7d0c6123fa8a",
      "score":0.5591891
    }]
  }
}</code></pre>



<p>As we can see, we retrieved all the documents, with our <strong>Apple iPod</strong> having the highest score, which is now calculated based not on document relevance to the query but on vector similarity. In our case, this is a very limited sample of documents and a tag field that is far from ideal, but we have a functioning, relatively simple example. The <strong>knn</strong> parser returned the top 10 documents, which, in our case, means all of them.</p>



<p>If we only want to retrieve documents with a score above a certain threshold, we can use the <strong>vectorSimilarity</strong> parser. For example, if we want to fetch documents with a similarity score equal to or greater than <strong>0.7</strong>, our query would look like this:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/query?q={!vectorSimilarity f=vector minReturn=0.7}[...]</code></pre>



<p>And the results of the above query would look as follows:</p>



<pre class="wp-block-code"><code class="">{
  "responseHeader":{
    .
    .
    .
  },
  "response":{
    "numFound":1,
    "start":0,
    "maxScore":0.7036504,
    "numFoundExact":true,
    "docs":[{
      "id":"0030eb55-95ee-4042-b574-1f1f4d2d0dd4",
      "name":"Apple iPod",
      "tags":["music","screen","iOS"],
      "_version_":1815543355046625280,
      "_root_":"0030eb55-95ee-4042-b574-1f1f4d2d0dd4",
      "score":0.7036504
    }]
  }
}</code></pre>



<p>Now we get only a single document, the only one with a score of at least 0.7.</p>
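
<p>When such a query is sent from code rather than typed into a browser, the local parameters and the vector need to be URL-encoded. The following is a rough sketch using only the Python standard library. The short vector is a hypothetical stand-in for a real embedding, and the helper name is an assumption made for this example.</p>

```python
from urllib.parse import urlencode


def vector_similarity_url(base_url, field, vector, min_return):
    """Build a query URL for the vectorSimilarity parser.

    Only f and minReturn are set, as in the query shown above.
    """
    vector_literal = "[" + ", ".join(str(v) for v in vector) + "]"
    local_params = "{!vectorSimilarity f=%s minReturn=%s}" % (field, min_return)
    # urlencode escapes the braces, brackets, spaces and commas for us.
    return base_url + "?" + urlencode({"q": local_params + vector_literal})


# Hypothetical, shortened query vector; a real one comes from the embedding model.
url = vector_similarity_url(
    "http://localhost:8983/solr/test/query",
    "vector",
    [0.12, -0.05, 0.33],
    0.7,
)
print(url)
```

<p>Decoding the generated URL gives back the same <strong>q</strong> value that we would otherwise type by hand.</p>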



<h2 class="wp-block-heading">Filtering</h2>



<p>The performance of the solution depends on the number of candidates, that is, the documents that Solr needs to process. To reduce this number, we can use filtering with the <strong>preFilter</strong> parameter. For example, if we only want to retrieve phones (where the <strong>category</strong> field is equal to <strong>phone</strong>), our query would look like this:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/query?q={!knn f=vector topK=10 preFilter=category:phone}[...]</code></pre>



<p>This time the results would look as follows:</p>



<pre class="wp-block-code"><code class="">{
  "responseHeader":{
    .
    .
    .
  },
  "response":{
    "numFound":4,
    "start":0,
    "maxScore":0.5939434,
    "numFoundExact":true,
    "docs":[{
      "id":"a0f6fe9b-e82a-437e-b8ef-c12b6ee8d4a3",
      "name":"Samsung Galaxy S24",
      "tags":["phone","smartphone","screen","Android"],
      "category":"phone",
      "_version_":1815544743947403264,
      "_root_":"a0f6fe9b-e82a-437e-b8ef-c12b6ee8d4a3",
      "score":0.5939434
    },{
      "id":"ff2a471f-6d31-4dc1-9ec0-f5eacbfc1540",
      "name":"Apple iPhone 13",
      "tags":["phone","smartphone","screen","iOS"],
      "category":"phone",
      "_version_":1815544743943208960,
      "_root_":"ff2a471f-6d31-4dc1-9ec0-f5eacbfc1540",
      "score":0.5706577
    },{
      "id":"9d2955b2-ba6d-4f5d-a477-66909725b362",
      "name":"Apple iPhone 14",
      "tags":["phone","smartphone","screen","iOS"],
      "category":"phone",
      "_version_":1815544743945306112,
      "_root_":"9d2955b2-ba6d-4f5d-a477-66909725b362",
      "score":0.5706577
    },{
      "id":"93e5bd31-78b5-45b9-9c61-a0d8e8c0badb",
      "name":"Apple iPhone 15",
      "tags":["phone","smartphone","screen","iOS"],
      "category":"phone",
      "_version_":1815544743946354688,
      "_root_":"93e5bd31-78b5-45b9-9c61-a0d8e8c0badb",
      "score":0.5706577
    }]
  }
}</code></pre>



<p>This time, we only received phones, meaning the documents were properly filtered.</p>



<p>How Solr filters the candidates also depends on the query. If the <strong>preFilter</strong> parameter is absent, Solr will use the standard filter parameters known from full-text search, such as <strong>fq</strong>. We can therefore construct our filtered query using the <strong>fq</strong> parameter, and it would look like this:</p>



<pre class="wp-block-code"><code class="">http://localhost:8983/solr/test/query?fq=category:phone&amp;q={!knn f=vector topK=10}[...]</code></pre>



<p>The results would be the same &#8211; four documents which we already saw.</p>
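
<p>The two ways of passing the filter can be put side by side in a short sketch. It only builds the request parameters and does not contact Solr. The vector is a hypothetical, shortened stand-in for a real embedding, and the helper name is an assumption made for this example.</p>

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/solr/test/query"


def knn_query(vector, top_k=10, pre_filter=None):
    """Return the q parameter for the knn parser, optionally with preFilter."""
    vector_literal = "[" + ", ".join(str(v) for v in vector) + "]"
    local_params = "{!knn f=vector topK=%d" % top_k
    if pre_filter is not None:
        local_params += " preFilter=" + pre_filter
    return local_params + "}" + vector_literal


vector = [0.12, -0.05, 0.33]  # hypothetical, shortened query vector

# Variant 1: the filter travels inside the knn local parameters.
with_pre_filter = {"q": knn_query(vector, pre_filter="category:phone")}

# Variant 2: no preFilter, so the regular fq parameter carries the filter.
with_fq = {"q": knn_query(vector), "fq": "category:phone"}

for params in (with_pre_filter, with_fq):
    print(BASE_URL + "?" + urlencode(params))
```

<p>Both variants carry the same <strong>category:phone</strong> condition. The only difference is whether it sits inside the <strong>knn</strong> local parameters or in a separate request parameter.</p>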



<h2 class="wp-block-heading">Summary</h2>



<p>We’ve seen some of the capabilities Solr offers when it comes to using vectors. We’ve explored what the two parsers allow and how to use them. Of course, this example is very simple. Before starting work on our own projects, we first need to invest time in finding or preparing a model, or in finding a model and fine-tuning it. Which model to use depends heavily on the use case. Many models are available, such as the <a href="https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5">Snowflake/snowflake-arctic-embed-m-v1.5</a> model, or published models like <a href="https://huggingface.co/Marqo/marqo-ecommerce-embeddings-B">Marqo/marqo-ecommerce-embeddings-B</a> or <a href="https://huggingface.co/Marqo/marqo-ecommerce-embeddings-L">Marqo/marqo-ecommerce-embeddings-L</a>. It’s important to spend time selecting the right model, testing it, and evaluating its performance. In semantic search, model selection is crucial because the model creates the vectors that will be used for searching.</p>



]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2024/11/18/apache-solr-embeddings-how-to-start/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Apache Solr 8.11.4</title>
		<link>https://solr.pl/en/2024/09/29/apache-solr-8-11-4/</link>
					<comments>https://solr.pl/en/2024/09/29/apache-solr-8-11-4/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Sun, 29 Sep 2024 07:54:58 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1346</guid>

					<description><![CDATA[It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 8.x branch and it is numbered 8.11.4. The 8.11.4 is a bug fix release.]]></description>
										<content:encoded><![CDATA[
<p>It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 8.x branch and it is numbered <strong>8.11.4</strong>.</p>



<span id="more-1346"></span>



<p>Version 8.11.4 is a bug-fix release. The full list of changes can be found <a href="https://solr.apache.org/docs/8_11_4/changes/Changes.html">here</a>.</p>



<p>You can download Solr 8.11.4 <a href="https://lucene.apache.org/solr/downloads.html">here</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2024/09/29/apache-solr-8-11-4/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Apache Solr 9.7.0</title>
		<link>https://solr.pl/en/2024/09/10/apache-solr-9-7-0/</link>
					<comments>https://solr.pl/en/2024/09/10/apache-solr-9-7-0/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Tue, 10 Sep 2024 06:32:47 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1341</guid>

					<description><![CDATA[It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered 9.7. Some of the changes introduced in Solr]]></description>
										<content:encoded><![CDATA[
<p>It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered <strong>9.7</strong>.</p>



<span id="more-1341"></span>



<p>Some of the changes introduced in Solr <strong>9.7</strong>:</p>



<ul class="wp-block-list">
<li>DocValues are now enabled by default on fields that support them, but only when using the newest schema version.</li>



<li>A new query parser, <strong>vectorSimilarity</strong>, was introduced; it allows setting a minimum vector similarity threshold.</li>



<li>It is now possible to use multiple threads when searching an index built of multiple segments.</li>



<li>Solr was upgraded to Lucene 9.11.1.</li>
</ul>



<p>We encourage you to read the whole list of changes at: <a href="https://solr.apache.org/docs/9_7_0/changes/Changes.html">https://solr.apache.org/docs/9_7_0/changes/Changes.html</a>.</p>



<p>Apache Solr <strong>9.7</strong> can be downloaded from <a href="https://dlcdn.apache.org/solr/">https://dlcdn.apache.org/solr/</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2024/09/10/apache-solr-9-7-0/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Apache Solr 9.6.0</title>
		<link>https://solr.pl/en/2024/05/05/apache-solr-9-6-0/</link>
					<comments>https://solr.pl/en/2024/05/05/apache-solr-9-6-0/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Sun, 05 May 2024 07:55:31 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1326</guid>

					<description><![CDATA[It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered 9.6. Some of the changes introduced in Solr]]></description>
										<content:encoded><![CDATA[
<p>It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered <strong>9.6</strong>.</p>



<span id="more-1326"></span>



<p>Some of the changes introduced in Solr <strong>9.6</strong>:</p>



<ul class="wp-block-list">
<li>SolrJ offers an improved asynchronous API and exposes a lightweight client &#8211; <em>HttpJdkSolrClient</em>.</li>



<li>The distributed IDF calculation can now be disabled via a request parameter.</li>



<li>It is now possible to remove inactive shards that are the result of a shard split.</li>



<li>Solr was upgraded to Lucene 9.10.0.</li>
</ul>



<p>We encourage you to read the whole list of changes at: <a href="https://solr.apache.org/docs/9_6_0/changes/Changes.html">https://solr.apache.org/docs/9_6_0/changes/Changes.html</a>.</p>



<p>Apache Solr <strong>9.6</strong> can be downloaded from <a href="https://dlcdn.apache.org/solr/">https://dlcdn.apache.org/solr/</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2024/05/05/apache-solr-9-6-0/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Apache Solr 9.5.0</title>
		<link>https://solr.pl/en/2024/02/21/apache-solr-9-5-0/</link>
					<comments>https://solr.pl/en/2024/02/21/apache-solr-9-5-0/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Wed, 21 Feb 2024 19:18:34 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1318</guid>

					<description><![CDATA[It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered 9.5. Some of the changes introduced in Solr]]></description>
										<content:encoded><![CDATA[
<p>It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 9.x branch and it is numbered <strong>9.5</strong>.</p>



<span id="more-1318"></span>



<p>Some of the changes introduced in Solr <strong>9.5</strong>:</p>



<ul class="wp-block-list">
<li>Support for node-level memory and CPU circuit breakers. </li>



<li>Collection and replica properties may now be used as property substitutions in configuration files.</li>



<li>Starting with version 9.5, Solr releases produce OpenAPI specifications covering many of the v2 APIs.</li>



<li>Quality of life improvements to tracing support. </li>
</ul>



<p>We encourage you to read the whole list of changes at: <a href="https://solr.apache.org/docs/9_5_0/changes/Changes.html">https://solr.apache.org/docs/9_5_0/changes/Changes.html</a>.</p>



<p>Apache Solr <strong>9.5</strong> can be downloaded from <a href="https://dlcdn.apache.org/solr/">https://dlcdn.apache.org/solr/</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2024/02/21/apache-solr-9-5-0/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Apache Solr 8.11.3</title>
		<link>https://solr.pl/en/2024/02/17/apache-solr-8-11-3/</link>
					<comments>https://solr.pl/en/2024/02/17/apache-solr-8-11-3/#respond</comments>
		
		<dc:creator><![CDATA[Rafał Kuć]]></dc:creator>
		<pubDate>Sat, 17 Feb 2024 08:00:29 +0000</pubDate>
				<category><![CDATA[Solr]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[solr]]></category>
		<guid isPermaLink="false">https://solr.pl/?p=1304</guid>

					<description><![CDATA[It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 8.x branch and it is numbered 8.11.3. The 8.11.3 is a bug fix release]]></description>
										<content:encoded><![CDATA[
<p>It is a pleasure to inform you that the new version of the Solr search server has been released. It is the next release from the 8.x branch and it is numbered <strong>8.11.3</strong>.</p>



<span id="more-1304"></span>



<p>Version 8.11.3 is a bug-fix release that also contains some security fixes. The changes include:</p>



<ul class="wp-block-list">
<li>The transaction log no longer grows indefinitely on TLOG replicas.</li>



<li>Using Schema and Config APIs no longer breaks file upload.</li>



<li>A HEAD request for managed resources no longer results in a 500 error.</li>
</ul>



<p>The full list of changes can be found <a href="https://solr.apache.org/docs/8_11_3/changes/Changes.html">here</a>.</p>



<p>You can download Solr 8.11.3 <a href="https://lucene.apache.org/solr/downloads.html">here</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://solr.pl/en/2024/02/17/apache-solr-8-11-3/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
