Solr.pl – Page 27 – All things to be found – Blog related to Apache Solr & Lucene projects

Waiting for 4.0: SOLR-2272 – Solr and JOIN functionality

Rafał Kuć | Solr | 21 February 2011 | 11 November 2020 | 2272, join, solr | 0 Comment

Recently, the functionality described in the SOLR-2272 Jira ticket caught my attention: an SQL-style JOIN implemented in Solr. In today’s entry I take a look at it.
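As a rough illustration of what such a join looks like from the client side, the sketch below builds a request URL that uses the `{!join from=... to=...}` query parser syntax. The base URL and the field names `parent_id` and `id` are assumptions made up for this example, not taken from the ticket or the post.

```python
from urllib.parse import urlencode


def build_join_query(from_field, to_field, inner_query,
                     base_url="http://localhost:8983/solr/select"):
    """Build a select URL whose q parameter uses the join query parser.

    base_url and all field names are hypothetical placeholders.
    """
    # Documents matching inner_query are collected, their from_field values
    # are read, and documents whose to_field holds one of them are returned.
    q = "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)
    return base_url + "?" + urlencode({"q": q})


if __name__ == "__main__":
    print(build_join_query("parent_id", "id", "color:red"))
```

The URL is only built, not sent, so the sketch runs without a Solr instance.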

Index – delete or update?

Rafał Kuć | Solr | 16 February 2011 | 11 November 2020 | delete, index, solr | 0 Comment

From time to time, while working with Solr, a problem comes up: how to update the structure of a Solr index. The reasons for such changes vary (new functional requirements, optimization, or anything else) and are not important here. What is important is the question that arises: should we remove the index, or simply change the structure and do a full reindexing? Contrary to appearances, the answer depends on the changes we made to the structure of the index.

“Car sale” application – WordDelimiterFilter and PatternReplaceFilter, helping to improve search results (part 2)

Rafał Kuć | Solr | 14 February 2011 | 11 November 2020 | howto, schema.xml | 0 Comment

In the first part of our “Car sale” application series we created a standard index structure by configuring the schema.xml file. It didn’t take long to hear the first complaints from the website’s users about this configuration. Why don’t I get any search results when I enter the phrase “audi a”? I would like to see announcements with “Audi A6” and “Audi A8”, for example. I entered the phrase “Honda crv” and got 0 results; “Suzuki maruti” returned none either. Are there no related offers in the announcement database? There are! But the current configuration of the searchable field type (field “content”, type “text”) does not allow us to find those offers with the queries we entered. That is why WordDelimiterFilter and PatternReplaceFilter need to enter the battlefield.
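To make the “audi a” complaint concrete, here is a small sketch that only imitates one thing a word-delimiter style filter does: splitting tokens at letter/digit boundaries. It is not Solr’s implementation, and the function names are invented for this illustration.

```python
import re


def whitespace_tokens(text):
    """Lowercase and split on whitespace only, so "A6" stays one token."""
    return text.lower().split()


def split_word_parts(text):
    """Imitation of word-delimiter splitting (an assumption, not Solr code):
    runs of letters and runs of digits become separate lowercase tokens."""
    return [part.lower() for part in re.findall(r"[^\W\d_]+|\d+", text)]


def matches(query_tokens, doc_tokens):
    """True when every query token occurs among the document tokens."""
    return all(token in doc_tokens for token in query_tokens)


if __name__ == "__main__":
    doc, query = "Audi A6", "audi a"
    # Whitespace tokens: "a" is not an indexed term, only "a6" is.
    print(matches(whitespace_tokens(query), whitespace_tokens(doc)))
    # Split tokens: "A6" becomes "a" and "6", so "a" can match.
    print(matches(split_word_parts(query), split_word_parts(doc)))
```

A query such as “Honda crv” is a different case: the indexed text has to be normalized or concatenated rather than split, which this sketch does not attempt.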

Optimization – filter cache

Rafał Kuć | Solr | 7 February 2011 | 11 November 2020 | cache, caching, filter, filter cache, filterCache, filtering, solr | 0 Comment

Today’s entry is dedicated to one of the cache types in Solr: the filter cache. I will try to explain what it does, how to configure it and how to use it efficiently.

“Car sale application” – schema.xml designing to gain what we really need (part 1)

Rafał Kuć | Solr | 31 January 2011 | 11 November 2020 | howto, schema.xml, solr | 0 Comment

One of Solr’s fundamental configuration files is schema.xml. It is a kind of connector between what we need and what Solr understands. If we want a search engine that gives us the results we really expect, it is very important to design the schema.xml configuration file properly.
We would like to introduce the first in a series of articles which will hopefully show how to design the schema.xml file and how to handle and modify all of its components.

CheckIndex for the rescue

Rafał Kuć | Lucene, Solr | 17 January 2011 | 11 November 2020 | check, check index, checkindex, index, lucene, rescue | 0 Comment

While using Lucene and Solr we are used to the very high reliability of these products. However, the day may come when Solr informs us that our index is corrupted and we need to do something about it. Is restoring the index from a backup or doing a full reindexing the only way to repair it? Not necessarily: there is hope in the form of the CheckIndex tool.

Optimization – query result window size

Rafał Kuć | Solr | 10 January 2011 | 11 November 2020 | cache, query, queryResultCache, queryResultWindowCache, result, size, window | 0 Comment

With this entry I would like to start a small series of articles describing elements of optimizing Solr instances. To begin with, I decided to describe the parameter that specifies the size of the data fetch window: the query result window size. Hopefully, this article will explain how to use this parameter and how to modify and adapt it to your needs.
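The arithmetic behind the window can be sketched as follows. This assumes, as I understand the behaviour, that the end of the requested range (start + rows) is rounded up to the next multiple of the window size and that this many documents are collected and cached; the function name is made up for the example.

```python
def cached_window_end(start, rows, window_size):
    """Return how many top documents would be collected for one request.

    Assumption: the last requested position (start + rows) is rounded up
    to a multiple of window_size.
    """
    last_requested = start + rows
    return ((last_requested - 1) // window_size + 1) * window_size


if __name__ == "__main__":
    # Asking for documents 10-19 with a window of 50 collects the top 50,
    # so the following pages can be served from the cached result.
    print(cached_window_end(start=10, rows=10, window_size=50))
    # A request that crosses the boundary needs the next multiple.
    print(cached_window_end(start=45, rows=10, window_size=50))
```

Under that assumption, a window size close to the number of results users typically page through keeps later pages in the cache without collecting far more documents than needed.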

Data Import Handler – removing data from index

Rafał Kuć | Solr | 3 January 2011 | 11 November 2020 | data import handler, databse, dih, integration | 0 Comment

The Solr wiki covers deleting data from an index during DIH incremental indexing only briefly, treating it as something that works much like updating records. I took the same shortcut in a previous article, all the more so because my example indexed Wikipedia data, which never needs deleting.

With sample data on albums and performers at hand, I decided to show my way of dealing with such cases. For simplicity and clarity, I assume that after the first import the data can only shrink.

Data Import Handler – sharding

Marek Rogoziński | Solr | 27 December 2010 | 11 November 2020 | data import handler, database, dih, import, sharding | 0 Comment

One of our readers (greetings!) reported a problem with getting DIH to work together with the sharding mechanism. In my opinion, the Solr project wiki does discuss the solution to this issue, but only indirectly and in passing.

Wildcard queries and how Solr handles them

Rafał Kuć | Solr | 20 December 2010 | 10 November 2020 | wildcard | 0 Comment

One of our readers reported a very interesting problem, which can be summarized as the following question: “Why doesn’t ReversedWildcardFilterFactory work with Polish letters?”. This entry will attempt to answer this question.
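For readers unfamiliar with the idea behind reversed wildcards, the sketch below shows the general technique only: tokens are indexed in reversed form so that a leading-wildcard query can be answered as a cheap prefix match. It is a simplified illustration with invented function names, not Solr’s filter, and it does not reproduce the problem with Polish letters that the reader reported.

```python
def reverse_token(token):
    """Reverse a token character by character (by Unicode code point)."""
    return token[::-1]


def find_by_leading_wildcard(pattern, reversed_index):
    """Answer a pattern such as "*ół" against reversed tokens.

    The suffix after the leading "*" is reversed and used as a prefix.
    """
    prefix = reverse_token(pattern.lstrip("*"))
    return [reverse_token(t) for t in reversed_index if t.startswith(prefix)]


if __name__ == "__main__":
    words = ["żółw", "stół", "kot"]
    reversed_index = [reverse_token(w) for w in words]
    # "*ół" becomes the prefix "łó" on the reversed tokens.
    print(find_by_leading_wildcard("*ół", reversed_index))
```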