In the article on how to import data (http://solr.pl/2010/09/06/solr-data-indexing-for-fun-and-profit/?lang=en) I mentioned the Data Import Handler (DIH). The main advantages of this import method are that it needs no additional software development and that a data source can be integrated quickly. The second advantage, however, requires skill and practice. In this entry I’ll show you the basics of integrating DIH with an SQL data source.
Solr
Quick look – IndexSorter
At the Apache Lucene EuroCon 2010 conference, which took place in May this year, Andrzej Białecki talked in his presentation about how to obtain satisfactory search results when using early termination search techniques. Unfortunately, the tool he mentioned was not available in Solr, but that has changed.
Quick look – Solritas
While following the Solr mailing lists you can spot a piece of functionality called Solritas. Sounds strange? What kind of functionality is it? How can we use it? For the answers to these questions, I invite you to read the rest of the entry.
Quick look – FieldCollapsing
FieldCollapsing, or in other words grouping of search results, has just been committed to the SVN repository. I decided to take a look at this functionality and see how it works.
6 sins of solrconfig.xml modifications
The solrconfig.xml file is another file that defines Solr’s behavior. Unlike schema.xml, which describes the structure of the index, solrconfig.xml determines the functionality available in Solr. Just as with schema.xml, we can distinguish a number of standard mistakes made by those who implement Solr, and I’m not talking only about people who have little experience with it. To learn about some of those mistakes, I invite you to read the following entry.
Solr: data indexing for fun and profit
Solr is not very friendly to novice users. Preparing a good schema file requires some experience. Assuming that we have prepared the configuration files, what remains is to feed our data to the search server and make sure it can be kept up to date.
5 sins of schema.xml modifications
I made a promise and here it is: the entry on the most common mistakes made when designing a Solr index, that is, when you create or modify the schema.xml file for your system implementation. Feel free to read on 😉
The scope of Solr faceting
Faceting is one of the ways to categorize the content found in the process of information retrieval. In the case of Solr, it divides a set of documents on the basis of certain criteria: the content of individual fields, queries, or ranges and dates. In today’s entry I will try to outline what the faceting mechanism can do, both what is currently available in Solr 1.4.1 and what will be available in the future.
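As a rough illustration of the first two criteria (field content and queries), here is a minimal Python sketch that only builds the query string for a faceted request. The `facet`, `facet.field` and `facet.query` parameters are standard Solr request parameters; the helper name `build_facet_params` and the field names `category` and `price` are hypothetical and chosen only for the example. Date faceting is not covered here.

```python
from urllib.parse import urlencode


def build_facet_params(query, fields=(), queries=()):
    """Build a Solr query string that asks for facet counts.

    Hypothetical helper: it only assembles request parameters,
    it does not contact a Solr server.
    """
    params = [
        ("q", query),
        ("rows", "0"),       # we only want the facet counts, not documents
        ("facet", "true"),   # switch faceting on
    ]
    # One facet.field parameter per field whose values should be counted.
    params += [("facet.field", field) for field in fields]
    # One facet.query parameter per arbitrary query to be counted.
    params += [("facet.query", facet_query) for facet_query in queries]
    return urlencode(params)


# Field names below are assumptions made for the sake of the example.
query_string = build_facet_params(
    "*:*",
    fields=["category"],
    queries=["price:[0 TO 100]", "price:[100 TO *]"],
)
print(query_string)
```

The resulting string would be appended to the select handler URL of your own Solr instance; repeating `facet.field` or `facet.query` is how several facets are requested in a single call.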
What is schema.xml?
One of the configuration files that describe every Solr implementation is the schema.xml file. It describes one of the most important parts of the implementation: the structure of the data index. The information contained in this file allows you to control how Solr behaves when indexing data or when handling queries. Schema.xml is not only the structure of the index itself; it also holds detailed information about data types, which have a large influence on Solr’s behavior and are usually neglected. This entry will try to bring some insight into schema.xml.
6 deadly sins in the context of query
In my work with Lucene and Solr I have seen all kinds of queries. With Lucene, the developer usually knows what he or she wants to achieve and uses a more or less optimal solution, but with Solr this is not always the case. Solr is a product that can, in theory, be used by anyone: a person who knows Java, a person without broad, specialized technical knowledge, or a programmer. That is precisely why Solr is easy to run and use, at least when it comes to simple functionality. I suppose that is also why few people bother to read the Solr wiki or at least follow the mailing list. As a result, sooner or later people make mistakes. Those mistakes arise from various shortcomings: lack of knowledge about Solr, lack of skills, lack of experience, or simply lack of time and tight deadlines. Today I would like to show some major mistakes made when submitting queries to Solr and how to avoid them.