Today the Apache Lucene and Solr PMC announced the release of the 4.0 alpha version of the Apache Lucene library and the Apache Solr search server. Compared to 3.6, this release introduces some major changes, which you can read about in the rest of this post.
Some of the changes in version 4.0 alpha compared to 3.6:
- Lucene
- The Similarity calculation implementation has been changed. In addition, new Similarity models have been introduced, for example BM25.
- When using multiple indexing threads, IndexWriter can now flush to different segments, resulting in a substantial performance improvement.
- The Codec API has been introduced, which lets you choose (or implement your own) method of writing information to the index.
- FuzzyQuery performance has been greatly improved – from 100 to 200 times faster than the previous implementation.
- A new SpellChecker implementation was introduced: DirectSpellChecker, which doesn’t require its own index.
- Index statistics were added, which expose information such as the number of documents with a posting for a given field.
- A new query type was introduced: AutomatonQuery, which returns all documents containing at least one term accepted by a finite-state automaton.
- And many, many more…
- Solr
- Solr now contains SolrCloud, which enables distributed indexing and searching with Apache Solr. More information can be found on the following web sites: http://wiki.apache.org/solr/SolrCloud and http://blog.sematext.com/2012/02/01/solrcloud-distributed-realtime-search/
- A transaction log has been introduced, which ensures that you’ll never lose your indexed documents.
- Real-time Get has been introduced, which lets you retrieve documents without committing or opening a new Searcher (more information).
- DirectSolrSpellChecker has been introduced – a new SpellChecker that doesn’t require a separate index (more information).
- New admin GUI with support for SolrCloud.
- Ability to change fields of an existing document without reindexing the whole document, the so-called atomic updates.
- Ability to alter fields during query (more information).
- And many more…
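To give a feel for what a Similarity model like the BM25 mentioned in the Lucene list computes, here is a minimal sketch of the textbook BM25 formula in plain Java. It does not use the Lucene API and is not Lucene's implementation; the class name `Bm25Sketch`, the method names, and the `K1` and `B` values (common textbook defaults) are assumptions made for illustration only.

```java
public final class Bm25Sketch {
    // Common textbook defaults; assumptions, not values taken from the release notes.
    public static final double K1 = 1.2;
    public static final double B = 0.75;

    /** Inverse document frequency; the 1.0 inside the log keeps the value positive. */
    public static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** Score contribution of a single query term for a single document. */
    public static double termScore(double termFreq, double docLen, double avgDocLen,
                                   long docCount, long docFreq) {
        // Documents longer than average are penalized, shorter ones are boosted.
        double lengthNorm = 1.0 - B + B * (docLen / avgDocLen);
        // Term frequency saturates: repeated occurrences add less and less.
        double tfPart = termFreq * (K1 + 1.0) / (termFreq + K1 * lengthNorm);
        return idf(docCount, docFreq) * tfPart;
    }

    public static void main(String[] args) {
        long docCount = 1000;   // hypothetical index size
        long docFreq = 10;      // hypothetical number of documents containing the term
        double avgDocLen = 100; // hypothetical average field length in terms

        System.out.println("tf=1, average length: " + termScore(1, 100, avgDocLen, docCount, docFreq));
        System.out.println("tf=5, average length: " + termScore(5, 100, avgDocLen, docCount, docFreq));
        System.out.println("tf=5, long document:  " + termScore(5, 400, avgDocLen, docCount, docFreq));
    }
}
```

The two things worth noticing are the saturation of term frequency (the contribution of a term can never exceed `idf * (K1 + 1)`) and the length normalization controlled by `B`. In Lucene 4.0 itself you would pick a Similarity implementation rather than write this by hand.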
The full list of changes in Apache Lucene 4.0 alpha is available at: http://wiki.apache.org/lucene-java/ReleaseNote40alpha. The full list of changes in Apache Solr 4.0 alpha can be found at: http://wiki.apache.org/solr/ReleaseNote40alpha.
The Apache Lucene 4.0 alpha library can be downloaded from: http://www.apache.org/dyn/closer.cgi/lucene/java/. Apache Solr 4.0 alpha can be downloaded from: http://www.apache.org/dyn/closer.cgi/lucene/solr/. Please remember that the mirrors are just starting to update, so not all of them will contain the 4.0 alpha version of Lucene and Solr yet.