Solr 8.4.0 – Plugin Management

With the release of Solr 8.4.0 we got a new feature that helps us extend Solr: plugin management. Starting with that version we can install and remove plugins that are downloaded from external plugin repositories. Let’s see how it works.

Plugin Management

With Solr 8.4.0 we didn’t only get the script itself but also a whole set of related changes, such as package management, class loader isolation, an artifact read and write API and so on.

Let’s start from the beginning though. By default Solr comes with package loading turned off, and to be able to use that feature we need to run Solr with the enable.packages property set to true:

$ bin/solr start -c -f -Denable.packages=true

Now we can start playing around with the packages.

Package Management Basics

Let’s try using the bin/solr script and type in the following command:

$ bin/solr package 

As a result we will get the following response:

Found 1 Solr nodes:
 
Solr process 20949 running on port 8983
Package Manager

./solr package add-repo <repository-name> <repository-url>
Add a repository to Solr.

./solr package install <package-name>[:<version>]
Install a package into Solr. This copies over the artifacts from the repository into Solr's internal package store and sets up classloader for this package to be used.

./solr package deploy <package-name>[:<version>] [-y] [--update] -collections <comma-separated-collections> [-p param1=value1 -p param2=value2 …
Bootstraps a previously installed package into the specified collections. It the package accepts parameters for its setup commands, they can be specified (as per package documentation).

./solr package list-installed
Print a list of packages installed in Solr.

./solr package list-available
Print a list of packages available in the repositories.

./solr package list-deployed -c <collection>
Print a list of packages deployed on a given collection.

./solr package list-deployed <package-name>
Print a list of collections on which a given package has been deployed.

./solr package undeploy <package-name> -collections <comma-separated-collections>
Undeploys a package from specified collection(s)

Note: (a) Please add '-solrUrl http://host:port' parameter if needed (usually on Windows).
      (b) Please make sure that all Solr nodes are started with '-Denable.packages=true' parameter.

As you can see, the available commands cover the typical needs. The tool also verifies that a Solr node is running, which is what allows it to check the installed packages. Let’s look into that.

Adding New Repository

To add a new repository we need to provide its name and the URL under which it is available. Solr expects at least one file to be present at that location: the repository.json file. It can look as follows (taken from the Solr source code):

[
  {
    "name": "question-answer",
    "description": "A natural language question answering plugin",
    "versions": [
      {
        "version": "1.0.0",
        "date": "2019-01-01",
        "artifacts": [
          {
            "url": "qarh-1.0.jar",
            "sig": "C9UWKkucmY3UNzqn0VLneVMe9kCbJjw7Urc76vGenoRwp32xvNn5ZIGZ7G34xZP7cVjqn/ltDlLWBZ/C3eAtuw=="
          }
        ],
        "manifest": {
          "version-constraint": "8 - 9",
          "plugins": [
            {
              "name": "request-handler",
              "setup-command": {
                "path": "/api/collections/${collection}/config",
                "payload": {"add-requesthandler": {"name": "${RH-HANDLER-PATH}", "class": "question-answer:fullstory.QARequestHandler"}},
                "method": "POST"
              },
              "uninstall-command": {
                "path": "/api/collections/${collection}/config",
                "payload": {"delete-requesthandler": "${RH-HANDLER-PATH}"},
                "method": "POST"
              },
              "verify-command": {
                "path": "/api/collections/${collection}/config/requestHandler?componentName=${RH-HANDLER-PATH}&meta=true",
                "method": "GET",
                "condition": "$['config'].['requestHandler'].['${RH-HANDLER-PATH}'].['_packageinfo_'].['version']",
                "expected": "${package-version}"
              }
            }
          ],
          "parameter-defaults": {
            "RH-HANDLER-PATH": "/mypath"
          }
        }
      }
    ]
  }
]
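
To make the nesting of that file easier to follow, here is a minimal Python sketch that walks a trimmed copy of the example and collects one summary per package version. It only illustrates the structure. It is not how Solr itself reads the file, and the function name summarize_repository is made up for this example.

```python
import json

# A trimmed copy of the repository.json shown above (only the fields used here).
REPOSITORY_JSON = """
[
  {
    "name": "question-answer",
    "description": "A natural language question answering plugin",
    "versions": [
      {
        "version": "1.0.0",
        "date": "2019-01-01",
        "artifacts": [{"url": "qarh-1.0.jar"}],
        "manifest": {"version-constraint": "8 - 9"}
      }
    ]
  }
]
"""


def summarize_repository(text):
    """Return one summary dict per package version found in the repository file."""
    summaries = []
    for package in json.loads(text):
        for release in package["versions"]:
            summaries.append({
                "package": package["name"],
                "version": release["version"],
                # Artifact URLs are relative to the repository URL.
                "artifacts": [artifact["url"] for artifact in release["artifacts"]],
                # Solr versions the package declares itself compatible with.
                "solr_versions": release["manifest"]["version-constraint"],
            })
    return summaries


for entry in summarize_repository(REPOSITORY_JSON):
    print(entry)
```

A repository is simply a list of packages, each package has a list of versions, and each version carries its artifacts and a manifest describing how to set the plugin up.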

OK, let’s try adding the repository. For that purpose I created a new repository at http://repo.solr.pl using the files that I found in the Solr source code, and I used the following command to add it to my local SolrCloud cluster:

$ bin/solr package add-repo solrpl http://repo.solr.pl

In my case the operation was successful and I got the following response on my console:

Added repository: solrpl

Warning: in the above example I used a repository that I could access without SSL. Please don’t do that in your environment. It is not a good practice, because it is susceptible to man-in-the-middle attacks during which the downloaded files can be replaced.

Installing and Removing Packages

After adding the repository we can list the packages that we can install and use. To list such packages we should run the following command:

$ bin/solr package list-available

The response to the above command should be similar to the following one:

Available packages:
-----
question-answer 		A natural language question answering plugin
	Version: 1.0.0

One last step before installing the package itself is adding a public key to Solr, one that Solr will use to verify the packages downloaded from the added repository. Where to look for such a key? It should be provided to you, and I assume that some repositories will also make one available. Once we have the key we can add it to Solr by using the following command:

$ bin/solr package add-key publickey.der

After all these steps we can finally start installing the packages. It is as simple as running the following command:

$ bin/solr package install question-answer:1.0.0

The response that I got from Solr was as follows:

Posting manifest...
Posting artifacts...
Executing Package API to register this package...
Response: {"responseHeader":{
    "status":0,
    "QTime":66}}
question-answer installed.

This means that our package is now ready to be used. We just need to say which collection or collections should be allowed to use that package. We do that by sending the deploy command to Solr, for example by using the following command (I created the test collection before running this command):

$ bin/solr package deploy question-answer:1.0.0 -collections test

The response was as follows:

Executing {"add-requesthandler":{"name":"/mypath","class":"question-answer:fullstory.QARequestHandler"}} for path:/api/collections/test/config
Execute this command (y/n):
y
Executing http://localhost:8983/api/collections/test/config/requestHandler?componentName=/mypath&meta=true for collection:test
{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "config":{"requestHandler":{"/mypath":{
        "name":"/mypath",
        "class":"question-answer:fullstory.QARequestHandler",
        "_packageinfo_":{
          "package":"question-answer",
          "version":"1.0.0",
          "files":["/package/question-answer/1.0.0/question-answer-request-handler-1.0.jar"],
          "manifest":"/package/question-answer/1.0.0/manifest.json",
          "manifestSHA512":"a91ab5a2c5abd53f0f72c256592c2be8b667cecb8226ac054aeed4d28aac9d743311442f2d58539bb83663a19bd1efb310aaadfd77bea458f3d475161721a114"}}}}}

Actual: 1.0.0, expected: 1.0.0
Deployed on [test] and verified package: question-answer, version: 1.0.0
Deployment successful
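
The output above shows the ${collection}, ${RH-HANDLER-PATH} and ${package-version} placeholders from the manifest replaced with test, /mypath and 1.0.0, followed by a comparison of the actual and expected version. The following Python sketch mimics those two steps on plain dictionaries. It is an illustration under my own assumptions, not Solr's implementation: the resolve and lookup helpers are hypothetical, and lookup only understands the simple $['a'].['b'] form used in this manifest rather than full JSONPath.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(value, params):
    """Recursively replace ${name} placeholders in strings, dicts and lists."""
    if isinstance(value, str):
        # A missing parameter raises KeyError instead of being silently skipped.
        return PLACEHOLDER.sub(lambda match: params[match.group(1)], value)
    if isinstance(value, dict):
        return {key: resolve(item, params) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve(item, params) for item in value]
    return value


def lookup(document, condition):
    """Follow a condition such as $['a'].['b'] through nested dicts."""
    node = document
    for key in re.findall(r"\['([^']+)'\]", condition):
        node = node[key]
    return node


# Values taken from the manifest defaults and the deploy command in the article.
params = {"collection": "test", "RH-HANDLER-PATH": "/mypath", "package-version": "1.0.0"}

setup_command = {
    "path": "/api/collections/${collection}/config",
    "payload": {
        "add-requesthandler": {
            "name": "${RH-HANDLER-PATH}",
            "class": "question-answer:fullstory.QARequestHandler",
        }
    },
    "method": "POST",
}

verify_command = {
    "condition": "$['config'].['requestHandler'].['${RH-HANDLER-PATH}'].['_packageinfo_'].['version']",
    "expected": "${package-version}",
}

# Trimmed version of the response printed during deployment.
response = {"config": {"requestHandler": {"/mypath": {"_packageinfo_": {"version": "1.0.0"}}}}}

resolved_setup = resolve(setup_command, params)
resolved_verify = resolve(verify_command, params)
actual = lookup(response, resolved_verify["condition"])

print(resolved_setup["path"])
print("Actual: %s, expected: %s" % (actual, resolved_verify["expected"]))
```

Values passed with -p on the command line would take the place of the parameter defaults in params.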

If everything went well and we confirmed that we want to add the new package to our collection, we are ready 🙂 We can start using it, for example by using the Solr API to create new handlers if the plugin adds them.

When we no longer need a package we just need to undeploy it. For example, to remove the previously deployed package we run the following command:

$ bin/solr package undeploy question-answer -collections test

In this case the response would be as follows:

Executing {"delete-requesthandler":"/mypath"} for path:/api/collections/test/config

How It Really Works

The heart of the implementation is the class path and the isolation of plugins from the core Solr classes. The plugin mechanism assumes that any change to the files on the Solr classpath requires a restart. The remaining files can be loaded dynamically and are bound to the configuration stored in ZooKeeper.

The basis of the mechanism is the so-called Package Store. It is a distributed file system which keeps its data on each Solr node in the $SOLR_HOME/filestore directory, and each file is described by metadata written in a JSON file. That metadata includes the file's checksum for verification purposes.
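
To illustrate the idea of a checksum stored next to a file, here is a small Python sketch that computes a SHA-512 digest and later checks the content against it. SHA-512 is chosen because the deployment output earlier contains a manifestSHA512 field. The metadata layout with a single "sha512" key and the helper names are assumptions made for this example, not the exact format Solr writes.

```python
import hashlib
import json


def sha512_hex(data: bytes) -> str:
    """Return the SHA-512 digest of the data as a hex string."""
    return hashlib.sha512(data).hexdigest()


def describe(data: bytes) -> str:
    """Build a metadata document (hypothetical layout) for the given content."""
    return json.dumps({"sha512": sha512_hex(data)})


def is_intact(data: bytes, metadata_json: str) -> bool:
    """Check that the content still matches the checksum in its metadata."""
    return json.loads(metadata_json)["sha512"] == sha512_hex(data)


jar_bytes = b"pretend this is the content of a plugin jar"
metadata = describe(jar_bytes)

print(is_intact(jar_bytes, metadata))
print(is_intact(jar_bytes + b" tampered", metadata))
```

Any change to the file content, however small, produces a different digest, which is what makes the stored checksum useful for verification.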

On top of all of that we have an API allowing us not only to manage the whole packages, but also single files.

The API

Of course the bin/solr tool is not the only way of managing packages that Solr gives us. In addition to that we get an API that allows us to:

  • add files using the PUT HTTP method and the /api/cluster/files/{file_path} endpoint
  • retrieve files using the GET HTTP method and the /api/cluster/files/{file_path} endpoint
  • retrieve file metadata using the GET HTTP method and the /api/cluster/files/{file_path}?meta=true endpoint
  • retrieve the list of files available at a given path using the GET HTTP method and the /api/cluster/files/{directory_path} endpoint
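
The endpoints listed above can be sketched in Python with nothing but the standard library. The code below only builds the request objects and does not send them. The base URL assumes a local node on port 8983, the helper names are made up for this example, and the signing of uploaded files is left out entirely.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8983"  # assumption: a local Solr node


def files_url(path, meta=False):
    """Build the /api/cluster/files URL for a file or directory path."""
    url = BASE_URL + "/api/cluster/files/" + quote(path.lstrip("/"))
    return url + "?meta=true" if meta else url


def upload_request(path, content):
    """PUT request that would store the given bytes under the given path."""
    return Request(files_url(path), data=content, method="PUT",
                   headers={"Content-Type": "application/octet-stream"})


def download_request(path):
    """GET request for a file, or for a directory listing if path is a directory."""
    return Request(files_url(path), method="GET")


def metadata_request(path):
    """GET request for the metadata of a file."""
    return Request(files_url(path, meta=True), method="GET")


request = metadata_request("/test/solrpl/1.0.0/testsolrpl.jar")
print(request.get_method(), request.full_url)
```

Passing one of these objects to urllib.request.urlopen would perform the actual call against a running cluster.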

You should remember that adding a file to Solr is not only about sending it to Solr. You need to sign it using a key that will be available to Solr – we saw that already. The official Solr documentation has a very good example on how to do that: https://lucene.apache.org/solr/guide/8_5/package-manager-internals.html.

Of course managing files is not everything that the Solr API allows us to do. We can also list, add and remove packages and their versions:

  • GET on /api/cluster/package to retrieve the list of packages
  • POST on /api/cluster/package with the add command to add a package version
  • POST on /api/cluster/package with the delete command to remove a package version

For example to add a package to Solr we could use a command like this:

curl -XPOST 'http://localhost:8983/api/cluster/package' -H 'Content-type:application/json' -d  '{
 "add": {
  "package" : "testsolrpl",
  "version" : "1.0.0",
  "files" : [
   "/test/solrpl/1.0.0/testsolrpl.jar"
  ]
 }
}'
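
If the request body is generated by a script rather than typed by hand, building it from a dictionary avoids quoting mistakes. The Python sketch below produces the same JSON document as the command above. The function name is hypothetical, and the check that every file path starts with a slash is my own assumption based on the example path, not a documented rule.

```python
import json


def add_package_payload(package, version, files):
    """Return the JSON body that registers a package version."""
    for path in files:
        # Assumption: package store paths are absolute, as in the article's example.
        if not path.startswith("/"):
            raise ValueError("file path should start with '/': " + path)
    body = {"add": {"package": package, "version": version, "files": list(files)}}
    return json.dumps(body, indent=1)


print(add_package_payload("testsolrpl", "1.0.0", ["/test/solrpl/1.0.0/testsolrpl.jar"]))
```

The files listed in the body have to be present in the package store before the package is registered.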

Security

Being flexible, being able to hot-deploy and to install Solr extensions without bringing the whole cluster down comes with limitations and security threats. Because of that, remember not to add package repositories that you don’t know. Such repositories can be dangerous and can result in downloading and installing malicious code. The second thing to remember is that you shouldn’t add repositories that do not use SSL. Adding such a repository exposes you to a man-in-the-middle attack during which the files can be replaced on the fly, leading to the installation of malicious code. And above all, remember to keep your Solr secure no matter whether you use package management or not.

Summary

Being able to install Solr extensions without manually downloading them to each node, restarting the nodes and so on is very nice and tempting, especially for those of us who use such extensions. However, please remember about security and the limitations of the mechanism. If we are cautious we will have a flexible way of extending Solr. Have fun 🙂
