Running Solr on Kubernetes

Why Kubernetes?

  • Easy to deploy and manage
  • Resilient against failures (if configured properly)
  • Autoscaling can significantly reduce costs
  • Cloud-provider-agnostic management

Why is this hard? Why SearchStack?

  • Managing the storage layer for any stateful application on Kubernetes is hard.

  • Every stateful application (such as a database or search engine) requires careful administration that adheres to best practices.

  • SearchStack aims to make deploying and managing production-grade search applications easy.

Container-local storage (no persistent volume):

Pros: Simple
Cons: No fault tolerance; data is lost when the pod is rescheduled
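
For illustration, container-local storage is typically declared as an emptyDir volume in the pod spec; the data disappears when the pod is rescheduled. The volume name, image tag, and mount path below are placeholders, not values from the SearchStack manifests:

    # Hypothetical excerpt of a pod spec (names and paths are examples only)
    spec:
      containers:
        - name: solr
          image: solr:8
          volumeMounts:
            - name: solr-data
              mountPath: /var/solr   # example data directory
      volumes:
        - name: solr-data
          emptyDir: {}               # lives and dies with the pod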

Network-attached storage (e.g. EBS/EFS on AWS or PD on GCP):

Pros: Cheap; Support for Snapshots; Simple
Cons: Slower I/O
How: Neither EBS nor PD supports ReadWriteMany. To leverage them, we can either (1) manually attach an EBS volume/PD to the nodes and use HostPath with ReadWriteMany, or (2) set up an NFS server (with Rook) backed by the EBS volume/PD and use it in ReadWriteMany mode. The example below demonstrates the latter.
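
As a rough sketch, once the Rook-backed NFS share exists, pods consume it through a ReadWriteMany PersistentVolumeClaim. The storageClassName below is a placeholder; the actual class is defined by rook-nfs-gcloud.yml in the SearchStack repository:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: solr-data
    spec:
      accessModes:
        - ReadWriteMany              # many pods may mount the same NFS share
      storageClassName: rook-nfs     # placeholder; use the class from rook-nfs-gcloud.yml
      resources:
        requests:
          storage: 100Gi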

Local SSD storage:

Pros: Fast
Cons: Expensive

Notes:

If every node has an additional SSD attached as a block device, we can set up a distributed filesystem across them using Ceph (with Rook). This is a highly performant solution, but production maintenance requires DevOps engineers to learn some Ceph tooling. We’ll add a guide for this solution shortly.
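
For orientation only (not a substitute for the upcoming guide), the core of such a setup is Rook’s CephCluster resource, which tells Rook to build a Ceph cluster out of the raw block devices attached to the nodes. The field values below are illustrative; consult the Rook documentation for a production configuration:

    apiVersion: ceph.rook.io/v1
    kind: CephCluster
    metadata:
      name: rook-ceph
      namespace: rook-ceph
    spec:
      cephVersion:
        image: ceph/ceph:v14       # example Ceph release
      dataDirHostPath: /var/lib/rook
      mon:
        count: 3                   # three monitors for quorum
      storage:
        useAllNodes: true          # every node contributes storage
        useAllDevices: true        # consume the attached SSD block devices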

Google Cloud Platform

  1. Assuming you have an active Google Cloud Platform account, the first step is to create a project (or use an existing project).
  2. Next, open the Kubernetes Engine > Clusters page in the console and hit the Create cluster button. [Screenshot: gcp-001-kubernetes-cluster.png]
  3. Use a “Standard cluster” and edit the following:
    Name: my-search-stack
    Master version: 1.13.6-gke.5
    [Screenshot: gcp-002-standard-cluster]
  4. Scrolling down further, edit the node pools to choose the desired memory and vCPU settings.
    Note: This depends on how resource-intensive your application is going to be. Having at least 3 nodes, each with at least 4 vCPUs and 16 GB of memory, should suffice for most medium to large scale applications. If you want to leverage autoscaling, choose the number of nodes you wish to autoscale up to at peak resource utilization.
    [Screenshot: gcp-003-node-pools]
  5. Choose Ubuntu as the node image. [Screenshot: gcp-004-save] Note: You can optionally add Local SSD disks, but for this quickstart we’ll use a Standard Persistent Disk, so skip this for now (leave it at 0).
  6. Save the node pools and hit the Create button to bootstrap the cluster. (If you prefer the CLI, an equivalent gcloud command is sketched after this list.)
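
If you prefer the command line over the console, roughly the same cluster can be created with a single gcloud command. This is a sketch: the machine type (n1-standard-4 gives 4 vCPUs / 15 GB, close to the sizing note in step 4) and the autoscaling bounds are illustrative, not prescribed by SearchStack:

    $ gcloud container clusters create my-search-stack \
        --zone us-central1-a \
        --cluster-version 1.13.6-gke.5 \
        --image-type UBUNTU \
        --machine-type n1-standard-4 \
        --num-nodes 3 \
        --enable-autoscaling --min-nodes 3 --max-nodes 6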

Deploy NFS storage, ZooKeeper, and Solr

  1. Install the gcloud SDK and kubectl on your laptop, and do an OAuth-based login with the gcloud CLI.
  2. Setup kubectl config using your project ID:

    $ gcloud container clusters get-credentials my-search-stack --zone us-central1-a --project <project-id>
    
  3. Create a Persistent Disk:

    $ gcloud compute disks create solr-pd --zone us-central1-a --size=200GB
    
  4. Check that all the nodes are Ready, then clone the SearchStack repository and apply the manifests for NFS storage, ZooKeeper, and Solr:

    $ kubectl get nodes
    
    $ git clone https://github.com/searchstackio/searchstack
    
    $ cd searchstack
    
    $ kubectl apply -f rook-nfs-gcloud.yml
    
    $ kubectl apply -f zk.yml -f solr.yml
    
  5. Wait until all the pods are Running:

    $ kubectl get pods --all-namespaces
    
  6. If you see all your pods running, congratulations! You can reach the Solr Admin UI through a port-forward:

    $ kubectl port-forward pod/solr-0 8983:8983
    
  7. Open http://localhost:8983/solr in your browser. A quick command-line smoke test is sketched after this list.
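
With the port-forward from step 6 still running, you can also smoke-test Solr from the command line using the Collections API. The collection name, shard count, and replication factor below are just examples; the CREATE call falls back to Solr’s built-in _default configset:

    # Show the overall cluster state (live nodes, collections)
    $ curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS"

    # Create a small test collection
    $ curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=test&numShards=1&replicationFactor=1"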

Next Steps

  1. Storage solution using Ceph/EdgeFS (Rook)
  2. Configuring autoscaling and collection management
  3. Backups, snapshots, cross-cluster replication
  4. Rolling vs Canary deployments
  5. Distributed Tracing using Istio
  6. Metrics and Monitoring using Prometheus / Grafana / Datadog
  7. Configurable pipelines for indexing into and querying Apache Solr
  8. Integration with machine learning frameworks