By: Thu Nguyen

Read Time: 13 min

I spoke at Container World 2019 in Santa Clara and shared what LogDNA has learned over the years about scaling Elasticsearch on Kubernetes.

Here are some highlights from the talk; you can also find the slide deck below.

First, the basics…

What is Elasticsearch (ES) and why would I use it?

Elasticsearch is the “E” in the popular ELK stack and allows easy searching of unstructured data. It is a distributed full-text search engine that is queryable through a JSON API and works well for logging. It scales relatively easily because it handles clustering and syncing across nodes, and new nodes can be added without much effort. It’s a popular choice when you’re in the market for an out-of-the-box solution that’s easy to get started with.

Why would you want to run Elasticsearch (ES) on Kubernetes (k8s)?

Kubernetes is an open source container orchestration platform originally developed by Google. It schedules your workloads onto the available resources. The cloud providers also have integrations to autoscale resources such as memory, so you don’t have to do it by hand. Kubernetes allows for configuration as code, and static Docker images enforce consistent pod behavior across your infrastructure. Or maybe you’ve been watching the Kubernetes hype train and want to jump on board.

Why does LogDNA use ES and k8s?

At LogDNA, we have made many modifications to the Elasticsearch interface and we’ve built in-house versions of the L (Logstash) and K (Kibana) of the ELK stack for better performance.

We needed a consistent way to deploy our software across varying infrastructures. We run our application on both cloud and on-premise and we are agnostic to wherever our customers want to run LogDNA, whether it’s Amazon, Azure, a data center in Las Vegas, a barn in Russia, anywhere.

We use Kubernetes to help us better automate versioning, CI/CD, and maintenance. We run ES on k8s at scale.

Running Elasticsearch on Kubernetes is not straightforward

These are a few of the steps involved in running ES on Kubernetes:

How LogDNA got started running ES on k8s

First, at LogDNA we started with a few sane defaults that we recommend:

We will dive deeper into:

  1. Statefulsets and Services yaml configurations (we need them for identity, disks, and networking)
  2. Basic but important cluster settings and a good starter index template. Index templates define how data is saved to an index, and knowing how to configure them has helped us increase performance by an order of magnitude.
  3. Deploy an ES cluster management GUI (cerebro) to help with troubleshooting
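On the identity point: a StatefulSet gives each pod a stable name of the form `<statefulset>-<ordinal>`, and when it is paired with a governing headless service each pod also gets a stable DNS name. The sketch below only illustrates that naming scheme. The names `es-master` and `logging` are hypothetical, and `cluster.local` is assumed as the cluster domain (it is the usual default, but it can be changed per cluster).

```python
def statefulset_pod_hosts(statefulset, service, namespace, replicas,
                          cluster_domain="cluster.local"):
    """List the stable DNS names Kubernetes assigns to StatefulSet pods.

    Pods are named <statefulset>-<ordinal>, and each one is reachable at
    <pod>.<service>.<namespace>.svc.<cluster_domain> when `service` is the
    StatefulSet's governing headless service.
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


if __name__ == "__main__":
    # Hypothetical three-master cluster in a "logging" namespace.
    for host in statefulset_pod_hosts("es-master", "es-master", "logging", 3):
        print(host)
```

Because these names survive pod restarts and rescheduling, they can be used anywhere a fixed list of node addresses is needed.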

1. Configuration Tips

Security context settings

2. Basic Cluster Settings

Service Discovery in Kubernetes

Once you have your pods, you’ll need to worry about service discovery: since ES is a distributed database, the pods need to talk to each other. The ES hot and cold nodes have a single load-balanced ClusterIP service endpoint for inserting and querying data.

ES masters are really important because they hold an election. For them to discover each other, you have to make sure you have a headless service (clusterIP: None) in front of them.

What this does is to allow you to list all the available IP addresses for the pods that are in the group. Instead of getting a load balanced endpoint, all the masters can discover each other.
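To make the discovery behavior concrete, here is a minimal Python sketch, assuming a headless service named `es-master` in a namespace called `logging` (both hypothetical). It builds the service manifest as a plain dict, which can be dumped to JSON for `kubectl apply -f`, and resolves a hostname the way a pod would. Port 9300 is Elasticsearch's default transport port; the labels and names are illustrative and not LogDNA's actual configuration.

```python
import json
import socket


def headless_service(name, namespace, app_label, port=9300):
    """Build a headless Service manifest (clusterIP: None) as a dict."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The string "None" is what makes the service headless: DNS
            # returns the pod IPs themselves instead of one virtual IP.
            "clusterIP": "None",
            "selector": {"app": app_label},
            "ports": [{"name": "transport", "port": port}],
        },
    }


def resolve_peers(hostname, port=9300):
    """Return the sorted, de-duplicated IPs that DNS reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    svc = headless_service("es-master", "logging", "es-master")
    print(json.dumps(svc, indent=2))
    # Inside the cluster, this would list the master pod IPs
    # (assuming the default "cluster.local" cluster domain):
    # resolve_peers("es-master.logging.svc.cluster.local")
```

With a regular ClusterIP service the same lookup would return a single virtual IP, which is fine for clients but gives the masters nothing to enumerate.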

2 important settings for clusterIP:None

ES Startup Settings

Here’s what we use:

Configuring an index template

What’s irritating is that index templates can’t be set ahead of time. You have to call the API once ES is up and then add your index templates. We have a job that does that.
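A job like that could be sketched roughly as follows. This is not LogDNA's job, just a minimal illustration that assumes the legacy `_template` endpoint (Elasticsearch 6.x/7.x; newer versions also offer `_index_template` with a different body layout), a hypothetical service URL, and placeholder settings values.

```python
import json
import urllib.request


def build_template_request(base_url, name, template):
    """Build (but do not send) a PUT request that registers an index template."""
    url = f"{base_url.rstrip('/')}/_template/{name}"
    body = json.dumps(template).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Illustrative values only; tune shard counts and refresh interval per workload.
EXAMPLE_TEMPLATE = {
    "index_patterns": ["logs-*"],
    "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 1,
        "refresh_interval": "30s",
    },
}

if __name__ == "__main__":
    # "http://elasticsearch:9200" is a hypothetical in-cluster service URL.
    req = build_template_request("http://elasticsearch:9200", "logs", EXAMPLE_TEMPLATE)
    print(req.get_method(), req.full_url)
    # A real job would wait until the cluster responds, then send the request:
    # urllib.request.urlopen(req, timeout=10)
```

Keeping request construction separate from sending makes it easy to retry until the cluster is reachable, which is the part that forces this into a post-startup job in the first place.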

3. Managing Elasticsearch

A) Cerebro (Manage using a GUI)

Cerebro connects to your ES service endpoint(s). It contains an ES node/pod list and their health stats. You can easily view indices and shards across the available data nodes. You can modify index settings, templates, and data. Most importantly you can move shards around.

Not everything is available via Cerebro.

B) Managing ES through API calls

We use Insomnia (a REST API GUI that lets us share API calls), though curl works too.
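As one example of such a call, moving a shard by hand goes through the cluster reroute API. The sketch below builds that request in Python rather than Insomnia or curl. The index name, node names, and service URL are hypothetical, and the request is only constructed, not sent.

```python
import json
import urllib.request


def move_shard_request(base_url, index, shard, from_node, to_node):
    """Build (but do not send) a POST to _cluster/reroute that moves one shard."""
    command = {
        "move": {
            "index": index,
            "shard": shard,
            "from_node": from_node,
            "to_node": to_node,
        }
    }
    body = json.dumps({"commands": [command]}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/_cluster/reroute",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical index and node names.
    req = move_shard_request(
        "http://elasticsearch:9200", "logs-2019.06.01", 0, "es-hot-0", "es-hot-1"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To execute it against a live cluster:
    # urllib.request.urlopen(req, timeout=10)
```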

Wrap Up

I know we’ve walked through a lot of what seem like obscure settings in Elasticsearch. When you’re running Elasticsearch in a Docker container, you have to realize that it was not designed for Docker containers. It requires some coaxing to run properly inside a container.

Download Slides

Feel free to reach out and share your experience in scaling ES with k8s. Instead of worrying about scaling your own log management solution, give LogDNA a try and sign up for a 14-day free trial.

About Thu Nguyen

Thu Nguyen is a technical writer who cares deeply about human relationships.

