Kubernetes has fundamentally changed the way we manage our production environments. The ability to quickly bring up infrastructure on demand is a beautiful thing, but it also introduces complexity, especially when it comes to logging. Logging is always an important part of maintaining solid infrastructure, but even more so with Kubernetes: because clusters are constantly being spun up, torn down, and otherwise in flux, making sure logging functions correctly is critical. LogDNA makes it extremely easy to get the job done when it comes to logging, but problems do arise, and getting log data out of these complex environments can be a pain point.
From a support perspective, we see a few common problems with Kubernetes deployments where logs arrive late or not at all. But first, a little background on how LogDNA actually works within a Kubernetes environment.
When you deploy LogDNA into your Kubernetes infrastructure, the agent runs as a pod on each of your nodes and ships logs directly from the STDOUT/STDERR streams of your containers, along with application log files (you can learn more about it here).
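Conceptually, a node-level agent works by tailing the per-container log files the runtime writes on each node. Here is a toy sketch of that first step in Python; the paths follow the usual Kubernetes layout, and this is an illustration, not LogDNA's actual implementation:

```python
import glob
import os

# Container runtimes write each container's STDOUT/STDERR to files under
# /var/log/containers on the node; a node-level agent discovers and tails these.
LOG_GLOB = "/var/log/containers/*.log"  # standard Kubernetes layout

def list_container_logs(pattern: str = LOG_GLOB) -> list:
    """Return the container log files currently present on this node."""
    return sorted(glob.glob(pattern))

if __name__ == "__main__":
    for path in list_container_logs():
        print(os.path.basename(path))
```

Run on a Kubernetes node, this lists one file per running container; the agent then follows each file as it grows.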
By default, we collect these logs from all of your namespaces. Now that you have a basic idea of how LogDNA handles your Kubernetes logs, we can dig into a few common problems that arise within these deployments and the best practices that help reduce the risk of them occurring.
1. Networking Issues
Because Kubernetes is a distributed system, proper networking and communication are critical. When communication at the network level is disrupted, not only do the applications suffer, but logging for those applications can be interrupted as well, reducing insight into your infrastructure.
The symptoms of networking issues can look like:
- The master node being unable to connect to other nodes.
- Nodes being unable to communicate with each other.
- Frequent timeouts within the nodes themselves, interrupting communication between a node and the pods running on it.
Surprisingly, this tends to be a common issue, and though it might cause some anxiety, fear not: our LogDNA agent is robust when it comes to handling scenarios like this, so your logs are not being dropped. This guide is a good start for digging deeper into Kubernetes networking: how to find your cluster IPs, service IPs, pod network/namespace, cluster DNS, and more.
2. Deployment Configuration
Another common issue we see in support is container or application logs not being picked up correctly, or not picked up at all. There can be multiple reasons why one might not see their logs, but it most likely comes down to the configuration of your deployment.
First, make sure your logs are being written to the proper directory. By default, LogDNA’s agent reads from /var/log. If your applications write logs to a different directory, you may be left wondering why those logs never show up. This can be resolved in two ways. One way is to simply have your application write logs directly to /var/log.
Another way, which is more flexible, is to modify your agent’s YAML configuration to pick up the particular directory your application is writing to, while also making sure the container writes to STDOUT where it can. Remember that if your container isn’t writing to STDOUT and the agent isn’t configured to read from the new directory, your application logs won’t get picked up.
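If you control the application, the simplest version of this is to log to STDOUT so the container runtime (and therefore the agent) captures everything without extra configuration. A minimal Python sketch of stdout-first logging; the logger name and message are illustrative:

```python
import logging
import sys

# Send application logs to STDOUT so the container runtime captures them
# under /var/log/containers, where a node-level agent can tail them.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s %(name)s %(message)s"))

logger = logging.getLogger("myapp")  # name is illustrative
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("payment processed")  # goes to STDOUT -> container log stream
```

With this in place, no file paths need to be coordinated between the application and the agent, which is why stdout logging is the usual recommendation for containers.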
For more best practices on how to set up Kubernetes for logging, check out our webinar recap. Issues do arise within a Kubernetes deployment and, if not taken care of, can become bigger bottlenecks; losing the logs that give you insight into those issues should not be one of them. This write-up was created by the LogDNA support team in an effort to help users identify, investigate, and resolve some of the more common problems we see in customer deployments.