When sending application logs to a centralized location like LogDNA, developers have two options: using a logging agent or using a logging library. Both approaches will get your logs to their destination, but choosing one over the other can have a significant impact on the design of your applications and infrastructure.
In this article, we’ll explain the difference between logging via agents and logging via libraries, and which approach works best in modern architectures.
What is a Logging Agent?
A logging agent (also called a log shipper) is a program that reads logs from one location and sends them to another location. They’re commonly used to read log files stored on a computer and upload individual events to a server for centralization. They essentially act as log funnels for software running on the host, including applications, services, and operating system components.
The Benefits of Logging Agents
The main benefit of agents is scalability. Since agents operate independently of the applications being logged, you can log almost any application and any number of applications using a single agent. This also lets you modify, update, or replace agents without having to take down your application or hosts. Agents use minimal resources, have few moving parts, and are easy to deploy using configuration management tools.
Agents are optimized for high throughput and high reliability. They can use techniques such as data compression, persistent connections to remote servers, and multithreading to quickly ship logs to remote services. For example, the LogDNA agent uses HTTPS to connect to LogDNA’s ingestion servers, and gzip to compress logs and reduce the total bandwidth used. Agents are also capable of detecting transmission errors and resending failed messages without blocking new logs from being sent.
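As a rough illustration of the compression and resend behavior described above, here is a minimal Python sketch. It is not the LogDNA agent's implementation: the function names `compress_batch` and `ship_with_retry`, the JSON payload shape, and the `send` callback are all assumptions made for the example.

```python
import gzip
import json
import time
from typing import Callable, List


def compress_batch(lines: List[str]) -> bytes:
    """Serialize a batch of log lines to JSON and gzip it (hypothetical payload shape)."""
    payload = json.dumps({"lines": lines}).encode("utf-8")
    return gzip.compress(payload)


def ship_with_retry(
    body: bytes,
    send: Callable[[bytes], None],
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> bool:
    """Call send(body), retrying on network-style errors with exponential backoff.

    `send` is a stand-in for whatever transport is used (for example, an
    HTTPS POST over a persistent connection). Returns True on success.
    """
    for attempt in range(max_attempts):
        try:
            send(body)
            return True
        except OSError:  # ConnectionError and timeouts are subclasses of OSError
            time.sleep(base_delay * (2 ** attempt))
    return False
```

A real agent would typically run the retry loop on its own thread or queue so that a failed batch does not hold up newer logs.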
Lastly, agents are format-agnostic. Regardless of how or where your applications store their logs, an agent with permission to read the log file can extract events from it. This saves developers from having to build applications with specific logging requirements, while supporting applications where the logging behavior can’t be changed (such as proprietary applications). Agents can also enrich logs by adding data that applications don’t have access to, such as environmental information, operating system data, and process data. In distributed applications especially, this data helps significantly with troubleshooting and root cause analysis.
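To make the enrichment idea concrete, the sketch below wraps a raw log line with host details that the application itself may not know. The `enrich` function and its field names are hypothetical and chosen only for illustration.

```python
import platform
import socket
from typing import Dict


def enrich(line: str, source_file: str) -> Dict[str, str]:
    """Attach host-level context to a raw log line (field names are illustrative)."""
    return {
        "line": line,
        "file": source_file,
        "hostname": socket.gethostname(),
        "os": platform.system(),
        "os_release": platform.release(),
    }
```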
The Drawbacks of Logging Agents
The main challenge with logging agents is that there still needs to be a way for applications to get their logs to the agent. The most common method is log files, but this adds operational overhead, since the agent must constantly access, scan, and track new log files and log lines. Not only does this consume CPU time, but there is also a delay between the time a log is generated and the time the agent reads it, which adds latency. One alternative is to use a syslog service, but this adds problems of its own, such as another service to configure and maintain.
The increasing use of serverless and FaaS platforms poses another challenge for agents. Agents often require some amount of interaction with the host. This creates problems with platforms where the host is abstracted away, such as AWS Lambda or Heroku. In these situations, the platform usually offers its own built-in logging service or API. On AWS, you can use a Lambda function to forward logs from CloudWatch to LogDNA. Heroku streams logs written to standard output to a service called Logplex, and you then need to use an add-on to forward those logs.
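The file-tracking overhead mentioned above comes from bookkeeping like the following. This is a simplified Python sketch, assuming a hypothetical `read_new_lines` helper that an agent would call on a polling loop; real agents usually rely on file-system notifications and handle rotation more carefully.

```python
import os
from typing import List, Tuple


def read_new_lines(path: str, offset: int) -> Tuple[List[str], int]:
    """Return complete lines appended since `offset`, plus the new offset.

    If the file is now shorter than the saved offset, it is assumed to have
    been truncated or rotated, and reading restarts from the beginning.
    """
    if os.path.getsize(path) < offset:
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline so a half-written line is left
    # in place and picked up on the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

The gap between polls is where the latency comes from: a line written just after one call is not seen until the next.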
What is a Logging Library?
A logging library (or logging framework) is code that you embed into your application to create and manage log events. Logging libraries provide APIs for creating, structuring, formatting, and transmitting log events in a consistent way. Like agents, they’re used to send events from your application to a destination. The difference is that unlike agents, libraries run with your application and not separately from it.
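For instance, Python's standard `logging` module is one such library. The sketch below formats each event as a JSON object; the `JsonFormatter` class and `build_logger` helper are names invented for this example, and the choice of fields is an assumption.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (illustrative fields only)."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON events to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep events out of the root logger's handlers
    return logger
```

Calling `build_logger("checkout", sys.stdout).info("order %s placed", order_id)` would then emit one JSON line per event. Swapping the stream handler for a network handler changes the transport without touching the logging statements.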
The Benefits of Logging Libraries
Logging libraries are popular because of their portability, ubiquity, and ease of use. Libraries exist for all major programming languages and application frameworks, and most of them support the common log formats and transports. Installing a library is relatively quick, makes no permanent changes to your infrastructure, and saves you from having to manage another service.
Logging libraries are seeing a surge with the growing popularity of serverless applications. Developers use libraries to ensure their application logs to the same destination no matter where or when their application runs. For platforms where host access is impossible, libraries offer a practical solution.
The Drawbacks of Logging Libraries
Adding a logging library to your application requires you to change your source code. You will need to add dependencies, logging statements, and configuration files to your application, increasing its size and complexity. This close integration makes it difficult to remove, swap out, or in some cases, update the library between major versions. Abstraction layers such as SLF4J try to mitigate this, but they add yet another API layer on top of the library’s API in exchange for a modest gain in flexibility.
In addition, logging libraries often underperform compared to agents. Libraries are commonly single-threaded and synchronous by default, meaning your application must wait for the library to finish writing a log event before continuing. This can lead to noticeable delays, especially when logging over a network connection. Some libraries offer multi-threading and asynchronous logging, but this isn’t guaranteed for all programming languages or transport methods.
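Where asynchronous logging is available, it usually works by handing events to a queue that a background thread drains. As one example, Python's standard library provides `logging.handlers.QueueHandler` and `QueueListener` for this. The `build_async_logger` helper below is a hypothetical wrapper written for this sketch.

```python
import logging
import logging.handlers
import queue
from typing import Tuple


def build_async_logger(
    name: str, target_handler: logging.Handler
) -> Tuple[logging.Logger, logging.handlers.QueueListener]:
    """Return a logger whose events are handled on a background thread.

    The application thread only pays the cost of putting a record on an
    in-memory queue; `target_handler` (for example, a slow network handler)
    runs on the listener's thread.
    """
    log_queue: queue.Queue = queue.Queue(-1)  # unbounded queue
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    logger.propagate = False
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener
```

Call `listener.stop()` before the application exits so that records still in the queue are handled. Anything left in the queue when the process dies is lost.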
Lastly, libraries only run as long as your application is running. If your application shuts down or crashes, your logging solution goes with it. This makes it extremely difficult to log fatal errors or stack traces unless specifically supported by the library, and even then it’s not guaranteed.
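As a partial mitigation in Python, an application can install a last-chance hook that logs uncaught exceptions before the interpreter exits. This is a sketch only: `install_crash_logger` is a name made up for the example, and the hook does not run if the process is killed outright or crashes in native code, which is why delivery still isn't guaranteed.

```python
import logging
import sys


def install_crash_logger(logger: logging.Logger):
    """Route uncaught exceptions on the main thread to `logger` with a stack trace."""

    def hook(exc_type, exc_value, exc_tb):
        logger.critical(
            "uncaught exception",
            exc_info=(exc_type, exc_value, exc_tb),
        )

    sys.excepthook = hook  # called by the interpreter for uncaught exceptions
    return hook
```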
Which Method Should I Choose?
We recommend an agent-based approach for several reasons:
Deploying an agent is much easier than deploying a library for each application. A single agent can route logs for an entire host with minimal setup and configuration. Agents also provide advanced features that libraries might not offer such as multi-threading and failure safety. Developers don’t need to add code or evaluate different solutions: they simply write their logs to a destination that the agent can read, and run the application.
Agents will almost always outperform libraries, especially as the number of applications increases. With libraries, each application runs its own instance of the library, resulting in duplicated effort. Meanwhile, a single agent instance can log for almost any number of applications while offering better performance and better log management.
LogDNA Supports Your Choice and Use Cases
Our agent is open source, so you can see firsthand how it sends logs from your servers to LogDNA using encrypted persistent connections, gzip compression, automatic reconnection with no data loss, and transparent support for rotating log files. Our agent is self-updating, supports all major operating systems, and can even be deployed over an entire Kubernetes cluster with just two commands.
We also support many code libraries, and you can take advantage of our REST API to integrate LogDNA with your environment. Chat with us or contact us and we can support you as you decide how you’d like to send your logs.