Python Logging – A Tutorial for Beginners

As one of the world’s most popular programming languages, Python is used extensively in all kinds of applications. From data processing to web server hosting, Python’s simplicity and flexibility make it useful for almost any task. As with most languages, Python provides a logging system for extracting operational data from applications. This log data can help developers and operations teams troubleshoot problems and optimize for performance. The challenge is knowing what data to log, and how to log it in a way that’s useful for engineers.

This post explores the tools and best practices for logging Python applications. We’ll explain how to log Python applications, how to append data to logs, and how to use logs to debug and troubleshoot applications.

Why Log in Python?

No matter how much time, effort, or skill is devoted to developing and testing applications, there’s always room for improvement. Performance optimizations will need to be made, unexpected behaviors will crop up, and bugs will need fixing. But in order for developers to address these issues, they first need to be aware of them. Rather than wait for users to experience them firsthand, developers can take the initiative by collecting operational data directly from the application.

Logs can provide this data. By logging applications, developers can:

  • Trace the execution of an application from start to finish
  • Identify performance problems
  • Monitor for and analyze errors, failures, or crashes
  • Visualize the state of an application over time

A common practice is to use print() statements to record operational data from running applications. Besides being inconvenient (who’s going to watch the console for new messages?), this approach loses the data as soon as the window is closed.

Logs, on the other hand, are much easier to collect, store, and analyze by both developers and automated systems. Logs can be forwarded to multiple destinations including files and remote servers, provide a great deal of contextual data, can be easily converted between different formats, and more.

The Basics of Logging in Python

Python includes a framework for generating and managing log messages known as the standard logging module. Like most logging frameworks, this module does more than just create events. It lets you control the contents, formatting, and ultimate destination of log events using a single API.

Logging Framework Components

First, let’s look at the four key components in the standard logging module:

  • Loggers
  • Handlers
  • Filters
  • Formatters

Loggers

Loggers are objects used in application code to generate log events. They convert strings, error messages, and other objects into messages which can then be handled by the rest of the framework. In addition, they add contextual information to each event such as the date and time of the event, its severity level, and its location in the application’s source code.

The severity level indicates the event’s importance to the health of the application. For example, INFO logs record routine application behavior and have little to no impact on application health. CRITICAL logs, however, might record fatal errors and crashes.

Let’s instantiate a new logger and create an INFO-level event:

import logging

logging.basicConfig(format="%(levelname)s: %(message)s", level=logging.INFO)
logger = logging.getLogger(__name__)
logger.info("This is a standard message.")

This results in the following message:

INFO: This is a standard message.

Handlers

Handlers forward events from Loggers to outputs such as a console, file, or syslog server. Loggers can have multiple (or zero) handlers assigned to them, letting you log a single event to multiple destinations simultaneously. Unlike Loggers, handlers are rarely instantiated directly and are instead declared using a configuration file. We’ll explain configuration files in more detail in the next section.
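To make this concrete, here’s a small sketch that attaches handlers in code rather than through a configuration file (the logger name and file name are our own, not from the standard library):

```python
import logging

# One logger, two destinations: a console handler that accepts
# everything, and a file handler that only records WARNING and above.
logger = logging.getLogger("multi_dest_example")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep events out of the root logger

console_handler = logging.StreamHandler()
console_handler.setLevel(logging.DEBUG)

file_handler = logging.FileHandler("app.log", mode="w")
file_handler.setLevel(logging.WARNING)

logger.addHandler(console_handler)
logger.addHandler(file_handler)

logger.debug("Goes to the console only.")    # below WARNING, file skips it
logger.warning("Goes to both destinations.")
```

Each handler applies its own level threshold, which is why the DEBUG message never reaches app.log.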

Filters

Filters control which events are passed through to a logger or handler, and which ones are dropped. For example, you could use a filter to log INFO events to one file and CRITICAL events to another file. Filters can be used on any event field including severity, source, and message, giving you fine-grained control over how events are handled.
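As a sketch, a filter is just a class with a filter() method that returns True to keep an event or False to drop it (the "heartbeat" noise source below is hypothetical):

```python
import logging

class DropHeartbeats(logging.Filter):
    """Discard any event whose message mentions "heartbeat"."""
    def filter(self, record):
        # Returning False drops the record; True lets it through.
        return "heartbeat" not in record.getMessage()

logger = logging.getLogger("filter_example")
logger.setLevel(logging.INFO)
logger.propagate = False

handler = logging.StreamHandler()
handler.addFilter(DropHeartbeats())
logger.addHandler(handler)

logger.info("heartbeat ping")        # dropped by the filter
logger.info("user login succeeded")  # passes through to the console
```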

Formatters

Formatters control the contents, structure, layout, and formatting of log events. This lets you convert raw log data into human or machine-readable formats such as plain text, XML, JSON, or RFC 5424 for syslog.
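As a quick sketch, a Formatter takes a format string of record attributes and renders each event (the format string here is our own choice, similar to the one used later in this post):

```python
import logging

# Render events as "timestamp | level | message".
formatter = logging.Formatter(
    fmt="%(asctime)s | %(levelname)s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)

# Build a record by hand to show the formatter in isolation.
record = logging.LogRecord(
    name="demo", level=logging.INFO, pathname="demo.py", lineno=1,
    msg="Formatted hello", args=None, exc_info=None,
)
print(formatter.format(record))
```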

Implementing Logging

Now, let’s take an in-depth look at using the standard logging module in a Python application.

The standard logging module can be configured using a configuration file. Configuration files offer a number of benefits over configuring the framework in code, including separation from application logic and consolidation of logging details in a single location.

The standard logging module uses an INI-style syntax for its configuration files. For example, the following configuration prints all DEBUG-level and higher events to console (STDOUT) using a custom format string.

# logging.conf
[loggers]
keys=root

[handlers]
keys=consoleHandler

[formatters]
keys=customFormatter

[logger_root]
level=DEBUG
handlers=consoleHandler

[handler_consoleHandler]
level=DEBUG
class=StreamHandler
formatter=customFormatter
args=(sys.stdout,)

[formatter_customFormatter]
format=%(asctime)s | %(name)s | %(levelname)s | %(message)s

Note that we declared a single logger known as the root logger. If we have no other named loggers, any messages generated are handled automatically by the root logger. We also created a formatter called customFormatter to change the output of events before sending them to STDOUT. In this case, the custom formatter displays each event’s timestamp, module name, severity level, and message.

Next, we’ll create a simple Python script that loads the configuration and generates a debug log:

# main.py

import logging.config

logging.config.fileConfig('logging.conf')

logger = logging.getLogger(__name__)
logger.debug('This is a debug log.')

We can see the results on the console:

2018-10-03 16:33:36,874 | __main__ | DEBUG | This is a debug log.

Since we ran this as a top-level script, the module name is recorded as __main__. If we called the same method from a different module, this would automatically reflect the name of the module that the new event originated from.
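Because getLogger(__name__) names loggers after their module’s dotted path, loggers also form a hierarchy: a logger named "myapp.db" is a child of "myapp". A quick sketch (the names are hypothetical):

```python
import logging

root_logger = logging.getLogger()           # the root logger
app_logger = logging.getLogger("myapp")     # hypothetical package logger
db_logger = logging.getLogger("myapp.db")   # child logger for a submodule

# Events sent to a child propagate up to its ancestors' handlers.
print(db_logger.parent is app_logger)    # True
print(app_logger.parent is root_logger)  # True
```

This is why configuring only the root logger, as in logging.conf above, is enough to handle events from every module.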

Python Logging - Best Practices

Python’s logging module offers a great deal of flexibility and freedom when it comes to logging. Consider the following tips when logging your applications.

1) Log All Errors and Exceptions

Logging errors provides important contextual information for troubleshooting failures and crashes. This can include the exact error message reported by the application, the location where the error occurred, and even the user(s) affected by it.

For example, let’s look at a Python application that takes a user’s first name and last name as input, and uses them to generate a new user profile. Note the logger.exception() statement in the except block, which is configured to automatically include exception data.

import logging.config

logging.config.fileConfig('logging.conf')
logger = logging.getLogger(__name__)

class User:
    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname
        self.username = str.lower(firstname[0] + lastname)

try:
    firstname = input("Enter your first name: ")
    lastname = input("Enter your last name: ")
    user = User(firstname, lastname)
    print(user.firstname + " " + user.lastname + ": " + user.username)

except Exception as e:
    logger.exception("An exception has occurred.")

Entering both a first name and last name gives us the expected results:

Enter your first name: Joe
Enter your last name: Smith
Joe Smith: jsmith

However, entering a blank first name causes the application to crash. Fortunately, the logs indicate the exact source and reason behind the problem:

Enter your first name: 
Enter your last name: Smith
2018-10-15 14:28:22,590 | __main__ | ERROR | An exception has occurred.

Traceback (most recent call last):

  File "/home/logdna/PythonLogging/main.py", line 18, in <module>
    user = User(firstname, lastname)

  File "/home/logdna/PythonLogging/main.py", line 12, in __init__
    self.username = str.lower(firstname[0] + lastname)

IndexError: string index out of range

2) Avoid Logging Sensitive Information

Having detailed log messages is important, but certain types of data should never be logged. This includes sensitive business and user data such as:

  • Blocks of source code
  • Credentials, access tokens, or encryption keys
  • Unique session identifiers
  • Personally identifiable information (PII) such as names, contact information, or health records
  • Financial data
  • Data that doesn’t comply with GDPR, PCI DSS, or other regulations

However, this doesn’t mean all unique data is off-limits. You can safely log the following data:

  • Non-sensitive user data such as usernames or user IDs. This helps with troubleshooting errors reported by users since it links events to users and sessions without exposing personal information.
  • Class names, method names, and line numbers. While logging blocks of source code is unsafe, logging data that points developers to specific areas of source code is much less risky.
  • Exception messages and stack traces. Exceptions contain valuable information about the causes of application errors, while stack traces help developers pinpoint the source of errors in source code.
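For example, the logging module’s extra parameter can attach a non-sensitive user ID to an event so the formatter can include it; the field name and ID below are made up for illustration:

```python
import logging

# Hypothetical user_id field, supplied per-event via `extra` and
# referenced from the format string as %(user_id)s.
formatter = logging.Formatter("%(levelname)s | user=%(user_id)s | %(message)s")
handler = logging.StreamHandler()
handler.setFormatter(formatter)

logger = logging.getLogger("audit_example")
logger.propagate = False
logger.addHandler(handler)

logger.warning("Password reset requested", extra={"user_id": "u-1042"})
# WARNING | user=u-1042 | Password reset requested
```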

3) Use a Structured Format

Unstructured logs are easy for humans to read but difficult for machines to parse. Storing logs in an unstructured format makes it difficult for logging services like LogDNA to parse, analyze, and index events. The usual workaround is to create complex parsing rules, which are both time-consuming to write and prone to errors.

On the other hand, structured formats like JSON are much easier for machines to parse, much more flexible than unstructured formats, and can store more fields with less effort. Although Python’s standard logging module doesn’t natively support JSON, you can add support using a library such as python-json-logger. Enabling JSON support is as simple as adding a new formatter:

[formatters]
keys=customFormatter

[formatter_customFormatter]
format=%(message)s
class=pythonjsonlogger.jsonlogger.JsonFormatter

The resulting JSON can then be sent to any handler, including a file or syslog handler:

{"message": "An exception has occurred.",
 "exc_info": "Traceback (most recent call last):\n  File \"/home/debian/PycharmProjects/PythonLogging/main.py\", line 18, in <module>\n    user = User(firstname, lastname)\n  File \"/home/debian/PycharmProjects/PythonLogging/main.py\", line 12, in __init__\n    self.username = str.lower(firstname[0] + lastname)\nIndexError: string index out of range"}
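If adding a dependency isn’t an option, a similar result can be sketched with only the standard library by subclassing logging.Formatter (the field names below are our own choice, not python-json-logger’s exact output):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each event as a single JSON object."""
    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "name": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Append the formatted traceback when an exception is attached.
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("json_example")
logger.propagate = False
logger.addHandler(handler)

logger.error("An exception has occurred.")
```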

4) Log to a Central Location

Logging to the console or to a file might be fine for local debugging, but it does not scale to testing or production environments. These require more scalable logging strategies that factor in reliability, availability, and throughput. This is even more important for platforms like Docker and Kubernetes, where applications run on multiple machines and containers are destroyed once they finish.

To prevent data loss, send your application logs to a centralized location. Services like LogDNA provide a catch-all for applications and system logs, letting you aggregate log data from across your infrastructure. This lets you access, monitor, and search through log data much more easily than having to traverse log files or console windows.

In addition, platforms like Docker and Kubernetes provide their own form of centralized logging with the use of logging drivers. Applications print messages to STDOUT or STDERR, and these platforms automatically send each message to its logging driver. The driver can then forward these events to a file on the host, to a remote syslog server, or to a centralization service. These platforms also append their own metadata such as the container name, hostname, and/or Pod name, making it easier for engineers to trace events back to their source.
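In a containerized app, that usually means pointing a StreamHandler at sys.stdout and leaving collection to the platform’s logging driver; a minimal sketch (the format string is our own):

```python
import logging
import sys

# Log to STDOUT so Docker/Kubernetes logging drivers can collect it.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
)

logger = logging.getLogger("container_example")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.info("Ready to serve requests")
```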

What's Next?

This post introduced the basics of logging in Python. The standard logging module provides far more functionality than that covered in this post. There are also various logging strategies and best practices, including secure logging and logging microservices. This information will help you get started, but there’s more potential that has yet to be unlocked from your log data. The official Python documentation contains more information on how to configure the logging module.

Once you’ve configured logging for your Python application, consider sending your logs to LogDNA. LogDNA lets you aggregate, centralize, and search through your logs much more easily than with a terminal or text editor. You can even backup and archive your logs to Amazon S3 or another destination. To learn more or sign up for a free account, visit https://logdna.com.

Ready to get started?

Get connected with one of our technical solutions experts. We can create a custom solution to solve your logging needs.

Get Started