Understanding your application through logs and log analysis: Theory – Part 1

Adrian Simionescu | 29.04.2020
Reading time 7 min

Introduction

Welcome to an introductory post on understanding your application and system through logging. We will look at some important logging details and measures.

Let’s start with what logging is

Logging usually means the records that a piece of software creates: your operating system, a web application, a mobile app, a load balancer, a database, a mail server and so on. Logs come from many different types of sources.

Their importance comes from their ability to allow you to understand what is happening in your system or application. They should show you that everything is alright and if things are not, you should be able to determine where the problem is and how to fix it.

From the start of the application to the end of it, you should be able to see the entire history of the application and be able to analyze what has happened.

Purpose of logging

The general categories of logging can be summed up in the following points:

  • Security events
  • Business processes
  • Audit trails
  • Performance monitoring
  • Providing information about problems and unusual conditions

There are more categories to consider, but these are the main ones. Depending on your organization or company, your logging requirements might be loose or strict. A government agency will require more specific and stricter logging than a medium-sized private company would.

Logging requirements can also be stricter depending on how critical your application is and what it does. For example, handling payments and user data usually requires you to be very aware of what is happening in your application, and of what to log and what not to.

What to log and how much to log

Let’s start with what to log. I would consider the following, but analyze your application and determine whether all of them are relevant to you or whether something is missing:

  • Input validation failures
  • Output validation failures
  • Authentication successes and failures
  • Authorization (access control) failures
  • Session management failures
  • Application errors and system events
  • Application and related systems start-ups and shut-downs, and logging initialization
  • Use of higher-risk functionality (security and user management, critical system changes, data changes, etc.)

This is not a complete list of what you should log, but it is a good start. These suggestions are the minimum you should consider.

As important as it is to log and have a good view of what is happening in your system or application, it is also a fine art to understand when not to log things.

Too much logging makes it hard to find the relevant, critical information you need. With too little logging, you risk not being able to understand your problem properly.

So, there is a fine balance between logging too much or too little.

A possible solution is to use more verbose logging during development and, in production, to log only what the developers have determined to be important, so that someone can troubleshoot a production problem without wading through too much logging or missing details because of too little. This is also something that needs revisiting and refactoring during the lifetime of the application.
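As a minimal sketch of that idea in Python, the log level could be chosen from the environment the application runs in. The variable names APP_ENV and LOG_LEVEL, the logger name "shop" and the chosen defaults are assumptions made for this illustration, not an established convention.

```python
import logging
import os
from typing import Mapping

# Hypothetical defaults: verbose in development, quieter in production.
DEFAULT_LEVELS = {"development": logging.DEBUG, "production": logging.INFO}
VALID_NAMES = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def resolve_log_level(env: Mapping) -> int:
    """Return the level to use: an explicit LOG_LEVEL wins, else the per-environment default."""
    override = env.get("LOG_LEVEL", "").upper()
    if override in VALID_NAMES:
        return getattr(logging, override)
    # Unknown or missing APP_ENV falls back to the production default.
    return DEFAULT_LEVELS.get(env.get("APP_ENV", "production"), logging.INFO)


def configure_logging(env: Mapping = os.environ) -> logging.Logger:
    """Apply the resolved level to the (hypothetical) application logger."""
    logger = logging.getLogger("shop")
    logger.setLevel(resolve_log_level(env))
    return logger
```

The explicit override is what makes it possible to turn on more verbose logging in production temporarily, without changing code.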

This leads us to a requirement for logs: they should be structured and easy to index, filter and search.
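One common way to get structured logs is to write each entry as a JSON object. The sketch below does this with a custom formatter for Python's standard logging module. The field names (time, level, logger, message) and the extra keys order_id and user_id are assumptions chosen for the example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    # Hypothetical context fields that callers may pass via `extra=`.
    CONTEXT_KEYS = ("order_id", "user_id")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for key in self.CONTEXT_KEYS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)
```

With this formatter attached to a handler, a call such as logger.info("Order placed", extra={"order_id": "234123-A175"}) produces an entry that a log tool can filter by order_id instead of searching free text.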

What not to log

These are the things you should AVOID logging. Having any of them in the logs of a production release is a security risk at the very least:

  • Application source code
  • Access tokens
  • Sensitive personal data and some forms of personally identifiable information
  • Authentication passwords
  • Database connection strings
  • Encryption keys and other master secrets
  • Bank account or payment cardholder data
  • Data of a higher security classification than the logging system can store
  • Information a user has opted out of having collected, or has not consented to

Now, there are situations where you might need to log sensitive information. In that case, I would recommend “masking” or removing parts of the sensitive data, so that the log entry still gives you an idea of what it relates to without giving away the sensitive information wholly. For example:

  • File paths
  • Database connection strings
  • Internal network names and addresses
  • Non-sensitive personal data
  • Social Security Numbers

Still, be very careful with this information, especially with user-related data.
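A minimal sketch of masking in Python could look like the following. The helper names, the number of visible characters and the list of secret-bearing keys are assumptions for illustration. Whether any part of a value may remain visible at all depends on your own security requirements.

```python
import re


def mask_value(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with '*'."""
    if len(value) <= visible:
        # Too short to reveal anything safely: mask everything.
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Matches key=value pairs for a few hypothetical secret-bearing keys.
_SECRET_PAIR = re.compile(r"(?i)\b(password|pwd|token)=([^;\s]+)")


def scrub_connection_string(text: str) -> str:
    """Keep the key name but drop the secret value entirely."""
    return _SECRET_PAIR.sub(lambda m: f"{m.group(1)}=***", text)
```

For example, mask_value("123-45-6789") returns "*******6789", and scrub_connection_string("Server=db01;User=app;Password=s3cret;") returns "Server=db01;User=app;Password=***;".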

Logging audience

When you are logging, I recommend considering who you are logging for.

You need to ask yourself: Why add logging to an application?

One day, someone will read that log entry, and it should make sense to them and be helpful. So, when you log things, think of your audience and consider the following things:

  • The content of the message
  • The context of the message
  • The category
  • The log level

All of these can be quite different depending on who is looking at your logs. As a developer, you can easily understand quite complex logs, but a non-developer most likely would not be able to make much sense of complex log entries. So adapt your language to the intended audience. You can even dedicate separate categories to each audience.

Also, think about whether the log entries can be visualized. For example, metrics logs should have categories, dates and numbers that can be translated into charts showing how long things take or how often they succeed.

Write meaningful log messages

When writing log entries, avoid writing them in a way that requires in-depth knowledge of the application internals or code logic, even if the log reader is expected to be a developer.

There are a few reasons to write log messages that do not depend on knowing the application code or the technicalities behind your application:

  • The log messages will most likely be read by someone who is not a technical person, and even if they are technical, you may need to prove something about your application to a non-technical person.
  • Even if you are the only developer working on your application, will you remember all your logic and the meaning of your log entries a year or two from now? If you must go to your code and check what the heck a log entry means, then your log entry was not meaningful enough. Yes, you do have to go back to the code anyway if you have problems, but if you have to do this frequently then you definitely need to refactor your logging logic and the log content in your application.
  • If multiple developers analyze a problem, they may not understand what is going on. They might have no connection to or understanding of a log entry because they were not a part of the initial solution, so they must find out what is going on from the code.

Logging is about the four Ws:

  • When
  • Where
  • Who
  • What
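As a rough sketch, the four Ws can be built into the log format itself, so that every entry answers them. The example below uses Python's standard logging module. The logger name "shop.checkout", the `user` field and the "-" placeholder for entries without a user are assumptions for this illustration.

```python
import io
import logging


class DefaultUserFilter(logging.Filter):
    """Give every record a `user` attribute so the format string never fails."""

    def filter(self, record: logging.LogRecord) -> bool:
        if not hasattr(record, "user"):
            record.user = "-"  # system-driven entry, no user involved
        return True


# When: asctime, Where: logger name, Who: user, What: message
FOUR_W_FORMAT = "%(asctime)s | %(name)s | user=%(user)s | %(message)s"


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes four-W entries to `stream`."""
    logger = logging.getLogger("shop.checkout")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(FOUR_W_FORMAT))
    handler.addFilter(DefaultUserFilter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger(io.StringIO())
    log.info("Payment accepted", extra={"user": "9849"})
    log.info("Nightly cleanup finished")
```

Putting the "who" and "where" into the format rather than into each message keeps individual log calls short and makes the entries uniform.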

Add context to your log messages

By context, I mean that your log message should tell what is going on by including all the details needed to understand what is happening.

So, this is not OK:

“An order was placed”

If you were to read that one, you would ask: “What order? Who placed the order? When did this happen?”

A much more detailed and helpful log message would be:

“Order 234123-A175 was placed by user 9849 at 29.3.2019 13:39”

This message will allow someone to get that order from a system, look at what was ordered and by whom and at what time.
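In code, this means building the message from the values at hand instead of writing a fixed string. A small Python sketch follows. The function name and parameters are hypothetical, and the timestamp is written in ISO 8601 form here, which is an assumption of this example rather than the format shown above. It avoids ambiguity between day and month.

```python
from datetime import datetime


def order_placed_message(order_id: str, user_id: int, placed_at: datetime) -> str:
    """Build a log message that says which order, by whom, and when."""
    when = placed_at.isoformat(timespec="minutes")
    return f"Order {order_id} was placed by user {user_id} at {when}"
```

Calling order_placed_message("234123-A175", 9849, datetime(2019, 3, 29, 13, 39)) returns "Order 234123-A175 was placed by user 9849 at 2019-03-29T13:39".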

Log at the proper level

When you create a log entry, it should have an associated level of severity and importance. The common levels are the following:

  • TRACE: This log level will produce the most log entries compared to the other log levels. Use this to troubleshoot very difficult problems but never use it in production for three reasons: security, comprehension, and performance.
  • DEBUG: This is mostly used for debugging purposes during development. At this level, you want to log extra information about the workings of your application that helps you track down problems. This could be enabled in production if necessary, but only temporarily and to troubleshoot an issue.
  • INFO: Actions that are user-driven or system-specific like scheduled operations.
  • NOTICE: Notable events that are not considered an error.
  • WARN: Events that could potentially become an error or might pose a security risk.
  • ERROR: Error conditions that might still allow the application to continue running.
  • FATAL: This should not happen a lot in your application but if it does it usually terminates your program and you need to know why.
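The exact set of levels depends on the logging library. Python's standard logging module, for example, ships DEBUG, INFO, WARNING, ERROR and CRITICAL (with FATAL as an alias for CRITICAL), but has no TRACE or NOTICE. The sketch below registers those two with numeric values that are assumptions chosen to fit between the built-in ones, and shows how a threshold decides which entries are written.

```python
import io
import logging

# Hypothetical numeric values for the two levels Python's logging module lacks.
TRACE = 5    # below DEBUG (10)
NOTICE = 25  # between INFO (20) and WARNING (30)
logging.addLevelName(TRACE, "TRACE")
logging.addLevelName(NOTICE, "NOTICE")

ALL_LEVELS = (TRACE, logging.DEBUG, logging.INFO, NOTICE,
              logging.WARNING, logging.ERROR, logging.CRITICAL)


def emit_all_levels(threshold: int) -> list:
    """Log one entry per level and return the names of the levels that got through."""
    stream = io.StringIO()
    logger = logging.getLogger("levels-demo")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(threshold)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s"))
    logger.addHandler(handler)
    for level in ALL_LEVELS:
        logger.log(level, "demo entry")
    return stream.getvalue().split()
```

With the threshold set to logging.INFO, the TRACE and DEBUG entries are dropped and everything from INFO upwards is kept, which matches the advice above to keep the most verbose levels out of production.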

That is it for the theory side of things. In my next blog post, I will discuss the more technical side of logging. We will look into what kind of data is actually needed and what kind of tools you might need.