DRIVERS-2556

Add preferred structured log serialization format to the Logging spec

    • Type: Spec Change
    • Resolution: Unresolved
    • Priority: Major - P3
    • Component/s: Logging

      Summary

      The Logging spec describes two main ways that drivers should emit logs:

      1. Native language integration with logging standards using idiomatic APIs.
      2. STDOUT/file logger that can be configured with environment variables.

      To support case #2, drivers in language ecosystems without strong logging standards may need to implement their own log message serialization logic. The Logging spec describes the expected unstructured log format, but doesn't describe an expected or preferred serialization format for structured logs. For drivers teams that must implement the structured log serialization logic, it would be helpful to specify a preferred format, including:

      • Serialization format (e.g. Extended JSON)
      • Timestamp field name (e.g. "t")
      • Timestamp format (e.g. Unix nanoseconds)
      • Nesting of command/reply Extended JSON document (e.g. nest as string or nest as JSON document)
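To make the nesting question concrete, here is a minimal Python sketch of the two options. The "message" and "command" field names and the sample command are hypothetical, and plain json.dumps stands in for an Extended JSON encoder, which only works here because the sample command holds JSON-native types.

```python
import json

# Hypothetical command document, used only for illustration.
command = {"find": "users", "filter": {"age": {"$gt": 21}}}

# Option 1: nest the command as a string that holds its own JSON encoding.
as_string = json.dumps({"message": "Command started", "command": json.dumps(command)})

# Option 2: nest the command as a JSON sub-document.
as_document = json.dumps({"message": "Command started", "command": command})

# A consumer needs a second decode step for option 1 but not for option 2.
decoded_from_string = json.loads(json.loads(as_string)["command"])
decoded_from_document = json.loads(as_document)["command"]
```

Both options round-trip to the same document; the difference is whether the log consumer sees the command as an opaque string or as queryable fields.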

      We should also provide examples of drivers log messages in the preferred structured format.

      As for which format to prefer: the primary consumers of the STDOUT/file log output configured via environment variables are support engineers, who likely have existing toolchains for parsing server logs. Server logs are serialized as relaxed Extended JSON. Here's an example log from MongoDB v6.0.4:

      {
          "t": {"$date": "2023-02-17T15:24:50.699-08:00"},
          "s": "W",
          "c": "CONTROL",
          "id": 22184,
          "ctx": "initandlisten",
          "msg": "Soft rlimits for open file descriptors too low",
          "attr": {
              "currentValue": 20000,
              "recommendedMinimum": 64000
          },
          "tags": ["startupWarnings"]
      }
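As a rough illustration of why this shape is convenient for existing toolchains, the sketch below reads the example entry with Python's standard library only. It assumes the entry arrives on a single line; the summary layout is made up for this example and is not part of any tool.

```python
import json
from datetime import datetime

# The example entry from above, written on one line.
line = (
    '{"t": {"$date": "2023-02-17T15:24:50.699-08:00"}, "s": "W", '
    '"c": "CONTROL", "id": 22184, "ctx": "initandlisten", '
    '"msg": "Soft rlimits for open file descriptors too low", '
    '"attr": {"currentValue": 20000, "recommendedMinimum": 64000}, '
    '"tags": ["startupWarnings"]}'
)

entry = json.loads(line)

# The relaxed Extended JSON date is an ISO-8601 string under "$date".
timestamp = datetime.fromisoformat(entry["t"]["$date"])

# Hypothetical one-line summary a support script might print.
summary = f'{timestamp.isoformat()} [{entry["s"]}] {entry["c"]}: {entry["msg"]}'
```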
      

      Following that mongod log pattern, I propose the following preference:

      • Preferred serialization format is relaxed Extended JSON
      • Preferred timestamp field name is "t"
      • Preferred timestamp format is the default for relaxed Extended JSON
      • Preferred nesting format for command/reply documents is as strings (to guarantee that the top-level log document is always valid JSON)
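A minimal Python sketch of what a logger following these preferences might emit. Only the "t" field name comes from the proposal; the "message" and "command" field names, the helper names, and the sample values are assumptions for illustration. The date encoding follows my reading of relaxed Extended JSON (ISO-8601 string with at most millisecond precision), and json.dumps stands in for a driver's own Extended JSON encoder.

```python
import json
from datetime import datetime, timezone


def relaxed_date(ts: datetime) -> dict:
    """Encode a timezone-aware datetime as a relaxed Extended JSON date.

    Assumes the year is between 1970 and 9999, the range where relaxed
    Extended JSON uses the ISO-8601 string form.
    """
    utc = ts.astimezone(timezone.utc)
    text = utc.strftime("%Y-%m-%dT%H:%M:%S")
    millis = utc.microsecond // 1000
    if millis:
        # Keep at most millisecond precision; omit the fraction when zero.
        text += f".{millis:03d}"
    return {"$date": text + "Z"}


def format_log(ts: datetime, message: str, command: dict) -> str:
    """Serialize one log entry as a single line of JSON (hypothetical helper)."""
    entry = {
        "t": relaxed_date(ts),
        "message": message,  # hypothetical field name
        # Nested as a string so the top-level document is always valid JSON.
        "command": json.dumps(command),  # hypothetical field name
    }
    return json.dumps(entry)


line = format_log(
    datetime(2023, 2, 17, 23, 24, 50, 699000, tzinfo=timezone.utc),
    "Command started",
    {"ping": 1},
)
```

Because the command is embedded as a string, a malformed or truncated command encoding cannot break parsing of the surrounding log entry; the cost is that consumers must decode the string separately to inspect it.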

      The spec can still allow drivers teams to pick an idiomatic serialization format. However, if there is no obvious idiom to follow, drivers teams should use the preferred structured log serialization format.

      TLDR

      We should add a section to the Logging spec that says, "When you're implementing the STDOUT/file logger that's enabled by environment variables, if there's no standard structured log serialization format for your language ecosystem, here's the preferred format:

      ... preferred format here ...

      Otherwise, use whatever is idiomatic for your language ecosystem."

      Motivation

      Who is the affected end user?

      Driver devs implementing the logging specification.

      How does this affect the end user?

      They are confused because they don't have clear guidance on how to serialize structured log messages.

      How likely is it that this problem or use case will occur?

      Any driver dev working in a language ecosystem without strong logging standards will likely have to make several decisions about how to serialize structured log messages for writing to STDOUT or a file.

      If the problem does occur, what are the consequences and how severe are they?

      Drivers teams may select different serialization formats, and depending on those decisions the logs may be more or less useful to support engineers. Driver logs that diverge from the formats that support engineers' log analysis toolchains expect may also make those engineers less effective.

      Is this issue urgent?

      No.

      Is this ticket required by a downstream team?

      No.

      Is this ticket only for tests?

      No.

            Assignee:
            boris.dogadov@mongodb.com Boris Dogadov
            Reporter:
            matt.dale@mongodb.com Matt Dale