DRIVERS-2503

ConnectionId returned in heartbeats may be int64

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor - P4
    • Component/s: Handshake

      The connectionId in the hello (or legacy hello) response can be an int32, double, or int64. Many drivers assume an int32, which may result in connectionId truncation or connection failure. Drivers should ensure that the server's connectionId (and the client connectionId for consistency) is expressed as a numeric type capable of holding an int64.

      NOTE: If the client and server connectionId fields are part of the driver's public API, you may have to add new int64 connectionId fields and deprecate the existing int32 fields. On the next major version bump, the deprecated int32 fields should be removed.
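A minimal sketch of the normalization described above, written in Python purely for illustration. The function name `normalize_connection_id` and the idea of receiving the already-decoded value as a plain `int` or `float` are assumptions made for this example, not part of any driver's API. In a statically typed driver, the same check would feed a 64-bit integer field.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def normalize_connection_id(raw):
    """Return raw as an int that fits in an int64, or raise ValueError.

    `raw` stands in for the decoded connectionId value (hypothetical input):
    an int for a BSON int32 or int64, or a float for a BSON double.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool) or not isinstance(raw, (int, float)):
        raise ValueError(f"connectionId has non-numeric type {type(raw).__name__}")
    if isinstance(raw, float):
        # is_integer() is False for NaN, infinities, and fractional values.
        if not raw.is_integer():
            raise ValueError(f"connectionId {raw!r} is not a whole number")
        raw = int(raw)
    if not INT64_MIN <= raw <= INT64_MAX:
        raise ValueError(f"connectionId {raw} does not fit in an int64")
    return raw
```

For example, `normalize_connection_id(3000000000.0)` returns the integer `3000000000`, a value that is valid as an int64 but does not fit in an int32.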

      Key            Status/Resolution   FixVersion
      CDRIVER-4557   Fixed               1.24.0
      CXX-2638       Works as Designed
      CSHARP-4483    Fixed               2.20.0
      GODRIVER-2737  Fixed               1.11.7
      JAVA-4846      Done                5.0.0
      NODE-4971      Duplicate           6.4.0
      MOTOR-1084     Duplicate
      PYTHON-3571    Works as Designed
      PHPC-2220      Fixed               1.16.0
      RUBY-3206      Backlog
      RUST-1571      Fixed               2.5.0
      SWIFT-1692     Won't Do

      Summary

      In the hello response, the server may return connectionId as an int32, an int64, or even a double. Many drivers (and our specs) assume that it is an int32. This can result in connection failures in some drivers (e.g. .NET/C#) or truncation of the connectionId in others (e.g. Java).

      Motivation

      Who is the affected end user?

      End users with long-running clusters and high connection churn.

      How does this affect the end user?

      It depends on how the driver handles the overflow. Some drivers throw an exception; others truncate the connectionId. In the worst case, the user will be unable to connect to the MongoDB cluster.

      Note that some drivers like Node.js and Python aren't affected by this bug because they use arbitrary-width numeric types.
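The two failure modes can be illustrated with a small Python simulation of int32 storage. This is only a sketch: `truncate_to_int32` and `checked_to_int32` are hypothetical helpers that mimic a narrowing cast and a range-checked conversion respectively, and they are not taken from any driver's source.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def truncate_to_int32(value):
    """Keep only the low 32 bits, reinterpreted as a signed int32."""
    return (value + 2**31) % 2**32 - 2**31


def checked_to_int32(value):
    """Return value unchanged if it fits in an int32, else raise OverflowError."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in an int32")
    return value
```

With these helpers, `truncate_to_int32(2**31)` yields `-2147483648`, a connectionId that no longer matches the value in the server logs, while `checked_to_int32(2**31)` raises `OverflowError`, which in a driver would surface as a failed connection.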

      How likely is it that this problem or use case will occur?

      This issue is relatively infrequent. To overflow an int32 connectionId, the server would have to sustain a churn of 100 connections per second for roughly 8 months. At a higher churn rate, the time to overflow is proportionately shorter. The risk is mitigated by the fact that the server's connectionId counter resets on every server restart.
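The estimate above can be checked with a few lines of arithmetic. This sketch assumes the counter starts near zero after a restart and increases by one per new connection; the function name is made up for the example.

```python
INT32_MAX = 2**31 - 1
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_int32_overflow(connections_per_second):
    """Days of sustained churn before the counter first exceeds INT32_MAX."""
    # Assumption: the counter starts at zero and grows by one per connection.
    return (INT32_MAX + 1) / connections_per_second / SECONDS_PER_DAY
```

At 100 connections per second this gives about 248.6 days, or a little over 8 months. At 1,000 connections per second it drops to about 24.9 days.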

      If the problem does occur, what are the consequences and how severe are they?

      Some drivers (like .NET/C#) won't be able to connect until the affected server is restarted. Others, like Java, will simply truncate the connectionId, making it difficult or impossible to correlate client and server logs. Others, like Python and Node.js, are unaffected by this bug.

      Is this issue urgent?

      Given that restarting the affected server resolves the issue for multiple months even at high connection churn rates, this issue does not appear to be urgent at this time.

      Is this ticket required by a downstream team?

      No.

      Is this ticket only for tests?

      No.
      Does this ticket have any functional impact, or is it just test improvements?

            Assignee: James Kovacs (james.kovacs@mongodb.com)
            Reporter: James Kovacs (james.kovacs@mongodb.com)
            Esha Bhargava
            Votes: 0
            Watchers: 6