  1. Core Server
  2. SERVER-89630

Log information about failed hosts in killSessions attempt

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 8.1.0-rc0, 8.0.0-rc14
    • Affects Version/s: None
    • Component/s: None
    • Service Arch
    • Minor Change
    • v8.0
    • Programmability 2024-06-24, Programmability 2024-07-08
    • 134

      When run on routers, killSessions "fans out" by forwarding the request to all shards, in addition to doing some local work.
      When any of the remote operations fails, the whole command is considered a failure. We record the remotes that failed here: https://github.com/mongodb/mongo/blob/755a145ef93b6229b5887e6158c112a448672c9b/src/mongo/db/session/kill_sessions_common.cpp#L87-L105 and append them to the command response. However, the router command throws an exception when any such failure occurs: https://github.com/mongodb/mongo/blob/755a145ef93b6229b5887e6158c112a448672c9b/src/mongo/db/commands/kill_sessions_command.cpp#L146, so the information we collected is discarded: https://github.com/mongodb/mongo/blob/755a145ef93b6229b5887e6158c112a448672c9b/src/mongo/s/commands/strategy.cpp#L1293.

      While we probably shouldn't return this information (host and port) in the error response, we should at least log it so that it isn't lost. Otherwise, we can't tell why the router command failed or on which node the forwarded operation failed. Similarly, we should consider recording the error we got from the remote instead of collapsing it into HostUnreachable here: https://github.com/mongodb/mongo/blob/755a145ef93b6229b5887e6158c112a448672c9b/src/mongo/db/session/kill_sessions_common.cpp#L102, which makes the root cause less obvious.
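
      The proposal above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the server's code: ErrorCode, RemoteResult, FanOutError, collectFailures, formatFailures and checkFanOut are all hypothetical names, and a plain std::ostream stands in for the server's logging facility. It assumes each forwarded request yields a host, an error code and a reason; it logs every failed host with its original error, then throws an error that keeps the first remote's code and leaves host and port out of the message.

```cpp
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the server's real types.
enum class ErrorCode { OK, HostUnreachable, NetworkTimeout, Unauthorized };

std::string toString(ErrorCode code) {
    switch (code) {
        case ErrorCode::OK: return "OK";
        case ErrorCode::HostUnreachable: return "HostUnreachable";
        case ErrorCode::NetworkTimeout: return "NetworkTimeout";
        case ErrorCode::Unauthorized: return "Unauthorized";
    }
    return "Unknown";
}

// Outcome of forwarding the request to one remote host.
struct RemoteResult {
    std::string hostAndPort;
    ErrorCode code;
    std::string reason;
};

// Thrown when at least one remote failed. It carries a remote's original
// error code instead of a blanket HostUnreachable.
class FanOutError : public std::runtime_error {
public:
    FanOutError(ErrorCode code, const std::string& msg)
        : std::runtime_error(msg), _code(code) {}
    ErrorCode code() const { return _code; }

private:
    ErrorCode _code;
};

// Returns only the results whose code is not OK.
std::vector<RemoteResult> collectFailures(const std::vector<RemoteResult>& results) {
    std::vector<RemoteResult> failed;
    for (const auto& r : results) {
        if (r.code != ErrorCode::OK) {
            failed.push_back(r);
        }
    }
    return failed;
}

// Builds one log line naming each failed host and its original error.
std::string formatFailures(const std::vector<RemoteResult>& failed) {
    std::ostringstream os;
    os << "killSessions failed on " << failed.size() << " host(s):";
    for (const auto& r : failed) {
        os << " [" << r.hostAndPort << " " << toString(r.code) << ": " << r.reason << "]";
    }
    return os.str();
}

// Writes the failed hosts to the log, then throws. The exception message
// deliberately omits host and port so they never reach the client.
void checkFanOut(const std::vector<RemoteResult>& results, std::ostream& log) {
    const auto failed = collectFailures(results);
    if (failed.empty()) {
        return;
    }
    log << formatFailures(failed) << '\n';
    throw FanOutError(failed.front().code,
                      "killSessions failed on one or more remote hosts");
}
```

      A caller would pass the per-host results and a log stream, for example checkFanOut(results, std::cerr), and catch FanOutError to build the client-facing response. Logging before the throw is the point of the sketch: the details survive even if the response under construction is discarded when the exception propagates.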

            Assignee:
            hriday.sheth@mongodb.com Hriday Sheth (Inactive)
            Reporter:
            george.wangensteen@mongodb.com George Wangensteen
            Votes:
            0
            Watchers:
            4
