  1. Core Server
  2. SERVER-94026

Streams: Investigate failures like "Command... requires authentication: generic server error"

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Atlas Streams
    • Sprint 58

      We frequently observe failures like the following in prod smoke tests:

      • "$merge to immortal-aws-production-virginia-usa-SP30-kanopy-data.output-kanopy_immortal_smoke_test_221b96fc_ab848ab2 failed: Command update requires authentication: generic server error"
      • "Change stream $source immortal-aws-production-london-gbr-SP30-kanopy-data.input-kanopy_immortal_smoke_test_221b96fc_59e45b29 failed: Command aggregate requires authentication: generic server error"

      This tends to happen in our "immortal" smoke test processors. These run forever. They are inactive for 5/10 minutes at a time and then wake up to read/write data.

      1. Are the certs being correctly rotated on the local disk?
      2. Is mongocxx using the updated cert after rotation?
      3. Do we need to configure a retry knob on mongocxx? Or do we need to add retry logic in our code?
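
      If question 3 ends up pointing at retry logic in our own code, a rough sketch of what that could look like is below. It is not tied to any mongocxx API: `retryOnAuthError`, `isAuthError`, and the `reconnect` callback are hypothetical names, and matching on the "requires authentication" substring is an assumption based only on the error messages quoted above.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical predicate: treats any error whose message contains
// "requires authentication" as retryable. This matching rule is an
// assumption taken from the error strings in this ticket.
inline bool isAuthError(const std::exception& e) {
    return std::string(e.what()).find("requires authentication") != std::string::npos;
}

// Runs op(). On an auth-like failure it calls reconnect(), sleeps with
// exponential backoff, and tries again, for at most maxAttempts attempts
// in total. Any other error, or the final failed attempt, is rethrown.
template <typename Op, typename Reconnect>
auto retryOnAuthError(Op op, Reconnect reconnect, int maxAttempts,
                      std::chrono::milliseconds initialBackoff) -> decltype(op()) {
    assert(maxAttempts >= 1);
    auto backoff = initialBackoff;
    for (int attempt = 1;; ++attempt) {
        try {
            return op();
        } catch (const std::exception& e) {
            if (!isAuthError(e) || attempt >= maxAttempts) {
                throw;  // not retryable, or out of attempts
            }
        }
        reconnect();  // hypothetical hook, e.g. rebuild the client so it re-reads the cert
        std::this_thread::sleep_for(backoff);
        backoff *= 2;
    }
}
```

      The `reconnect` hook is where a processor could drop and recreate its connection after a cert rotation. Whether that is actually needed depends on the answers to questions 1 and 2.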

            Assignee:
            Unassigned
            Reporter:
            Matthew Normyle (matthew.normyle@mongodb.com)

              Created:
              Updated: