Type: Bug
Resolution: Gone away
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Replication
Labels: None

Operating System: ALL
Sprint: Sharding 2020-05-18

While running a sharded cluster on 4.4.0-rc3, I killed all 3 CSRS members by running kill <pid> against each one. The first and third members shut down successfully, but the second member did not shut down. My best guess at the most relevant error is this:

{"t":{"$date":"2020-04-28T14:26:47.203+00:00"},"s":"W", "c":"STORAGE", "id":20561,  "ctx":"SignalHandler","msg":"Error stepping down in non-command initiated shutdown path","attr":{"error":{"code":189,"codeName":"PrimarySteppedDown","errmsg":"While waiting for secondaries to catch up before stepping down, this node decided to step down for other reasons"}}}

Attaching the full logs. "member2_9008" is the process that failed to shut down.
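
For anyone going through the attached logs, here is a minimal sketch of how the warning above could be located. It assumes each log line is a single JSON object, as in the line quoted above, and matches on the log id 20561 from that line. The function name and the sample usage are hypothetical and not part of any MongoDB tooling.

```python
import json

# Log id copied from the warning quoted in this ticket.
STEPDOWN_SHUTDOWN_LOG_ID = 20561


def find_stepdown_shutdown_warnings(lines):
    """Yield (timestamp, codeName, errmsg) for each line with the matching log id.

    Hypothetical helper: lines that are not JSON objects are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if entry.get("id") != STEPDOWN_SHUTDOWN_LOG_ID:
            continue
        error = entry.get("attr", {}).get("error", {})
        yield entry["t"]["$date"], error.get("codeName"), error.get("errmsg")


# The log line quoted in this ticket, used as sample input.
sample = '{"t":{"$date":"2020-04-28T14:26:47.203+00:00"},"s":"W", "c":"STORAGE", "id":20561,  "ctx":"SignalHandler","msg":"Error stepping down in non-command initiated shutdown path","attr":{"error":{"code":189,"codeName":"PrimarySteppedDown","errmsg":"While waiting for secondaries to catch up before stepping down, this node decided to step down for other reasons"}}}'

for timestamp, code_name, errmsg in find_stepdown_shutdown_warnings([sample]):
    print(timestamp, code_name, errmsg)
```

To scan a whole file, the same function can be given an open file handle, for example `find_stepdown_shutdown_warnings(open("member2_9008.log"))`.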

Note – this doesn't happen every time, but I can reliably trigger it again.

Spoke to judah.schvimer and he recommended filing a bug directly.

member3_9009.log (1.15 MB, Apr 29 2020 06:45:30 PM UTC)
member2_9008.log (14.21 MB, Apr 29 2020 06:45:37 PM UTC)
member1_9007.log (26.63 MB, Apr 29 2020 06:45:43 PM UTC)

Assignee: Janna Golden

Reporter: Louisa Berger

Participants: Janna Golden, Judah Schvimer, Lingzhi Deng, Louisa Berger, Tess Avitabile

Votes: 0

Watchers: 8

Created: Apr 29 2020 06:45:44 PM UTC

Updated: Jan 08 2024 03:23:10 PM UTC

Resolved: May 08 2020 03:02:26 PM UTC
