Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.4.0-rc11, 4.7.0
Affects Version/s: 4.4.0-rc6
Component/s: Diagnostics, Networking
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4
Linked BF Score: 200
Confidence Status: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

I ran the repro script from SERVER-48395 while sending mongod a SIGUSR2 every 10 seconds to collect stack traces. This resulted in network errors and failed operations in the application:

Error: error doing query: failed: network error while attempting to run command 'insert' on host '127.0.0.1:27017'  :

accompanied by messages from the mongo shell like

{"t":{"$date":"2020-05-25T15:07:29.723Z"},"s":"I",  "c":"NETWORK",  "id":20120,   "ctx":"js","msg":"Trying to reconnnect","attr":{"connString":"127.0.0.1:27017 failed"}}
{"t":{"$date":"2020-05-25T15:07:29.724Z"},"s":"I",  "c":"NETWORK",  "id":20125,   "ctx":"js","msg":"DBClientConnection failed to receive message","attr":{"connString":"127.0.0.1:27017","error":"HostUnreachable: Connection closed by peer"}}

The mongod logs tell a similar story:

The blue markers on the timeline show the points at which SIGUSR2 was received. Each is accompanied by some number of connections ended (red curve) and a smaller number of connections accepted (blue curve), resulting in a net decrease in connections each time (green curve).

I wonder whether we are failing to retry network operations when they return the EINTR that would result from SIGUSR2.
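
For illustration only, the kind of retry this hypothesis refers to can be sketched as below. This is a minimal sketch under stated assumptions, not mongod's actual networking code or the fix that was applied: `retryOnEintr` and `readRetrying` are hypothetical names, and the sketch assumes a POSIX-style call that returns -1 and sets `errno` on failure. A blocking call interrupted by a signal handler (such as the SIGUSR2 stack-trace handler) fails with `EINTR`, and the caller has to reissue it rather than treat it as a connection error.

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not a mongod API): re-invoke a syscall-like callable
// for as long as it fails with -1 and errno == EINTR. Any other result,
// including a failure with a different errno, is returned to the caller.
template <typename Call>
auto retryOnEintr(Call call) -> decltype(call()) {
    decltype(call()) rc;
    do {
        rc = call();
    } while (rc == -1 && errno == EINTR);
    return rc;
}

// Hypothetical usage: a read() that is reissued if a signal interrupts it,
// instead of surfacing the interruption as a network failure.
inline ssize_t readRetrying(int fd, void* buf, std::size_t len) {
    return retryOnEintr([&] { return ::read(fd, buf, len); });
}
```

Installing the handler with `SA_RESTART` makes the kernel restart some interrupted calls automatically, but calls such as `poll` are not restarted that way, so an explicit loop like this is still commonly needed around them.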

Attachments:
sigusr2.png (79 kB, May 25 2020 03:53:50 PM UTC)
db.log (1.85 MB, May 25 2020 03:55:43 PM UTC)

is related to: SERVER-47229 Make TransportSessionASIO cancelation level triggered (Closed)
related to: SERVER-48395 Extended stalls during heavy insert workload (Closed)
related to: SERVER-33445 Add signal handler to generate stack traces (Closed)

Assignee: Billy Donahue
Reporter: Bruce Lucas (Inactive)
Participants: Benjamin Caimano, Billy Donahue, Bruce Lucas, Githook User
Votes: 0
Watchers: 10
Created: May 25 2020 03:54:24 PM UTC
Updated: Dec 13 2024 02:29:13 PM UTC
Resolved: May 30 2020 06:05:58 PM UTC
Confidence Status Last Update: 26/May/20 11:07 PM