Type: Question
Resolution: Incomplete
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 2.6.6
Component/s: Replication
Labels: None

We have a problem with our replica set. It runs on three virtual servers, and if any one of the mongod processes goes down, the set normally keeps working with the remaining members. However, if one of the servers disappears completely, i.e. stops responding to network traffic at all (for example when its network interface is taken down, all of its outgoing traffic is blocked by a firewall, or the server is powered off suddenly), every query to the replica set takes an extra 15 seconds. Judging from the network traffic, the delay is caused by TCP retransmits.

The extra 15 seconds on every query makes our load balancer think all nodes are down, so it stops sending traffic to the whole setup.
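One way to tell a silently unreachable member apart from one whose mongod has merely stopped is to time a plain TCP connect to each member. The sketch below is only a diagnostic aid under stated assumptions, not a fix: `probe` is a hypothetical helper name, and it uses nothing but the Python standard library. A stopped mongod on a reachable host normally refuses the connection at once, while a host that drops all packets keeps the caller waiting until the timeout expires.

```python
import socket
import time


def probe(host, port, timeout_s=1.0):
    """Attempt a TCP connect and return (reachable, elapsed_seconds).

    A refused connection fails almost immediately. A host that silently
    drops packets blocks until timeout_s has passed, which is the
    pattern that shows up as TCP retransmits on the wire.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts and
        # DNS failures) when the connect does not succeed.
        with socket.create_connection((host, port), timeout=timeout_s):
            reachable = True
    except OSError:
        reachable = False
    return reachable, time.monotonic() - start
```

Calling `probe(host, 27017)` for each member (27017 is the default mongod port; substitute the real host names and ports) and comparing the elapsed times should show whether the missing member fails fast or only after the full timeout.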

Since connecting to the other replica set members with the mongo shell works fine, we originally reported this as a bug in the node.js driver (https://jira.mongodb.org/browse/NODE-350). We later tried the PHP driver and were able to reproduce similar (although not identical) behaviour.

We also reproduced this problem in our secondary setup in another data center, so it shouldn't be specific to one data center. Both might be running the same virtualization platform, though; we haven't looked into that yet.

Any ideas on how to move forward with this?

Assignee: Andy Schwerin

Reporter: Kalle Varisvirta

Participants: Andy Schwerin, Kalle Varisvirta

Votes: 1

Watchers: 10

Created: Jan 23 2015 08:04:22 AM UTC

Updated: Mar 25 2015 07:51:14 PM UTC

Resolved: Mar 25 2015 07:51:14 PM UTC
