Type: Bug
Resolution: Duplicate
Priority: Major - P3
Affects Version/s: 3.2.10
Component/s: Replication
Operating System: ALL
(copied to CRM)
Under PV1, when using a PSA (or PSSSA) replset spread across three data centres, the primary node flaps between DC1 and DC2 every 10 seconds during a netsplit between DC1 and DC2. Each data centre receives roughly half the writes (assuming roughly constant write traffic). When the netsplit is resolved, the writes accepted in the data centre that ends up without the primary are rolled back.
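For concreteness, here is a minimal sketch of the assumed topology in the mongo shell (hostnames and ports are illustrative, not taken from this report). Note that the pv1 default of settings.electionTimeoutMillis = 10000 ms matches the 10-second flapping interval described below.

rs.initiate({
  _id: "rs0",
  protocolVersion: 1,
  members: [
    { _id: 0, host: "dc1-node.example:27017" },                      // data-bearing, DC1
    { _id: 1, host: "dc2-node.example:27017" },                      // data-bearing, DC2
    { _id: 2, host: "dc3-arbiter.example:27017", arbiterOnly: true } // arbiter, DC3
  ],
  settings: { electionTimeoutMillis: 10000 } // pv1 default; drives the 10 s cycle
})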
When the netsplit occurs, the following sequence of events happens:
1. The secondary in DC2 is unable to contact a primary for 10 seconds (the pv1 election timeout) and calls an election for a new term.
2. The DC3 arbiter announces the new term to DC1.
3. The DC1 primary steps down.
4. Client connections are dropped.
5. The node in DC2 is elected primary.
6. Clients reconnect and find DC2 is now primary; DC2 starts accepting writes (see the write-concern sketch after this list).
7. 10 seconds later, the node in DC1 still hasn't been able to contact a primary, and the process repeats in the other direction.
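Because every write accepted on the minority side of the split is eventually rolled back, clients that cannot tolerate losing acknowledged writes can use a majority write concern. In a PSA set the arbiter cannot acknowledge writes, so a majority write needs both data-bearing nodes; during the netsplit it times out instead of being acknowledged and later rolled back. A sketch in the mongo shell (collection and document are made up):

db.test.insert(
  { x: 1 },
  { writeConcern: { w: "majority", wtimeout: 5000 } } // fails with wtimeout during the split rather than silently losing the write
)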
Here is a snippet of logs from the arbiter demonstrating the flapping behaviour:
2016-10-19T22:49:47.655+0000 I REPL [ReplicationExecutor] Member 10.0.0.102:27018 is now in state SECONDARY
2016-10-19T22:49:47.669+0000 I REPL [ReplicationExecutor] Member 10.0.0.101:27017 is now in state PRIMARY
2016-10-19T22:49:57.672+0000 I REPL [ReplicationExecutor] Member 10.0.0.102:27017 is now in state PRIMARY
2016-10-19T22:50:02.672+0000 I ASIO [ReplicationExecutor] dropping unhealthy pooled connection to 10.0.0.101:27017
2016-10-19T22:50:02.672+0000 I ASIO [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-10-19T22:50:02.673+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 10.0.0.101:27017
2016-10-19T22:50:02.674+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 10.0.0.101:27017
2016-10-19T22:50:02.675+0000 I REPL [ReplicationExecutor] Member 10.0.0.101:27017 is now in state SECONDARY
2016-10-19T22:50:12.676+0000 I ASIO [ReplicationExecutor] dropping unhealthy pooled connection to 10.0.0.102:27017
2016-10-19T22:50:12.676+0000 I ASIO [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-10-19T22:50:12.676+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 10.0.0.102:27017
2016-10-19T22:50:12.677+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 10.0.0.102:27017
2016-10-19T22:50:12.677+0000 I REPL [ReplicationExecutor] Member 10.0.0.101:27018 is now in state PRIMARY
2016-10-19T22:50:12.678+0000 I REPL [ReplicationExecutor] Member 10.0.0.102:27017 is now in state SECONDARY
2016-10-19T22:50:22.665+0000 I REPL [ReplicationExecutor] Member 10.0.0.102:27018 is now in state PRIMARY
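The same flapping is also visible from any client by polling isMaster; a quick mongo shell loop (the 1-second interval is arbitrary):

while (true) {
  print(new Date().toISOString() + " primary: " + tojson(db.isMaster().primary));
  sleep(1000); // the reported primary should alternate between DC1 and DC2 roughly every 10 s
}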
N.B. Flapping does not occur with PSS/PV1 or PSA/PV0.
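Given that note, one possible workaround until SERVER-27125 is addressed is to reconfigure the set back to protocol version 0, which is essentially what SERVER-26725 proposes automating. A sketch, run against the current primary:

cfg = rs.conf()
cfg.protocolVersion = 0
rs.reconfig(cfg) // or rs.reconfig(cfg, { force: true }) from a secondary if no primary is reachable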
duplicates:
- SERVER-27125 Arbiters in pv1 should vote no in elections if they can see a healthy primary of equal or greater priority to the candidate (Closed)
related to:
- SERVER-14539 Full consensus arbiter (i.e. uses an oplog) (Backlog)
- SERVER-26728 Add jstest that primary doesn't flap in PSA configuration with partition between the two data bearing nodes (Closed)
- SERVER-26725 Automatically reconfig pv1 replica sets using priorities or arbiters to pv0 (Closed)