The result was that checkpoints could be written that included data newer than the stable timestamp, i.e. data visibility rules were ignored.
Attached is a starter set of data. It contains a fresh replica set secondary and an oplog with data generated by find9.js followed by update_multi3.js.
`find9.js` inserts a number of large documents (creating cache pressure), and `update_multi3.js` produces a sequence of oplog entries which, if replayed, result in an insert into a drop-pending namespace and cause a crash.
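For reference, similar data can be regenerated by running the two jstests against the replica set primary while the secondary is syncing. The paths and port below are assumptions, not the exact commands used to build the attached archive, and they do not by themselves recreate the half-finished two-phase drop captured in the archive.

# Hypothetical regeneration of similar data; adjust paths/port to your setup.
./mongo --port 27017 jstests/core/find9.js          # large documents -> cache pressure
./mongo --port 27017 jstests/core/update_multi3.js  # create/insert/drop sequence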
To reproduce, first unarchive the attached file:
tar -xzf repro_data.tgz
Then repeatedly run mongod and interrupt it with Ctrl-C[1]:
./mongod --dbpath data/db1/ --replSet rs0 --wiredTigerCacheSizeGB 1
Note both the --replSet rs0 and --wiredTigerCacheSizeGB 1 flags. The first is required to replay the oplog entries; the second, I believe, is required to reproduce the bug.
Keep the process up long enough for startup oplog recovery to complete. The way the repro works, one run causes corruption on shutdown and the follow-up run crashes while starting up.
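If it helps, the run-and-interrupt cycle can be scripted. The loop below is only a sketch of one way to drive it; the 30-second wait is a guess and should be checked against the log[1].

#!/bin/bash
# Hypothetical driver for the repro cycle: start mongod, give startup oplog
# recovery time to finish, interrupt it, and repeat until a run crashes.
while true; do
    ./mongod --dbpath data/db1/ --replSet rs0 --wiredTigerCacheSizeGB 1 >> mongod_repro.log 2>&1 &
    pid=$!
    sleep 30                      # allow startup oplog recovery to complete (verify in the log)
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "mongod exited on its own (likely the startup crash); see mongod_repro.log"
        break
    fi
    kill -INT "$pid"              # same effect as Ctrl-C
    wait "$pid"
done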
More details on what the process does and what goes wrong:
Upon opening the data files, mongod queries the "recovery" timestamp of the WiredTiger checkpoint. That value is then set as the stable timestamp.
The timestamp is also passed back to replication, which replays all oplog entries from the recovery timestamp to the top of the oplog.
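These steps can be confirmed in the mongod log; the search patterns below are loose guesses, since the exact message text varies by version.

# Loose search for recovery/stable-timestamp and oplog-replay messages; treat
# the patterns as a starting point, not exact matches.
grep -iE 'recovery timestamp|stable timestamp|replication recovery' mongod_repro.log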
The next repro step is to shut down the node. None of the data written during replay should persist on disk as part of the stable checkpoint taken on close; the next startup will replay from the beginning again. However, in a run that reproduces the bug, the node is left in a state where the collection metadata (stored in table:_mdb_catalog) contains the collections created (and dropped) by the previous run.
In particular, what causes the problem is that the dataset was created by performing only half of the two-phase drop on the update_multi3 collection. When recovery replays the inserts, replication recovery crashes because it tries to insert into this "drop pending" collection.
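One way to check whether the bad catalog state persisted, assuming a wt utility built from a WiredTiger version compatible with the data files, is to dump the catalog table directly. The catalog values are BSON, so this is only a rough inspection.

# Rough inspection of the persisted _mdb_catalog; values are BSON, so pipe
# through strings and look for drop-pending namespace names.
wt -h data/db1 dump table:_mdb_catalog | strings | grep -i drop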
[1] Replication recovery has completed when mongod starts spamming the following log lines every half second:
2018-03-26T22:22:27.560-0400 I REPL [replexec-2] Member localhost:27018 is now in state RS_DOWN
2018-03-26T22:22:28.060-0400 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to localhost:27018
2018-03-26T22:22:28.060-0400 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to localhost:27018 - HostUnreachable: Connection refused
2018-03-26T22:22:28.060-0400 I ASIO [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to localhost:27018 due to failed operation on a connection
2018-03-26T22:22:28.060-0400 I REPL_HB [replexec-0] Error in heartbeat (requestId: 7) to localhost:27018, response status: HostUnreachable: Connection refused
Issue links:
- is depended on by: SERVER-34241 Remove the skipValidationNamespaces for config.transactions when WT-3998 is fixed (Closed)
- is related to: WT-3980 failure returning a modified update without a backing "real" update (Closed)
- is related to: SERVER-34075 powercycle_replication* must run replication recovery to observe canary documents (Closed)