-
Type: Bug
-
Resolution: Gone away
-
Priority: Major - P3
-
None
-
Affects Version/s: 3.5.13
-
Component/s: Replication
-
Replication
-
ALL
-
v3.6
-
Repl 2017-12-04
-
15
Certain errors are not transient but don't bubble up as UnrecoverableRollbackErrors, so rollback's continuously retry logic will hit the same error every time. The error here seems to be related to new testing of applyOps, since all other commands have valid DBNames.
Please see the attached file for a repro.
d20003| 2017-10-09T18:08:53.972-0400 I REPL [rsBackgroundSync] scheduling fetcher to read remote oplog on robert-macbook-pro.local:20001 starting at filter: { ts: { $gte: Timestamp 1507586707000|12 } } d20003| 2017-10-09T18:08:53.973-0400 I REPL [rsBackgroundSync] Starting rollback due to OplogStartMissing: Our last op time fetched: { ts: Timestamp 1507586707000|12, t: 1 }. source's GTE: { ts: Timestamp 1507586718000|1, t: 2 } hashes: (-3080319662268009517/5545839650998093505) d20003| 2017-10-09T18:08:53.973-0400 I REPL [rsBackgroundSync] Replication commit point: { ts: Timestamp 1507586706000|4, t: 1 } d20003| 2017-10-09T18:08:53.973-0400 I REPL [rsBackgroundSync] Rollback using the 'rollbackViaRefetch' method due to startup server parameter. d20003| 2017-10-09T18:08:53.973-0400 I REPL [rsBackgroundSync] transition to ROLLBACK d20003| 2017-10-09T18:08:53.973-0400 I NETWORK [rsBackgroundSync] Skip closing connection for connection # 35 d20003| 2017-10-09T18:08:53.973-0400 I NETWORK [rsBackgroundSync] Skip closing connection for connection # 34 d20003| 2017-10-09T18:08:53.974-0400 I NETWORK [conn77] end connection 10.4.127.25:59759 (2 connections now open) d20003| 2017-10-09T18:08:53.974-0400 I ROLLBACK [rsBackgroundSync] Starting rollback. Sync source: robert-macbook-pro.local:20001 b20001| 2017-10-09T18:08:53.975-0400 I NETWORK [main] connection accepted from 10.4.127.25:59760 #70 (1 connection now open) d20004| 2017-10-09T18:08:53.975-0400 I NETWORK [listener] connection accepted from 10.4.127.25:59761 #71 (8 connections now open) d20004| 2017-10-09T18:08:53.975-0400 I NETWORK [conn71] received client metadata from 10.4.127.25:59761 conn: { driver: { name: "MongoDB Internal Client", version: "3.5.13-325-g155db50-patch-59d98345e3c331107a001198" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "17.0.0" } } d20003| 2017-10-09T18:08:53.976-0400 I ROLLBACK [rsBackgroundSync] Finding the Common Point d20003| 2017-10-09T18:08:53.976-0400 I ROLLBACK [rsBackgroundSync] our last optime: Oct 9 18:05:07:c d20003| 2017-10-09T18:08:53.976-0400 I ROLLBACK [rsBackgroundSync] their last optime: Oct 9 18:05:19:34 d20003| 2017-10-09T18:08:53.976-0400 I ROLLBACK [rsBackgroundSync] diff in end of log times: -12 seconds d20003| 2017-10-09T18:08:53.977-0400 I ROLLBACK [rsBackgroundSync] Rollback common point is { ts: Timestamp 1507586706000|4, t: 1 } d20003| 2017-10-09T18:08:53.977-0400 I ROLLBACK [rsBackgroundSync] Starting refetching documents d20003| 2017-10-09T18:08:53.977-0400 D ROLLBACK [rsBackgroundSync] Refetching document, namespace: test.one, _id: _id: "ISKRjIN6" d20003| 2017-10-09T18:08:53.977-0400 I ROLLBACK [rsBackgroundSync] Finished refetching documents. Total size of documents refetched: 1 d20003| 2017-10-09T18:08:53.977-0400 I ROLLBACK [rsBackgroundSync] Checking the RollbackID and updating the MinValid if necessary d20003| 2017-10-09T18:08:53.977-0400 I ROLLBACK [rsBackgroundSync] Setting minvalid to { ts: Timestamp 1507586719000|52, t: 2 } d20003| 2017-10-09T18:08:53.978-0400 I ROLLBACK [rsBackgroundSync] Dropping collections to roll back create operations d20003| 2017-10-09T18:08:53.978-0400 E - [rsBackgroundSync] Assertion: 28539:need a valid database name src/mongo/db/concurrency/d_concurrency.cpp 198 d20003| 2017-10-09T18:08:53.982-0400 W ROLLBACK [rsBackgroundSync] Rollback cannot complete at this time (retrying later): Location28539: need a valid database name appliedThrough= { ts: Timestamp 1507586707000|12, t: 1 } minvalid= { ts: Timestamp 1507586719000|52, t: 2 }
- is related to
-
SERVER-31488 dropping a drop-pending collection via applyOps is a no-op
- Closed
-
SERVER-31489 Rollback of applyOps fails if embedded oplog entries don't have timestamps
- Closed
- related to
-
SERVER-31543 rollback uses non-system temporary namespaces for working collections when handling rename conflicts
- Closed