-
Type: Bug
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Querying, Replication
-
Environment:
Master: MongoDB 3.4.5
Slave: MongoDB 3.4.10 (tried 3.4.5, the result was the same)
Storage engine: WiredTiger
Total DB size: 533 GB (/var/lib/mongodb dir size)
oplog size: 50 GB (log length start to end: 6-15 hours on our workload)
Hardware: HP DL360, CPU - 1x Xeon E5-2640 v3 @ 2.60GHz, 378 GB RAM, 4xSAS 2.5" 15K RAID 10, 1Gbit LAN.
-
Replication
-
ALL
-
There is a problem with an initial sync. Several attempts have failed with the following error:
CappedPositionLost: CollectionScan died due to position in capped collection being deleted
The capped collections on which the errors occurred are 30 to 100 GB in size.
On our workload, the capped collections' "capacity" (the time before each document is deleted) varies between 24 and 60 hours.
Here is more detailed information about the collections:
Errors (CappedPositionLost)  Collection name   Capped size  Capacity
6                            DB1.collection1   40G          2.46 weeks
3                            DB1.collection14  37G          min 56h
3                            DB1.collection2   30G          min 36h
9                            DB1.collection9   100G         min 24h
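To put the figures above in perspective, here is a rough sketch of the average overwrite rate each capped collection implies. It assumes every collection is full, that writes are spread evenly over time, and that "G" means GB; the helper names are hypothetical and not part of MongoDB or any of its tools.

```python
# Hypothetical helper, not part of MongoDB: estimates how fast each capped
# collection overwrites itself, using the sizes and capacities listed above.
HOURS_PER_WEEK = 7 * 24

# (collection name, capped size in GB, capacity in hours), from the list above.
# The "min" capacities are used as-is, so the rates are worst-case averages.
COLLECTIONS = [
    ("DB1.collection1", 40, 2.46 * HOURS_PER_WEEK),
    ("DB1.collection14", 37, 56),
    ("DB1.collection2", 30, 36),
    ("DB1.collection9", 100, 24),
]


def churn_gb_per_hour(size_gb, capacity_hours):
    """Average overwrite rate, assuming a full collection and uniform writes."""
    return size_gb / capacity_hours


def churn_report(collections):
    """Map each collection name to its overwrite rate in GB/hour (2 decimals)."""
    return {
        name: round(churn_gb_per_hour(size_gb, hours), 2)
        for name, size_gb, hours in collections
    }


if __name__ == "__main__":
    for name, rate in churn_report(COLLECTIONS).items():
        print(f"{name}: ~{rate} GB/hour")
```

Under these assumptions, the result is roughly the read rate a forward scan of a collection would have to sustain to stay ahead of the documents being deleted behind it.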
The logs for the 7 failed attempts to perform the initial sync are attached.
Currently only one live instance is left in the replica set on our production system. Please help us bring the replica back up.
- depends on
-
SERVER-16049 Replicate capped collection deletes explicitly
- Closed
- is depended on by
-
TOOLS-1636 mongodump fails when capped collection position lost
- Waiting (Blocked)
- is duplicated by
-
SERVER-33652 Restarting oplog query due to error: Restarting oplog query due to error: OperationFailed: GetMore command executor error
- Closed
- related to
-
SERVER-12293 initial sync of a capped collection can often fail if highly transient
- Backlog
-
TOOLS-1636 mongodump fails when capped collection position lost
- Waiting (Blocked)