- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- None
- Affects Version/s: 3.2.4
- Component/s: Replication, WiredTiger
- ALL
When adding a new member to a replica set, while syncing across a relatively slow network connection (< 200 Mbps), we're seeing replication fail with the WiredTiger error "No space left on device".
However, there is substantial space left on EVERY disk on the system, including the one specifically mounted for Mongo.
/dev/sda1 880G 35G 801G 5% /
/dev/sdb2 187G 2.3G 175G 2% /mongodb
The log shows a bunch of index creation messages, tens of thousands of them in the hours prior, followed by:
2016-03-31T00:19:05.902-0500 I STORAGE [rsSync] copying indexes for: { name: "IUS", options: {} }
2016-03-31T00:19:05.907-0500 I STORAGE [rsSync] copying indexes for: { name: "A2V", options: {} }
2016-03-31T00:19:06.414-0500 E STORAGE [rsSync] WiredTiger (28) [1459401546:414114][23631:0x7fdbb377f700], WT_SESSION.create: /mongodb/wt/SomeCustomer/index/47443--3141513892672868567.wt: No space left on device
2016-03-31T00:19:06.426-0500 E REPL [rsSync] 8 28: No space left on device
2016-03-31T00:19:06.426-0500 E REPL [rsSync] initial sync attempt failed, 9 attempts remaining
2016-03-31T00:19:06.608-0500 I NETWORK [initandlisten] connection accepted from 127.0.0.1:33991 #61 (4 connections now open)
2016-03-31T00:19:06.614-0500 I NETWORK [conn61] end connection 127.0.0.1:33991 (3 connections now open)
2016-03-31T00:19:08.011-0500 W FTDC [ftdc] Uncaught exception in 'FileNotOpen: Failed to open interim file /mongodb/wt/diagnostic.data/metrics.interim.temp' in full-time diagnostic data capture subsystem. Shutting down the full-time diagnostic data capture subsystem.
2016-03-31T00:19:11.426-0500 I REPL [rsSync] initial sync pending
2016-03-31T00:19:11.429-0500 I REPL [ReplicationExecutor] syncing from: SomeServer3:27017
2016-03-31T00:19:11.447-0500 I REPL [rsSync] initial sync drop all databases
2016-03-31T00:19:11.447-0500 I STORAGE [rsSync] dropAllDatabasesExceptLocal 73
2016-03-31T00:19:37.769-0500 I REPL [rsSync] initial sync clone all databases
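A possible follow-up check, offered as an assumption rather than a confirmed diagnosis: the disk figures above only cover block usage, and "No space left on device" (errno 28, ENOSPC) can also be returned when a filesystem has run out of free inodes while blocks remain available. Since the failing call creates a new .wt file after tens of thousands of index creations, the inode count on /mongodb may be worth ruling out. A minimal Python sketch that reports both figures for a mount point (the helper names `fs_usage` and `describe` are hypothetical, not part of any MongoDB tooling):

```python
import os


def fs_usage(path):
    """Return (free_bytes, free_inodes, total_inodes) for the filesystem holding path."""
    st = os.statvfs(path)
    # f_bavail / f_favail are the blocks / inodes available to unprivileged users.
    free_bytes = st.f_bavail * st.f_frsize
    return free_bytes, st.f_favail, st.f_files


def describe(path):
    """Format block and inode availability for path on one line."""
    free_bytes, free_inodes, total_inodes = fs_usage(path)
    return "{}: {:.1f} GiB free, {} of {} inodes free".format(
        path, free_bytes / 2**30, free_inodes, total_inodes
    )


if __name__ == "__main__":
    # "/mongodb" is the mount point from this report; it is skipped if absent.
    for mount in ("/", "/mongodb"):
        if os.path.exists(mount):
            print(describe(mount))
```

If the inode count turns out to have plenty of headroom, this explanation can be ruled out and the cause lies elsewhere.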