-
Type: Bug
-
Resolution: Works as Designed
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Replication
-
None
-
Environment: MongoDB 4.4.3 Community Edition
Running on Red Hat Linux
-
ALL
-
Repl 2021-07-12
I have a sharded cluster for a rather large application that inserts around 15'000 documents every second.
db.getCollection('sessions').insertOne({ t: ISODate(), /* some more fields ... */ }) // 10'000-20'000 inserts per second!
Once per hour I run a bucketing operation. In principle it looks like this:
db.getCollection('sessions').renameCollection('sessions.temp');
db.getCollection('sessions').createIndexes([{ t: 1 }], {}, 1);
db.getCollection('sessions.temp').aggregate([
   { $group: ... },
   { $out: "sessions.temp.stats" }
]);
db.getCollection('sessions.temp.stats').aggregate([
   // ...
   { $merge: { into: { db: "data", coll: "session.statistics" } } }
]);
db.getCollection('sessions.temp').aggregate([
   { $group: ... },
   { $merge: { into: { db: "data", coll: "sessions.20210613" } } }
]);
db.getCollection('sessions.temp.stats').drop({ writeConcern: { w: 0, wtimeout: 60000 } });
db.getCollection('sessions.temp').drop({ writeConcern: { w: 0, wtimeout: 60000 } });
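The last $merge above writes into a collection whose name appears to carry a date ("sessions.20210613"). As a minimal sketch of how such a name could be built, assuming the suffix is the UTC date formatted as YYYYMMDD (the ticket does not say how the name is derived, and bucketCollectionName is a hypothetical helper, not part of the original job):

```javascript
// Hypothetical helper: builds a daily bucket collection name such as "sessions.20210613".
// Assumption: the suffix is the UTC calendar date as YYYYMMDD; the ticket does not state
// which time zone or naming rule the real job uses.
function bucketCollectionName(date, prefix = "sessions") {
  const year = date.getUTCFullYear();
  // getUTCMonth() is zero-based, so add 1 before padding to two digits.
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  const day = String(date.getUTCDate()).padStart(2, "0");
  return `${prefix}.${year}${month}${day}`;
}
```

The result could then be passed as the coll value of the $merge stage instead of a hard-coded name.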
When I drop a replica set member and restart it, an initial sync starts as expected. The initial sync takes around 8 hours, so the bucketing job above runs while the initial sync is in progress (without any problems).
However, once all databases are cloned (rs.status() reports "databases: {databasesToClone: 0, databasesCloned: 9 ...") I get thousands of errors like:
Error applying inserts in bulk. Trying first insert as a lone insert","attr":{"groupedInserts": ...
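The point at which the errors start (all databases cloned) can be detected from the status document. The following is a rough sketch, assuming the document is obtained in the mongo shell with db.adminCommand({ replSetGetStatus: 1, initialSync: 1 }) and that the field names match the rs.status() output quoted above; cloningFinished is a hypothetical helper name:

```javascript
// Hypothetical helper: returns true once the initial sync has no databases left to clone.
// Assumption: `status` is the result of
//   db.adminCommand({ replSetGetStatus: 1, initialSync: 1 })
// and contains initialSyncStatus.databases.databasesToClone as quoted in the ticket.
function cloningFinished(status) {
  const databases =
    status && status.initialSyncStatus && status.initialSyncStatus.databases;
  if (!databases) {
    // No initial sync information available (e.g. the member is not syncing).
    return false;
  }
  return databases.databasesToClone === 0;
}
```

A member for which this returns true has finished cloning and is in the phase where the errors above were observed.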
See attached log file for more details.
I get many thousands of these errors. The disk runs out of space and MongoDB stops working!
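To put a number on "many thousands", the occurrences can be counted in the log. This is only a sketch, assuming the structured JSON log format of MongoDB 4.4 (one JSON document per line, with the message text in the msg field); countBulkInsertErrors is a hypothetical helper and takes the log lines as an array of strings:

```javascript
// Message text as it appears in the ticket.
const BULK_INSERT_MSG =
  "Error applying inserts in bulk. Trying first insert as a lone insert";

// Hypothetical helper: counts log entries whose "msg" equals BULK_INSERT_MSG.
// Assumption: each element of `lines` is one line of a MongoDB 4.4 JSON log.
function countBulkInsertErrors(lines) {
  let count = 0;
  for (const line of lines) {
    // Cheap pre-filter so unrelated lines are not parsed at all.
    if (!line.includes(BULK_INSERT_MSG)) continue;
    try {
      const entry = JSON.parse(line);
      if (entry.msg === BULK_INSERT_MSG) count += 1;
    } catch (err) {
      // Truncated or non-JSON line: skip it rather than abort the count.
    }
  }
  return count;
}
```

In Node, the lines could be read with require("fs").readFileSync(path, "utf8").split("\n"), where path points at the mongod log file.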
If I disable the hourly bucketing job, the initial sync runs without any problem. So I assume the issue is caused by dropping/renaming collections, re-use of collection names, etc.
-
Related to: SERVER-58164 When group insert fails, the error type is not printed in logs. (Closed)