  1. Core Server
  2. SERVER-57671

Initial Sync fails when collections are renamed or dropped

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Replication
    • Environment:
      Mongo 4.4.3 Community Edition
      Running on Redhat Linux
    • ALL
    • Repl 2021-07-12

      I have a sharded cluster for a rather large application that inserts around 15'000 documents every second.

      db.getCollection('sessions').insertOne({ t: ISODate() /* , some more fields ... */ }); // 10'000-20'000 inserts per second!
      

      Once per hour I run a bucketing operation. In principle it looks like this:

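      // Move the live collection aside so it can be processed.
      db.getCollection('sessions').renameCollection('sessions.temp');
      // Recreate the index on 'sessions' (third argument is the commit quorum).
      db.getCollection('sessions').createIndexes([{ t: 1 }], {}, 1);
      // Aggregate the renamed data into a temporary statistics collection.
      db.getCollection('sessions.temp').aggregate([
         { $group: ... },
         { $out: "sessions.temp.stats" }
      ]);
      // Merge the temporary statistics into the permanent statistics collection.
      db.getCollection('sessions.temp.stats').aggregate([
         // ...
         { $merge: { into: { db: "data", coll: "session.statistics" } } }
      ]);
      // Merge the grouped session data into the daily bucket collection.
      db.getCollection('sessions.temp').aggregate([
         { $group: ... },
         { $merge: { into: { db: "data", coll: "sessions.20210613" } } }
      ]);
      // Drop both temporary collections.
      db.getCollection('sessions.temp.stats').drop({ writeConcern: { w: 0, wtimeout: 60000 } });
      db.getCollection('sessions.temp').drop({ writeConcern: { w: 0, wtimeout: 60000 } });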
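
      The daily target name in the last $merge stage ("sessions.20210613") looks like a prefix plus a YYYYMMDD date. A minimal sketch of how such a name could be derived is shown here; the helper name dailyBucketName and the use of UTC are assumptions for illustration, not something stated in this report.

```javascript
// Hypothetical helper: builds a daily bucket collection name such as
// "sessions.20210613" from a prefix and a Date. UTC is assumed so the
// name does not depend on the local time zone of the machine running the job.
function dailyBucketName(prefix, date) {
  const yyyy = String(date.getUTCFullYear());
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0"); // months are 0-based
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return `${prefix}.${yyyy}${mm}${dd}`;
}
```

      The result could then be passed as the coll value of the $merge stage instead of a hard-coded name.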
      
      

      When I drop a replica set member and restart it, an initial sync starts as expected. The initial sync takes around 8 hours, so the hourly bucketing job above runs several times while the initial sync is in progress (without any problems).

      However, once all databases are cloned (rs.status() reports "databases: {databasesToClone: 0, databasesCloned: 9 ...") I get thousands of errors like this:

      Error applying inserts in bulk. Trying first insert as a lone insert","attr":{"groupedInserts": ...
      

      See attached log file for more details.
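
      The point at which the errors start (all databases cloned) can be detected from the rs.status() output quoted above. The following is a rough sketch only: the function name cloningFinished is hypothetical, and it assumes the databases document sits under initialSyncStatus in the status output.

```javascript
// Returns true once the cloning phase of an initial sync has finished.
// `status` is the document returned by rs.status() on the syncing member.
// The field names databasesToClone / databasesCloned are the ones quoted
// in this report; the initialSyncStatus nesting is an assumption.
function cloningFinished(status) {
  const sync = status.initialSyncStatus;
  if (!sync || !sync.databases) {
    return false; // no initial sync information available
  }
  const dbs = sync.databases;
  return dbs.databasesToClone === 0 && dbs.databasesCloned > 0;
}
```

      In the shell this could be called as cloningFinished(rs.status()) on the member that is performing the initial sync.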

      I get many thousands of these errors. The disk runs out of space and MongoDB stops working!
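
      To put a number on "many thousands", the occurrences can be counted in the log file. This is a sketch under the assumption that the log uses the structured JSON format of MongoDB 4.4 (one JSON document per line with a "msg" field); the function name countBulkInsertErrors is hypothetical.

```javascript
// Message text exactly as it appears in the error quoted in this report.
const BULK_INSERT_MSG =
  "Error applying inserts in bulk. Trying first insert as a lone insert";

// Counts log entries whose "msg" field equals BULK_INSERT_MSG.
// `logText` is the full content of a mongod log, one JSON document per line.
function countBulkInsertErrors(logText) {
  let count = 0;
  for (const line of logText.split("\n")) {
    if (line.trim() === "") continue; // ignore empty lines
    let entry;
    try {
      entry = JSON.parse(line);
    } catch (err) {
      continue; // skip lines that are not valid JSON
    }
    if (entry.msg === BULK_INSERT_MSG) count += 1;
  }
  return count;
}
```

      Under Node.js the log content could be read with fs.readFileSync and passed to this function.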

      If I disable the hourly bucketing job, the initial sync runs without any problem. So I assume the issue is caused by dropping and renaming collections, reuse of collection names, etc.

            Assignee:
            wenbin.zhu@mongodb.com Wenbin Zhu
            Reporter:
            wernfried.domscheit@sunrise.net Wernfried Domscheit
            Votes:
            0
            Watchers:
            4
