Core Server / SERVER-55817

Request timed out while host is locked

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • None
    • ALL

      I am not able to reproduce the error reliably. My application runs about 20 of these jobs in parallel; they are identical except for the database and collection names. Between 5 and 10 of the jobs fail, while the others run fine.

      I am trying to set up a backup procedure following the tutorial "Back Up a Sharded Cluster with File System Snapshots".

      I have a 3-member config server replica set and 4 shards. Each shard is a PSA (Primary-Secondary-Arbiter) replica set.

      My application keeps running during the backup, of course. I run db.fsyncLock() on one config server SECONDARY. While that config replica set member is locked, my aggregation pipeline fails:

      db.getSiblingDB('lau01mipmed03').getCollection('sessions#0.096-1500').aggregate([
         { '$match': { 'n': { '$ne': 'dummy' }, 't': { '$gte': ISODate('2021-04-06T13:00:00.000Z'), '$lt': ISODate('2021-04-06T19:00:00.000Z') } } },
         { '$unset': '_id' },
         { '$merge': { 'into': { 'db': 'lau01mipmed03', 'coll': 'sessions#0' } } }
      ], { 'allowDiskUse': true })
      

      Error:

      "Request 13963276 timed out, deadline was 2021-04-06T15:00:42.947+02:00, op was RemoteCommand 13963276 -- target:[d-mipmdb-cfg-01.xxx.xxxx.xxx:27019] db:config expDate:2021-04-06T15:00:42.947+02:00 cmd:{ find: \"collections\", filter: { _id: \"lau01mipmed03.sessions#0\" }, readConcern: { level: \"majority\", afterOpTime: { ts: Timestamp(1617714012, 1), t: 2 } }, limit: 1, maxTimeMS: 30000 }"
      

      Host d-mipmdb-cfg-01.xxx.xxxx.xxx is the locked Config SECONDARY.
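
      Since only some of the parallel jobs fail, it can help to check for each failure whether the timed-out request was sent to the locked member. The following is a minimal sketch in plain JavaScript, assuming the error text has the same shape as the message above with a single host in target:[...]. The helper names parseTimeoutError and hitLockedHost are hypothetical and not part of MongoDB or its shell.

```javascript
// Hypothetical helper: pulls a few fields out of a
// "Request ... timed out" message shaped like the one above.
function parseTimeoutError(message) {
  const target = /target:\[([^\]]+)\]/.exec(message);   // host:port inside target:[...]
  const db = /\bdb:(\S+)/.exec(message);                 // database the command ran against
  const maxTimeMS = /maxTimeMS:\s*(\d+)/.exec(message);  // time limit sent with the command
  return {
    target: target ? target[1] : null,
    db: db ? db[1] : null,
    maxTimeMS: maxTimeMS ? Number(maxTimeMS[1]) : null,
  };
}

// Hypothetical check: was the timed-out request sent to the locked host?
// Assumes target:[...] holds exactly one host:port pair.
function hitLockedHost(message, lockedHost) {
  const { target } = parseTimeoutError(message);
  return target !== null && target.split(':')[0] === lockedHost;
}

// Usage with the message from this report.
const message =
  'Request 13963276 timed out, deadline was 2021-04-06T15:00:42.947+02:00, ' +
  'op was RemoteCommand 13963276 -- target:[d-mipmdb-cfg-01.xxx.xxxx.xxx:27019] ' +
  'db:config expDate:2021-04-06T15:00:42.947+02:00 cmd:{ find: "collections", ' +
  'filter: { _id: "lau01mipmed03.sessions#0" }, readConcern: { level: "majority", ' +
  'afterOpTime: { ts: Timestamp(1617714012, 1), t: 2 } }, limit: 1, maxTimeMS: 30000 }';

console.log(parseTimeoutError(message));
console.log(hitLockedHost(message, 'd-mipmdb-cfg-01.xxx.xxxx.xxx'));
```

      Running this over the error text of each failed job would show whether every failure targets the locked config member or whether some are directed elsewhere.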

      The error does not appear to be related to this particular aggregation pipeline; the same error also occurs with other commands, e.g.

      db.getSiblingDB('t-mipmed-as-01').getCollection('sessions.096-1500.stats').aggregate([
         { '$project': { 'stats': 1 } },
         { '$unwind': '$stats' },
         { '$replaceRoot': { 'newRoot': '$stats' } },
         { '$merge': { 'into': { 'db': 'mip', 'coll': 'session.stats.raw' } } }
      ], { allowDiskUse: true })
      

            Assignee: Edwin Zhou (edwin.zhou@mongodb.com)
            Reporter: Wernfried Domscheit (wernfried.domscheit@sunrise.net)
            Votes: 0
            Watchers: 4
