Core Server / SERVER-16312

endless "waiting for write concern"

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • Affects Version/s: 2.6.4
    • Component/s: Sharding
    • Operating System: ALL

      Intro

      We are running a sharded cluster of 7 shards. Each shard is a replica set with 3 members.

      We use pre-splitting to distribute newly inserted documents among the shards in proportion to their amount of RAM. Our database must fit in RAM, otherwise its performance drops significantly. The balancer is switched off because it assumes all servers have the same amount of RAM, which would overload the shards that have less RAM.
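      A minimal sketch of what such a pre-split looks like in the mongo shell (collection name, split points and shard names are made up, not our real values):

      sh.setBalancerState(false)  // keep the balancer switched off
      // create chunk boundaries; shards with more RAM get more chunks
      sh.splitAt("mydb.offer", { _id: NumberLong("1000000") })
      sh.splitAt("mydb.offer", { _id: NumberLong("2000000") })
      // place the resulting chunks explicitly on the desired shards
      sh.moveChunk("mydb.offer", { _id: NumberLong("1000000") }, "shard0001")
      sh.moveChunk("mydb.offer", { _id: NumberLong("2000000") }, "shard0002")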

      From time to time, e.g. when adding a new shard, we need to balance manually. Our script first executes sh.moveChunk and, when it returns ok, deletes the moved documents from the source shard, because we observed that sh.moveChunk does not always clean up 100%.

      Simplified, the script looks like this (mongo shell; fromCollection, toShard and donorShard are defined elsewhere):

      var i = 0;
      // iterate over all chunks that currently live on the donor shard
      db.getSiblingDB("config").chunks.find({ ns: fromCollection.getFullName(), shard: donorShard })
        .forEach(function (chunk) {
          print("move " + (++i) + " of id: " + chunk.min._id);
          var move = sh.moveChunk(fromCollection.getFullName(), { _id: chunk.min._id }, toShard);
          if (move.ok) {
              // delete leftovers of the moved chunk from the donor shard
              var del = fromCollection.remove({ _id: { $gte: chunk.min._id, $lt: chunk.max._id } },
                                              { writeConcern: { w: "majority" } });
              print("deleted: " + del.nRemoved);
          }
      });
      

      Problem

      We observed that mongo may wait forever when we add the writeConcern "majority" to the remove command.

      In db.currentOp() we see something like this:

      {
        "opid" : 135910798,
        "active" : true,
        "secs_running" : 2362,
        "microsecs_running" : NumberLong("2362545153"),
        "op" : "query",
        "ns" : "",
        "query" : {
      	  "delete" : "offer",
      	  "deletes" : [
      		  {
      			  "q" : {
      				  "_id" : {
      					  "$gte" : NumberLong("2577460009"),
      					  "$lt" : NumberLong("2577494734")
      				  }
      			  },
      			  "limit" : 0
      		  }
      	  ],
      	  "ordered" : true,
      	  "writeConcern" : {
      		  "w" : "majority"
      	  }
        },
        "client" : "172.30.2.130:51426",
        "desc" : "conn4708952",
        "threadId" : "0x7f2035ca5700",
        "connectionId" : 4708952,
        "waitingForLock" : false,
        "msg" : "waiting for write concern",
        "numYields" : 0,
        "lockStats" : {
      	  "timeLockedMicros" : {
      		  
      	  },
      	  "timeAcquiringMicros" : {
      		  
      	  }
        }
      }
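
      Such stuck operations can be listed by filtering currentOp for the "waiting for write concern" message, for example (a sketch, not our exact tooling):

      // show every active operation that is waiting for write concern
      db.currentOp(true).inprog.filter(function (op) {
          return op.active && op.msg === "waiting for write concern";
      }).forEach(function (op) {
          print(op.opid + "  " + op.secs_running + "s  " + tojson(op.query));
      });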
      

      While mongo was waiting, we checked the number of documents in this chunk range on all replica set members of the donor shard. It was 0! So all documents of this chunk had already been removed from the donor shard, yet mongo was still waiting for the write concern. Why?
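
      The check was essentially a count over the chunk range on every member of the donor replica set, roughly like this (hostnames and database name are placeholders):

      // count the documents of the chunk range on each member of the donor shard
      ["shard3a:27017", "shard3b:27017", "shard3c:27017"].forEach(function (host) {
          var conn = new Mongo(host);
          conn.setSlaveOk();  // allow reading from secondaries
          var n = conn.getDB("mydb").getCollection("offer").count(
              { _id: { $gte: NumberLong("2577460009"), $lt: NumberLong("2577494734") } });
          print(host + ": " + n);  // was 0 on all three members
      });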

      Even weirder, the output of the script above looked like this:

      move 1 of id: 11605025
      deleted: 5623
      move 2 of id: 183729058
      deleted: 7152
      move 3 of id: 2577460009
      deleted: 3561
      move 4 of id: 2977458123
      

      Move 4 hung forever.
      As you can see, mongo was endlessly waiting on the delete with $gte:2577460009. However, that delete had already returned its result, since the script printed the number of deleted documents (3561). Only the following chunk move #4 got stuck. Why?
      Also, as you can see, our script always had to delete documents that should already have been removed by sh.moveChunk, shouldn't they?
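
      If those leftovers are orphaned documents that the moveChunk range deleter did not remove, 2.6 also offers the cleanupOrphaned command, run directly against the primary of the donor shard; a sketch with a placeholder namespace (we use our own remove instead):

      // run on the PRIMARY of the donor shard, not via mongos
      var nextKey = {};
      while (nextKey != null) {
          var res = db.adminCommand({ cleanupOrphaned: "mydb.offer",
                                      startingFromKey: nextKey });
          if (res.ok !== 1) { printjson(res); break; }
          nextKey = res.stoppedAtKey;  // null once the whole namespace is cleaned
      }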

            Assignee: Unassigned
            Reporter: Kay Agahd (kay.agahd@idealo.de)
            Votes: 0
            Watchers: 6
