Core Server / SERVER-78835

MongoDB truncates currentOp when it shouldn't

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 6.0.6, 5.0.18
    • Component/s: Query Execution
    • Operating System: ALL
      Steps to Reproduce:

      I followed the same steps for 4.4.22, 5.0.18, and 6.0.6.

       

      1. Deploy a standard sharded cluster (a quick verification sketch follows the list):

      • 3 shards, 3 nodes each.
      • A 3-node config server replica set.
      • 1 mongos.
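
      Once the cluster is up, the topology can be confirmed from mongosh (a minimal sketch; run from a connection to the mongos):

      // listShards runs against the admin database and returns the shards
      // registered with the cluster, so we can confirm all 3 are present.
      db.adminCommand({ listShards: 1 })

      // sh.status() additionally prints per-database sharding state and,
      // once data is loaded, chunk distribution per collection.
      sh.status()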

      2. Enable sharding on the database and shard the collection:

       

      [direct: mongos] admin> sh.enableSharding("foo")
      {
        ok: 1,
        '$clusterTime': {
          clusterTime: Timestamp({ t: 1689002196, i: 2 }),
          signature: {
            hash: Binary(Buffer.from("cf0525d509f7a3e09de5e41618bee63d5aa05ae7", "hex"), 0),
            keyId: Long("7252848566232678424")
          }
        },
        operationTime: Timestamp({ t: 1689002196, i: 1 })
      }

       

       

      [direct: mongos] admin> sh.shardCollection("foo.guests", { "_id": 1} )
      {
        collectionsharded: 'foo.guests',
        ok: 1,
        '$clusterTime': {
          clusterTime: Timestamp({ t: 1689002207, i: 14 }),
          signature: {
            hash: Binary(Buffer.from("f75203a5e0be5ebe7ffe2f62c9876cc6721aea37", "hex"), 0),
            keyId: Long("7252848566232678424")
          }
        },
        operationTime: Timestamp({ t: 1689002207, i: 10 })
      }

       

       

      3. Insert dummy data with mgeneratejs as follows:

       

      $ mgeneratejs '{"name": "$name", "age": "$age", "emails": {"$array": {"of": "$email", "number": 3}}, "even_numbers": {"$inc": {"start": 0, "step": 2}}, "code": {"$join": {"array": ["foo", "bar", "baz"], "sep": "-"}},"badgeId": {"$numberLong": {"min": 100000}},"created": "$now"}' -n 209980000 | mongoimport --uri mongodb://admin:sekret@localhost:6065/foo?authSource=admin --collection guests --mode insert -vvvvvv
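
      While the import runs, progress can be spot-checked from mongosh (a sketch):

      // estimatedDocumentCount() reads collection metadata rather than
      // scanning, so it stays cheap even while the bulk insert is in flight.
      db.getSiblingDB("foo").guests.estimatedDocumentCount()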

       

      Observation:

      • Without lowering the profiling level, no truncation messages were generated, likely because the operations were completing below the default 100ms slowms threshold.
      • Lowering that threshold on the shards, however, surfaced those operations and produced the message below (a sketch of the currentOp call that exercises this path follows the log excerpt):
      shard01 [direct: primary] foo> db.setProfilingLevel(1, {slowms: 0})
      {
        was: 0,
        slowms: 0,
        ratelimit: 1,
        sampleRate: 1,
        ok: 1,
        lastCommittedOpTime: Timestamp({ t: 1689002603, i: 3000 }),
        '$clusterTime': {
          clusterTime: Timestamp({ t: 1689002603, i: 3000 }),
          signature: {
            hash: Binary(Buffer.from("63498806c7c9e86c8fdf453128b68ac2d4edff64", "hex"), 0),
            keyId: Long("7252848566232678424")
          }
        },
        operationTime: Timestamp({ t: 1689002603, i: 3000 })
      }
      [...]
      {"t":{"$date":"2023-07-10T13:04:31.491-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254487,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:31.556-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":255127,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:31.619-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254441,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:31.690-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254682,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:31.751-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254688,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:31.817-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254798,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:31.891-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254848,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:31.967-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254746,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.031-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254557,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.092-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254712,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.155-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254519,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.222-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254656,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.286-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254613,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.354-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254958,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.418-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254862,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.486-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254983,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.549-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254639,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.611-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":255108,"limit":51200}}

      Again, this only happens on 5.0 and 6.0.

      Following the same steps on 4.4 did not raise such messages.

       


      Description:

      While running a batch insert on a sharded cluster, we noticed a shard being flooded with the following messages:

       

      {"t":{"$date":"2023-07-10T13:04:32.092-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254712,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.155-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254519,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.222-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254656,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.286-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254613,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.354-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254958,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.418-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254862,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.486-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254983,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.549-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":254639,"limit":51200}}
      {"t":{"$date":"2023-07-10T13:04:32.611-03:00"},"s":"I",  "c":"-",        "id":4760300, "ctx":"conn202","msg":"Gathering currentOp information, operation of size {size} exceeds the size limit of {limit} and will be truncated.","attr":{"size":255108,"limit":51200}}

       

       

      However, while analyzing the problem, we saw that since 5.0 MongoDB changed how currentOp works, as documented here and here.

       

      Alongside that, the message itself is not very clear about the limit: is it in bytes or kilobytes?

      I'm assuming it reports in bytes, following this reference from the source code:

       

          // When the currentOp command is run, it returns a single response object containing all current
          // operations; this request will fail if the response exceeds the 16MB document limit. By
          // contrast, the $currentOp aggregation stage does not have this restriction. If 'truncateOps'
          // is true, limit the size of each op to 1000 bytes. Otherwise, do not truncate.
          const boost::optional<size_t> maxQuerySize{truncateOps, 1000};

       

       

      Note that boost::optional's (condition, value) constructor engages the 1,000-byte cap only when truncateOps is true. If that's correct:

      • Why is the limit so low? Shouldn't it be at least the 16MB document limit, rather than 1,000 bytes per the code or 51,200 bytes (50 KiB) per my tests?
        • I also wonder why those two values diverge.
      • Additionally, since 5.0, currentOp is built on the $currentOp aggregation stage, which removes the above limitation of the 16MB maximum BSON response size (see the sketch after this list).
        • The operation therefore should not fall into that condition at all.
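
      If that reading is right, the same information should be retrievable untruncated through the aggregation form (a minimal sketch, run against the admin database; truncateOps is a documented $currentOp option and defaults to false for the aggregation stage):

      // $currentOp as an aggregation stage is not bound by the single-document
      // 16MB response limit, and truncateOps: false preserves full command bodies.
      db.getSiblingDB("admin").aggregate([
        { $currentOp: { allUsers: true, idleConnections: false, truncateOps: false } },
        { $match: { ns: "foo.guests" } }
      ])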

       

      I've tested this on 4.4.22, 5.0.18, and 6.0.6.

      But I got these messages only on 5.0.18 and 6.0.6.

            Assignee: Bernard Gorman
            Reporter: Jean da Silva
            Votes: 3
            Watchers: 7
