Core Server / SERVER-26593

Chunk metadata memory leak on refresh after migration commit

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.4.0-rc1
    • Affects Version/s: 3.4.0-rc0
    • Component/s: Sharding
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Sharding 2016-10-31

      A node undergoing active balancing was observed to accumulate excess allocated memory at a rate of about 2 GB per day for several days, until the node ran out of memory (OOM). The initial part of a run with heap profiling enabled shows the following stacks, all passing through _refreshMetadata, which together account for the bulk of the allocated memory.

      heapProfile stack888: { 0: "tc_new", 1: "std::pair<std::_Rb_tree_iterator<std::pair<mongo::BSONObj const, mongo::BSONObj> >, bool> std::_Rb_tree<mongo::BSONObj, std::pair<mongo::BSONObj const...", 2: "mongo::CollectionMetadata::fillRanges", 3: "mongo::MetadataLoader::initChunks", 4: "mongo::MetadataLoader::makeCollectionMetadata", 5: "mongo::ShardingState::_refreshMetadata", 6: "mongo::ShardingState::refreshMetadataNow", 7: "mongo::MigrationSourceManager::MigrationSourceManager", 8: "0x55abc2eabea1", 9: "0x55abc2eaddeb", 10: "mongo::Command::run", 11: "mongo::Command::execCommand", 12: "mongo::runCommands", 13: "mongo::assembleResponse", 14: "mongo::ServiceEntryPointMongod::_sessionLoop", 15: "0x55abc261ce80", 16: "0x55abc326839a", 17: "0x7fd9de1acdc5", 18: "clone" }
      heapProfile stack887: { 0: "tc_new", 1: "mongo::ConfigDiffTracker<mongo::BSONObj>::calculateConfigDiff", 2: "mongo::MetadataLoader::initChunks", 3: "mongo::MetadataLoader::makeCollectionMetadata", 4: "mongo::ShardingState::_refreshMetadata", 5: "mongo::ShardingState::refreshMetadataNow", 6: "mongo::MigrationSourceManager::MigrationSourceManager", 7: "0x55abc2eabea1", 8: "0x55abc2eaddeb", 9: "mongo::Command::run", 10: "mongo::Command::execCommand", 11: "mongo::runCommands", 12: "mongo::assembleResponse", 13: "mongo::ServiceEntryPointMongod::_sessionLoop", 14: "0x55abc261ce80", 15: "0x55abc326839a", 16: "0x7fd9de1acdc5", 17: "clone" }
      heapProfile stack869: { 0: "tc_malloc", 1: "mongo::mongoMalloc", 2: "mongo::BSONObj::copy", 3: "mongo::BSONObj::getOwned", 4: "mongo::ChunkRange::fromBSON", 5: "mongo::ChunkType::fromBSON", 6: "mongo::ShardingCatalogClientImpl::getChunks", 7: "mongo::MetadataLoader::initChunks", 8: "mongo::MetadataLoader::makeCollectionMetadata", 9: "mongo::ShardingState::_refreshMetadata", 10: "mongo::ShardingState::refreshMetadataNow", 11: "mongo::MigrationSourceManager::MigrationSourceManager", 12: "0x55abc2eabea1", 13: "0x55abc2eaddeb", 14: "mongo::Command::run", 15: "mongo::Command::execCommand", 16: "mongo::runCommands", 17: "mongo::assembleResponse", 18: "mongo::ServiceEntryPointMongod::_sessionLoop", 19: "0x55abc261ce80", 20: "0x55abc326839a", 21: "0x7fd9de1acdc5", 22: "clone" }
      heapProfile stack872: { 0: "tc_malloc", 1: "mongo::mongoMalloc", 2: "mongo::BSONObj::copy", 3: "mongo::BSONObj::getOwned", 4: "mongo::ChunkRange::fromBSON", 5: "mongo::ChunkType::fromBSON", 6: "mongo::ShardingCatalogClientImpl::getChunks", 7: "mongo::MetadataLoader::initChunks", 8: "mongo::MetadataLoader::makeCollectionMetadata", 9: "mongo::ShardingState::_refreshMetadata", 10: "mongo::ShardingState::refreshMetadataNow", 11: "mongo::MigrationSourceManager::MigrationSourceManager", 12: "0x55abc2eabea1", 13: "0x55abc2eaddeb", 14: "mongo::Command::run", 15: "mongo::Command::execCommand", 16: "mongo::runCommands", 17: "mongo::assembleResponse", 18: "mongo::ServiceEntryPointMongod::_sessionLoop", 19: "0x55abc261ce80", 20: "0x55abc326839a", 21: "0x7fd9de1acdc5", 22: "clone" }
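
      To check that the stacks above really do converge on the same refresh path, the frames can be compared mechanically. The following is a minimal sketch, not part of the server or its tooling: the helper names `parse_stack` and `shared_frames` are hypothetical, the line format is assumed to match the `heapProfile stackNNN: { i: "frame", ... }` lines shown above, and the sample data is an abridged copy of stack887 and stack869 (frames below refreshMetadataNow are omitted).

```python
import re

# Assumed line format, taken from the report above:
#   heapProfile stackNNN: { 0: "frame", 1: "frame", ... }
STACK_RE = re.compile(r'heapProfile (stack\d+): \{(.*)\}')
FRAME_RE = re.compile(r'(\d+): "([^"]*)"')


def parse_stack(line):
    """Return (stack_id, [frames ordered by index]) or None if the line does not match."""
    match = STACK_RE.search(line)
    if match is None:
        return None
    stack_id, body = match.groups()
    indexed = sorted((int(idx), name) for idx, name in FRAME_RE.findall(body))
    return stack_id, [name for _, name in indexed]


def shared_frames(lines):
    """Return the frames present in every parsed stack, in the order of the first stack."""
    stacks = [parsed[1] for parsed in map(parse_stack, lines) if parsed is not None]
    if not stacks:
        return []
    shared = set(stacks[0]).intersection(*stacks[1:])
    # dict.fromkeys drops repeated frames while keeping first-seen order.
    return [frame for frame in dict.fromkeys(stacks[0]) if frame in shared]


# Abridged copies of stack887 and stack869 from the report (hypothetical truncation).
SAMPLE = [
    'heapProfile stack887: { 0: "tc_new", '
    '1: "mongo::ConfigDiffTracker<mongo::BSONObj>::calculateConfigDiff", '
    '2: "mongo::MetadataLoader::initChunks", '
    '3: "mongo::MetadataLoader::makeCollectionMetadata", '
    '4: "mongo::ShardingState::_refreshMetadata", '
    '5: "mongo::ShardingState::refreshMetadataNow" }',
    'heapProfile stack869: { 0: "tc_malloc", 1: "mongo::mongoMalloc", '
    '2: "mongo::BSONObj::copy", 3: "mongo::BSONObj::getOwned", '
    '4: "mongo::ChunkRange::fromBSON", 5: "mongo::ChunkType::fromBSON", '
    '6: "mongo::ShardingCatalogClientImpl::getChunks", '
    '7: "mongo::MetadataLoader::initChunks", '
    '8: "mongo::MetadataLoader::makeCollectionMetadata", '
    '9: "mongo::ShardingState::_refreshMetadata", '
    '10: "mongo::ShardingState::refreshMetadataNow" }',
]

if __name__ == "__main__":
    for frame in shared_frames(SAMPLE):
        print(frame)
```

      Run against the full log instead of the abridged sample, the same intersection would also include the MigrationSourceManager constructor and the command dispatch frames, since all four stacks share them.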
      

        1. sharding.png
          136 kB
          Bruce Lucas

            Assignee:
            esha.maharishi@mongodb.com Esha Maharishi (Inactive)
            Reporter:
            bruce.lucas@mongodb.com Bruce Lucas (Inactive)
            Votes:
            0
            Watchers:
            13

              Created:
              Updated:
              Resolved: