Core Server / SERVER-40130

Improve multi-threading

    • Type: Improvement
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 3.4.17
    • Component/s: Performance, Storage

      Hello, at Sendinblue we have been using MongoDB for a long time and have a few clusters running with big datasets.

      We are currently struggling with a cluster using WiredTiger: startup is slow, and replication either fails or takes a very long time.

      Here is some information about the sizing of the cluster, along with some stats:

      • MongoDB version: 3.4.17
      • 10 shards
      • databases per shard: ~26K
      • collections per shard: ~600K
      • shard size: ~500 GB
      • files in the data directory for the tested shard: 1,873,414 (find /data/ | wc -l)
        • FYI, we didn't split journal, data, and indexes into separate directories (see the sketch below)
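
      For reference, a minimal sketch of how data and indexes could be split into separate per-database directories (standard mongod options; we have not applied this, and the values are only illustrative):

      ```
      storage:
        dbPath: /data/
        directoryPerDB: true            # one subdirectory per database
        wiredTiger:
          engineConfig:
            directoryForIndexes: true   # index files separated from collection data
      ```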

      I have currently isolated one shard of the cluster to debug the bottleneck we encounter. I am troubleshooting on a secondary running on a Google Cloud instance; it seems to freeze a lot while starting and replaying oplogs.

      We identified that the shard appears to have some processes running single-threaded, or not efficiently multi-threaded. The instance has 16 vCPUs at 2.5 GHz and 96 GB of memory.

       

      Startup of the mongod instance takes a very long time, and server statistics seem to show that only one or two vCPUs are effectively working (a per-CPU check is sketched below).
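
      One way to confirm this, assuming the sysstat package is installed (just a sketch of the check we ran):

      ```
      # Per-CPU utilization, refreshed every 5 seconds; shows how many vCPUs are busy.
      mpstat -P ALL 5
      ```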

      We have found some related information here: https://jira.mongodb.org/browse/SERVER-27700?focusedCommentId=1480933&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-1480933

      But as we already use a version that has those improvements, we shouldn't be struggling with replication ops or evictions. A quick check of the eviction counters is sketched below.
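
      To verify that eviction pressure is not the issue, the WiredTiger cache counters can be dumped from serverStatus (port elided as in our config below; just a sketch):

      ```
      # Dump the WiredTiger cache/eviction counters from serverStatus.
      mongo --port XXXXX --eval 'printjson(db.serverStatus().wiredTiger.cache)'
      ```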

      Here is the current WiredTiger configuration we use, as reported in the startup log:

      ```
      STORAGE [initandlisten] wiredtiger_open config: create,cache_size=40960M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),verbose=(recovery_progress)
      ```
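
      If more eviction parallelism were wanted, our understanding (an assumption on our side, not confirmed advice) is that the eviction thread counts could be raised through engineConfig.configString, which mongod passes through to wiredtiger_open:

      ```
      storage:
        wiredTiger:
          engineConfig:
            # Example values only; appended to the wiredtiger_open string above.
            configString: "eviction=(threads_min=8,threads_max=8)"
      ```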

       

      We have tried some modifications to WiredTiger based on this document and the comment on the previous Jira link: https://source.wiredtiger.com/2.9.0/group__wt.html#gab435a7372679c74261cb62624d953300

      Currently my configuration is:

      ```
      net:
        bindIp: 0.0.0.0
        port: XXXXX
      processManagement:
        pidFilePath: /var/run/mongodb/shard1.pid
      replication:
        oplogSizeMB: 10240
        replSetName: XXXXX
      setParameter:
        cursorTimeoutMillis: 1800000
        failIndexKeyTooLong: true
      sharding:
        clusterRole: shardsvr
      storage:
        dbPath: /data/
        engine: wiredTiger
        wiredTiger:
          engineConfig:
            cacheSizeGB: 40
      ```
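
      Note that the startup log further down also shows file_manager=(close_handle_minimum=10000,close_idle_time=3600,close_scan_interval=10) appended to the wiredtiger_open string; that presumably came from an engineConfig.configString stanza along these lines (values copied from the log, same mechanism as the eviction example above):

      ```
      storage:
        wiredTiger:
          engineConfig:
            # Values as seen appended in the startup log below.
            configString: "file_manager=(close_handle_minimum=10000,close_idle_time=3600,close_scan_interval=10)"
      ```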

      Is there any setting that would increase parallelism at startup and during the replication process? It seems that some operations are not multi-threaded; the thread list below shows the per-thread CPU times.

      ```
      shard1-:# ps -T -p 32107
        PID  SPID TTY          TIME CMD
      32107 32107 ?        00:05:27 shard
      32107 32109 ?        00:00:00 signalP.gThread
      32107 32110 ?        00:00:00 Backgro.kSource
      32107 32216 ?        00:00:00 shard
      32107 32217 ?        00:00:00 shard
      32107 32218 ?        00:00:00 shard
      32107 32219 ?        00:00:24 shard
      32107 32220 ?        00:00:24 shard
      32107 32221 ?        00:00:24 shard
      32107 32222 ?        00:00:24 shard
      32107 32223 ?        00:00:02 shard
      32107 32224 ?        00:01:03 shard
      32107 32225 ?        00:00:01 WTJourn.Flusher
      ```
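
      A sketch of how to check which of these threads is hot (PID as above):

      ```
      # Live per-thread CPU usage; -H lists threads instead of processes.
      top -H -p 32107

      # One-shot alternative: thread ID, %CPU, cumulative CPU time, thread name.
      ps -T -p 32107 -o spid,pcpu,time,comm
      ```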

      During startup we clearly see that the server is stuck here, with one thread at 100% CPU and the rest doing almost nothing:

      ```
      2019-03-14T17:38:24.611+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=40960M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),verbose=(recovery_progress),file_manager=(close_handle_minimum=10000,close_idle_time=3600,close_scan_interval=10)
      2019-03-14T17:38:42.046+0000 I STORAGE [initandlisten] WiredTiger message [1552585122:46632][32107:0x7fd0dd295d40], txn-recover: Main recovery loop: starting at 401225/10635776
      2019-03-14T17:38:42.047+0000 I STORAGE [initandlisten] WiredTiger message [1552585122:47626][32107:0x7fd0dd295d40], txn-recover: Recovering log 401225 through 401226
      2019-03-14T17:38:42.588+0000 I STORAGE [initandlisten] WiredTiger message [1552585122:588204][32107:0x7fd0dd295d40], file:collection-75452--270635807257442042.wt, txn-recover: Recovering log 401226 through 401226
      ```

      ```
      shard1-:# ls -alih /data/collection-75452--270635807257442042.wt
      2274678025 -rw-r--r-- 1 mongodb mongodb 1.2M Mar 14 17:38 /data/collection-75452--270635807257442042.wt

      shard1-:# free -mh
                   total  used  free  shared  buffers  cached
      Mem:           94G   61G   32G     32M      85M     41G
      -/+ buffers/cache:   20G   74G
      Swap:          24G    0B   24G
      ```

      This recovery step takes 40 minutes while using only one CPU.
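
      For reference, a rough way to gauge how much log there is to replay (journal path as per our wiredtiger_open string above; just a sketch):

      ```
      # Count and total size of WiredTiger log files in the journal directory.
      ls /data/journal/WiredTigerLog.* | wc -l
      du -sh /data/journal
      ```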

      Can you help with this? I know you will need more info, which I can probably provide.

       

      Thanks in advance


        Attachments:
          1. iostat.log.gz (191 kB)
          2. metrics.tar.gz (61.06 MB)

            Assignee: Eric Sedor (eric.sedor@mongodb.com)
            Reporter: Pichardie kévin (kpichardie)
            Votes: 0
            Watchers: 9
