Core Server / SERVER-26788

Running MongoDB on machines with multiple physical CPUs

    • Type: Question
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 3.2.10
    • Component/s: Concurrency

      Hey everyone, we have a production replica set consisting of three nodes that has been running well for two years. It looks like this from a configuration standpoint:

      • 3 m4.2xlarge instances (8 cores)
      • XFS filesystem for data drives
      • EBS volumes (with 8000 provisioned IOPS – max for instance type)
      • WiredTiger storage engine
      • SSL, auth, etc.

      The performance is great, but we want to scale up our nodes to handle a potential spike in usage over the next two weeks. As we are not particularly I/O bound given our usage of MongoDB, and appear to be largely CPU bound on these boxes (from what I can tell), we have moved these nodes from m4.2xlarge (8 cores) to m4.4xlarge (16 cores).

      To my surprise it appears as though mongod is only using the first 8 (0-7) of the 16 cores available on this machine. Now, I realize that:

      • In going from 8 to 16 cores we may now have two physical cpus backing our instance
      • taskset and cpuset can be used to set core/processor affinity and I do not believe they are in use (we are using the init script from the Amazon linux package)
      • numactl should specify that memory usage be interleaved instead of preferring a node or physical cpu (again, confirmed via package init script)
      • Using `htop` as a view onto cpu usage on virtualized hardware is a potentially flawed metric for various reasons
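
      One way to double-check the affinity point above is to read the CPU list the kernel reports for the process. This is a minimal sketch, assuming Linux with /proc mounted; the helper names `parse_cpu_list` and `allowed_cpus` are made up for illustration and are not part of any MongoDB tooling:

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-7,16-23' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def allowed_cpus(pid):
    """Return the CPUs a process may run on, per /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("Cpus_allowed_list:"):
                return parse_cpu_list(line.split(":", 1)[1])
    return []  # field not present (very old kernel or non-Linux /proc)
```

      Calling `allowed_cpus(pid)` with the mongod pid on the 16-core box should include cores 8 through 15 if no taskset/cpuset restriction is in effect; a list that stops at 7 would point at an affinity or cgroup setting rather than the scheduler.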

      So my question is: why does mongod appear to use only 8 of the 16 cores available on these boxen?

      Is it possible that the Linux scheduler doesn't bother scheduling tasks onto the second physical CPU until there are more running threads, so as to take advantage of CPU caches? There is not much load to speak of right now, so that is my current running theory.
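
      A rough way to test that theory without relying on `htop` is to look at which core each mongod thread last ran on. The sketch below assumes Linux and the documented layout of /proc/<pid>/task/<tid>/stat, where field 39 is the CPU the thread last executed on; the function names are hypothetical:

```python
import os
from collections import Counter


def last_cpu_from_stat(stat_line):
    """Extract field 39 ('processor') from one /proc/.../stat line.

    The comm field (field 2) is wrapped in parentheses and may itself contain
    spaces or ')', so split after the LAST ')' before counting fields.
    """
    rest = stat_line.rpartition(")")[2].split()
    # rest[0] is field 3 (state), so field 39 sits at index 39 - 3 = 36.
    return int(rest[36])


def thread_cpu_histogram(pid):
    """Count the threads of a process by the CPU each one last ran on."""
    counts = Counter()
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                counts[last_cpu_from_stat(f.read())] += 1
        except (FileNotFoundError, ProcessLookupError):
            continue  # thread exited between listdir() and open()/read()
    return counts
```

      A single snapshot is only suggestive, since idle threads report wherever they happened to run last. Sampling `thread_cpu_histogram(pid)` repeatedly while load increases would show whether cores 8 through 15 ever start to appear.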

      Having never run mongod on a machine with multiple physical CPUs in production before, I'm only guessing as to what the issue may be. Any clues as to what I might be seeing, 10geneers?

            Assignee:
            geert.bosch@mongodb.com Geert Bosch
            Reporter:
            tyler.brock@gmail.com Tyler Brock
            Votes:
            0
            Watchers:
            19
