Core Server / SERVER-5312

Resync fails due to out of memory

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical - P2
    • None
    • Affects Version/s: 2.0.3
    • Component/s: Replication
    • Environment:
      Linux 3.0.18
    • Linux

      We newly initiated a replica set, but the to-be secondary never leaves the "RECOVERING" state: the mongod process is killed by the oom-killer in the middle of the resync (apparently during the last step, when building secondary indexes), and the resync starts from scratch every time.

      Journaling is turned on, and vm.overcommit_memory is set to 1, as suggested before.

      Right now we are testing "echo -17 > /proc/`cat /var/run/mongodb.pid`/oom_adj" (and "swapoff -a"), but every trial takes hours.
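
      For reference, the oom_adj workaround above can be wrapped in a small helper. This is only a sketch, not a confirmed fix: the function name exempt_from_oom and the PROC_ROOT variable are hypothetical, introduced here so the logic can be tried against a scratch directory. It prefers oom_score_adj (available since kernel 2.6.36, where -1000 disables OOM killing for the process) and falls back to the legacy oom_adj value of -17.

```shell
#!/bin/sh
# Sketch: exempt a running process (e.g. mongod) from the OOM killer.
# PROC_ROOT is a hypothetical override for dry runs; it defaults to /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# exempt_from_oom PID
# Writes -1000 to oom_score_adj when that file exists,
# otherwise writes -17 to the legacy oom_adj file.
exempt_from_oom() {
    pid="$1"
    if [ -e "$PROC_ROOT/$pid/oom_score_adj" ]; then
        printf '%s\n' -1000 > "$PROC_ROOT/$pid/oom_score_adj"
    elif [ -e "$PROC_ROOT/$pid/oom_adj" ]; then
        printf '%s\n' -17 > "$PROC_ROOT/$pid/oom_adj"
    else
        echo "no OOM control file found for pid $pid" >&2
        return 1
    fi
}
```

      Run as root, it would be called as: exempt_from_oom "$(cat /var/run/mongodb.pid)". The setting belongs to the process, so it has to be reapplied each time mongod restarts, and exempting mongod only redirects the oom-killer to other processes; it does not reduce memory pressure.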

      The data size is 10x larger than the physical memory, so it seems unlikely that simply doubling the RAM would fix the problem, especially since the heuristics of the oom-killer are rather unpredictable.

      I'd like to know what triggers this failure, and what I should keep in mind.

      What should we do to get resync done?

            Assignee:
            siddharth.singh@10gen.com
            Reporter:
            Kenn Ejima (kenn)
            Votes:
            1
            Watchers:
            3

              Created:
              Updated:
              Resolved: