Core Server / SERVER-27830

TTL Monitor creates performance degradation when there are > 100k indexes

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 3.2.11
    • Component/s: Performance, TTL
    • Storage Execution
    • ALL

      Create a DB with > 100k namespaces and indexes. Apply a write load of thousands of writes per second, scattered randomly across all, or at least a large fraction, of the collections.

      Observe the impact on the mongod process directly after the ttl.passes metric increments (passes is incremented at the start of a TTL pass, not the end, so the work happens after the increment rather than before it).
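      One way to line up latency spikes with TTL passes is to sample the counter periodically and note when it changes. The following is a minimal sketch, assuming the samples are (timestamp, passes) pairs read from serverStatus().metrics.ttl.passes; the function name and sample data are hypothetical.

```python
from typing import Iterable, List, Tuple


def ttl_pass_starts(samples: Iterable[Tuple[float, int]]) -> List[float]:
    """Return the timestamps of samples at which the passes counter increased.

    Each sample is (timestamp_seconds, passes). Because the counter is bumped
    at the start of a pass, these timestamps approximate when TTL work began.
    """
    starts: List[float] = []
    prev = None
    for ts, passes in samples:
        if prev is not None and passes > prev:
            starts.append(ts)
        prev = passes
    return starts


# Hypothetical samples taken once per second.
samples = [(0.0, 41), (1.0, 41), (2.0, 42), (3.0, 42), (4.0, 43)]
pass_starts = ttl_pass_starts(samples)
```

      Slow operations logged shortly after any of the returned timestamps would be candidates for the impact described here.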

      When a mongod instance has a very large number of namespaces, overall mongod performance is visibly affected while the TTL monitor iterates over them. Some impact is expected when there are TTL-expired documents to delete, but this issue appears even when none of the indexes are TTL indexes. The size of the impact varies with the capacity of the server and the other concurrent load, but in one case with ~190k indexes, delays of ~4 seconds were observed.

      It takes on the order of 100k indexes, plus a concurrent high load, for the impact to become visible. In that situation, each time the TTLMonitor thread runs it iterates through every index in the 'dbHolder' helper object to check whether it has the expireAfterSeconds property; the TTL scan/delete is performed only for indexes where that property is present.

      In the case where there are few TTL indexes but many normal indexes, this is unnecessary work. Can the 'dbHolder' class be improved to allow quick iteration over only the TTL indexes?
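      The difference between the two iteration strategies can be illustrated with a toy model. This is only a sketch of the idea, not the server's implementation: IndexSpec and IndexCatalog are hypothetical names, and the real 'dbHolder' code is C++ with different structure.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional, Tuple


@dataclass(frozen=True)
class IndexSpec:
    """Hypothetical stand-in for an index catalog entry."""
    namespace: str
    name: str
    expire_after_seconds: Optional[int] = None  # None means not a TTL index

    @property
    def is_ttl(self) -> bool:
        return self.expire_after_seconds is not None


class IndexCatalog:
    """Toy catalog that tracks TTL indexes separately from the full set."""

    def __init__(self) -> None:
        self._all: Dict[Tuple[str, str], IndexSpec] = {}
        self._ttl: Dict[Tuple[str, str], IndexSpec] = {}

    def add(self, spec: IndexSpec) -> None:
        key = (spec.namespace, spec.name)
        self._all[key] = spec
        if spec.is_ttl:
            self._ttl[key] = spec
        else:
            self._ttl.pop(key, None)  # entry replaced by a non-TTL index

    def drop(self, namespace: str, name: str) -> None:
        self._all.pop((namespace, name), None)
        self._ttl.pop((namespace, name), None)

    def ttl_indexes_full_scan(self) -> Iterator[IndexSpec]:
        """Behavior described in this report: visit every index, then filter."""
        return (s for s in self._all.values() if s.is_ttl)

    def ttl_indexes(self) -> Iterator[IndexSpec]:
        """Requested behavior: visit only the registered TTL indexes."""
        return iter(self._ttl.values())


catalog = IndexCatalog()
for i in range(1000):
    catalog.add(IndexSpec(f"db.coll{i}", "_id_"))
catalog.add(IndexSpec("db.sessions", "createdAt_1", expire_after_seconds=3600))

fast = list(catalog.ttl_indexes())            # touches 1 entry
slow = list(catalog.ttl_indexes_full_scan())  # touches 1001 entries
```

      Both calls yield the same TTL indexes; the cost of the second grows with the total index count, while the first grows only with the number of TTL indexes, at the price of keeping the side registry in sync on every create and drop.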

      Current workaround: a DBA can set the ttlMonitorEnabled parameter to false if TTL is not used at all, but it would be better if even this were not needed.
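      As a rough sketch of the workaround, the parameter can be changed at runtime with the setParameter admin command. The helper names below are hypothetical, and the client is assumed to be a pymongo-style object exposing client.admin.command(...).

```python
def ttl_monitor_command(enabled: bool) -> dict:
    """Build the setParameter command document that toggles the TTL monitor."""
    # Key order matters: the command name must be the first key.
    return {"setParameter": 1, "ttlMonitorEnabled": enabled}


def set_ttl_monitor(client, enabled: bool):
    """Run the command against the admin database of a pymongo-style client."""
    return client.admin.command(ttl_monitor_command(enabled))
```

      With pymongo installed, set_ttl_monitor(MongoClient(...), False) would disable the monitor on the connected mongod. This only makes sense when no collection relies on TTL expiry, since expired documents are no longer deleted while the monitor is off.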

            Assignee:
            backlog-server-execution [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            akira.kurogane Akira Kurogane
            Votes:
            0
            Watchers:
            9

              Created:
              Updated:
              Resolved: