Core Server / SERVER-92779

TTL delete progress blocked by unowned documents

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Storage Execution
    • ALL

      I've also attached a jsTest. But to summarize how to reproduce the behavior:

      (1) Spin up a sharded cluster with 'disableResumableRangeDeleter: true' and 'ttlMonitorSleepSecs: 1'. The bug requires orphans or unowned documents due to chunk migration, so we disable the range deleter.

      (2) Create a sharded collection, with all chunks on shard0.

      (3) Insert at least 'TTLIndexDeleteTargetDocs' documents into the collection with a 'ttlField' set to either the current time or some time in the past. These will eventually live in their own chunk on shard1.

      (4) Insert one document with ttlField set to the current time, and make sure it can be put in a separate chunk.

      (5) Split the collection into 2 chunks. Move the chunk containing the 'TTLIndexDeleteTargetDocs' documents to shard1.

      (6) Now, shard0 has 1 owned doc and 'TTLIndexDeleteTargetDocs' orphan docs.

      (7) Create a TTL index on 'ttlField' with expireAfterSeconds: 1.

      (8) The single expired document that shard0 owns never gets deleted by the TTLMonitor.
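
Steps (2) through (6) can be pictured with a small in-memory model. This is a hypothetical sketch and not the attached jsTest: it does not talk to a cluster, it assumes a shard key on `_id` and a split point at `_id: targetDocs`, and it uses 5 as a stand-in for 'TTLIndexDeleteTargetDocs'. The names `buildShard0State` and `owned` are illustrative only.

```javascript
// Hypothetical in-memory model of shard0 after steps (2)-(6).
// Assumptions: shard key is _id, split point is _id == targetDocs.
function buildShard0State(targetDocs, nowMillis) {
  const docs = [];
  // Step (3): 'targetDocs' documents whose ttlField is already in the past.
  for (let i = 0; i < targetDocs; i++) {
    docs.push({ _id: i, ttlField: new Date(nowMillis - 60 * 1000) });
  }
  // Step (4): one document with ttlField set to the current time.
  docs.push({ _id: targetDocs, ttlField: new Date(nowMillis) });

  // Step (5): split at _id == targetDocs and move the lower chunk to shard1.
  const chunks = [
    { min: -Infinity, max: targetDocs, shard: "shard1" },
    { min: targetDocs, max: Infinity, shard: "shard0" },
  ];
  const ownerOf = (doc) =>
    chunks.find((c) => doc._id >= c.min && doc._id < c.max).shard;

  // Step (6): with the range deleter disabled, shard0 still stores every
  // document, but it only owns those whose chunk is assigned to it.
  return docs
    .map((doc) => ({ ...doc, owned: ownerOf(doc) === "shard0" }))
    .sort((a, b) => a.ttlField - b.ttlField); // order a ttlField index scan would follow
}

const shard0Docs = buildShard0State(5, Date.now());
console.log(shard0Docs.map((d) => (d.owned ? "owned" : "orphan")).join(" "));
```

In this model the single owned document sorts after all of the orphans in 'ttlField' order, which is the layout the remaining steps rely on.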

    • Execution Team 2024-08-19

      If there are more than TTLIndexDeleteTargetDocs expired orphan (unowned) documents for a given TTL index, the TTLMonitor cannot remove more recently expired documents through that index.

      Details
      The TTLMonitor uses batched deletes by default. The batched delete stage first 'stages' documents in a buffer until _batchTargetMet() returns true.
      The batch target is met if either 'targetBatchDocs' documents are stored in the buffer, or more than 'targetStagedDocBytes' bytes are stored in the buffer. However, documents in the buffer can be orphans.
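
The batch-target check can be sketched roughly as follows. This is a simplified model, not the server's code: the parameter names follow the description above, while the function `batchTargetMet` and the document shape `{ bytes, owned }` are hypothetical. The point it illustrates is that ownership plays no part in the check.

```javascript
// Simplified model of the batch-target check. Parameter names follow the
// ticket; the function and the { bytes, owned } document shape are hypothetical.
function batchTargetMet(buffer, params) {
  const stagedBytes = buffer.reduce((sum, doc) => sum + doc.bytes, 0);
  const docTargetMet = buffer.length >= params.targetBatchDocs;
  const byteTargetMet = stagedBytes > params.targetStagedDocBytes;
  // doc.owned is never consulted: orphans fill the buffer like any other document.
  return docTargetMet || byteTargetMet;
}

const batchParams = { targetBatchDocs: 2, targetStagedDocBytes: 1024 };
const orphansOnly = [
  { bytes: 100, owned: false },
  { bytes: 100, owned: false },
];
console.log(batchTargetMet(orphansOnly, batchParams)); // true
```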

      Once the batch target is met, we try to commit the batch. Since the TTLMonitor doesn't remove orphans, orphan documents are 'skipped' and not issued a delete. If all staged deletes were 'successful' (or skipped), the buffer is cleared.

      If the buffer is empty and _passStagingComplete is true, isEOF() is true and the BatchedDeleteStage returns EOF. If _passTargetMet() is true, _passStagingComplete is true. _passTargetMet() is true if the total number of documents staged (this can include orphans) across batches exceeds '_batchedDeleteParams->targetPassDocs'. The TTLMonitor sets 'targetPassDocs' to TTLIndexDeleteTargetDocs.

      If there are more than TTLIndexDeleteTargetDocs that are (1) orphans and (2) expired, the TTLMonitor will repeatedly try to issue the same batch delete with no delete progress. The TTLMonitor can't recover until the orphan documents are cleaned up.
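
The stall can be reproduced in a small simulation. This is a hypothetical, heavily simplified model of one TTL pass over a single index, not the BatchedDeleteStage itself: `runTtlPass`, the `{ id, owned }` entries, and the parameter values are illustrative, and it assumes the pass target counts as met once the staged total reaches 'targetPassDocs'.

```javascript
// Simplified, hypothetical model of one TTL pass over a single TTL index.
// 'entries' is ordered by ttlField (oldest first); each entry is { id, owned }.
function runTtlPass(entries, params) {
  let position = 0; // where the index scan currently is
  let stagedInPass = 0; // counts every staged document, orphans included
  let deletedInPass = 0;

  // Assumption: the pass target is met once stagedInPass reaches targetPassDocs.
  while (position < entries.length && stagedInPass < params.targetPassDocs) {
    const buffer = [];
    while (
      position < entries.length &&
      buffer.length < params.targetBatchDocs &&
      stagedInPass < params.targetPassDocs
    ) {
      buffer.push(entries[position++]);
      stagedInPass++;
    }
    // Commit the batch: owned documents are deleted, orphans are skipped.
    for (const doc of buffer) {
      if (doc.owned) {
        doc.deleted = true;
        deletedInPass++;
      }
    }
  }
  // Deleted documents leave the index; skipped orphans stay at the front.
  const remaining = entries.filter((doc) => !doc.deleted);
  return { remaining, stagedInPass, deletedInPass };
}

// 12 expired orphans ahead of 1 expired owned document, pass target of 10.
const passParams = { targetBatchDocs: 4, targetPassDocs: 10 };
let ttlIndex = [];
for (let i = 0; i < 12; i++) ttlIndex.push({ id: i, owned: false });
ttlIndex.push({ id: 12, owned: true });

for (let pass = 1; pass <= 3; pass++) {
  const result = runTtlPass(ttlIndex, passParams);
  ttlIndex = result.remaining;
  console.log(`pass ${pass}: staged=${result.stagedInPass} deleted=${result.deletedInPass}`);
}

// Once the orphans are cleaned up, the next pass reaches the owned document.
ttlIndex = ttlIndex.filter((doc) => doc.owned);
const afterCleanup = runTtlPass(ttlIndex, passParams);
console.log(`after cleanup: deleted=${afterCleanup.deletedInPass}`);
```

Every pass in the loop spends its whole budget on the same orphans and deletes nothing; only after the orphans are removed from the model does the owned document get deleted.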

      The issue isn't specific to orphans. It can also manifest when a received chunk has expired documents, but the chunk hasn't been committed to the shard yet. 

      The issue isn't specific to orphans on a donor shard. Expired orphan documents on a recipient shard, which belong to a chunk that has yet to be committed, can also block TTL delete progress.

            Assignee:
            Unassigned
            Reporter:
            haley.connelly@mongodb.com Haley Connelly
            Votes:
            0
            Watchers:
            16

              Created:
              Updated: