  1. WiredTiger
  2. WT-8395

Inconsistent data after upgrade from 4.4.3 and 4.4.4 to 4.4.8+ and 5.0.2+

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker - P1
    • Fix Version/s: 4.4.11, 5.0.6, WT11.0.0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Storage Engines
    • 5
    • Storage - Ra 2021-11-15, Storage - Ra 2021-11-29, Storage - Ra 2021-12-13, Storage - Ra 2022-01-10
    • v5.0, v4.4

      Issue Status as of Feb 1, 2022

      ISSUE DESCRIPTION AND IMPACT

      MongoDB versions 4.4.3 and 4.4.4 sometimes record incorrect checkpoint metadata. Starting in versions 4.4.8 and 5.0.2, WiredTiger uses that incorrect metadata at startup, which can lead to data corruption.

      Upgrading directly from MongoDB version 4.4.3 or 4.4.4 to any MongoDB version 4.4.8+ or 5.0.2+ can leave data in an inconsistent state. This ticket tracks the implementation of a safe, direct upgrade path; the fix is included starting in MongoDB versions 4.4.11 and 5.0.6.

      DIAGNOSIS

      This issue can cause a Duplicate Key error on startup that prevents the node from starting.

      However, a node that starts successfully may still be impacted in the following ways:

      • Data inconsistency within documents - specific field values may not correctly reflect writes that were acknowledged to the application prior to the shutdown time, and documents that should have been deleted may still exist.
      • Incomplete query results - lost or inaccurate index entries may cause incomplete query results for queries that use impacted indexes.
      • Missing documents - documents may be lost on impacted nodes.

      Impact on a node that starts successfully can be checked by running the validate command. The output from validate reveals the impact by reporting on inconsistencies found between documents and indexes in the form of:

      • Extra index entries (including duplicate entries in unique indexes)
      • Missing index entries
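
      As a rough illustration of reading that output, the sketch below summarizes a validate result document. It assumes the result exposes the fields `valid`, `errors`, `extraIndexEntries`, and `missingIndexEntries`; check the validate documentation for your server version. The helper names are hypothetical, and `validate_all` assumes `db` is a PyMongo `Database` object.

```python
from typing import Any, Dict


def summarize_validate(result: Dict[str, Any]) -> Dict[str, Any]:
    """Condense one validate result document into the counts of interest.

    Field names are assumptions about the validate output; missing
    fields are treated as empty.
    """
    return {
        "valid": bool(result.get("valid", False)),
        "extra_index_entries": len(result.get("extraIndexEntries", [])),
        "missing_index_entries": len(result.get("missingIndexEntries", [])),
        "errors": list(result.get("errors", [])),
    }


def needs_remediation(summary: Dict[str, Any]) -> bool:
    """True if the summary shows any document/index inconsistency."""
    return (
        not summary["valid"]
        or summary["extra_index_entries"] > 0
        or summary["missing_index_entries"] > 0
    )


def validate_all(db) -> Dict[str, Dict[str, Any]]:
    """Run validate on every collection of a database (assumed PyMongo).

    Views are skipped because validate applies to collections only.
    """
    names = db.list_collection_names(filter={"type": "collection"})
    return {
        name: summarize_validate(db.command("validate", name, full=True))
        for name in names
    }
```

      A full validate can be expensive on large collections, so a sketch like this is best run during a maintenance window and on one node at a time.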

      REMEDIATION AND WORKAROUNDS

      For clusters still on versions 4.4.3 and 4.4.4: it is possible to avoid this issue by upgrading directly to 4.4.11+ or 5.0.6+.

      Our recommended response depends on the version a cluster is currently running:

      • Clusters on versions 4.4.0, 4.4.1, and 4.4.2 are safe to upgrade to 4.4.8+ or 5.0.2+ but should upgrade to recommended versions 4.4.10+ or 5.0.4+.
      • Clusters on versions 4.4.3 or 4.4.4 should upgrade directly to versions 4.4.11+ or 5.0.6+.
      • Clusters running versions 4.4.5-4.4.7 can and should upgrade to 4.4.10+ or 5.0.4+.
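
      The list above can be sketched as a small lookup. This is only an illustration of the decision logic: the function names are hypothetical, version strings are assumed to be plain "major.minor.patch", and versions not named in the list return None rather than a guess.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a plain 'major.minor.patch' string into integers."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return major, minor, patch


def recommended_upgrade(current: str) -> Optional[str]:
    """Return the recommended target versions for a 4.4.0-4.4.7 cluster.

    Mirrors the list in this ticket; returns None for any version the
    list does not cover.
    """
    major, minor, patch = parse_version(current)
    if (major, minor) != (4, 4):
        return None
    if patch in (3, 4):
        # Affected versions: must skip straight to the fixed releases.
        return "4.4.11+ or 5.0.6+"
    if 0 <= patch <= 2 or 5 <= patch <= 7:
        return "4.4.10+ or 5.0.4+"
    return None


if __name__ == "__main__":
    for v in ("4.4.1", "4.4.3", "4.4.6"):
        print(v, "->", recommended_upgrade(v))
```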

      Be aware that WT-7995 affects versions 4.4.2-4.4.8 and requires its own remediation.

      For clusters that have already upgraded to 4.4.8+ from versions 4.4.3 and 4.4.4:

      • If you previously followed remediation steps for WT-7995 and detected corruption, you will have remediated any corruption that occurred as part of this bug.
      • If you have not validated all collections since upgrading to 4.4.8+ from 4.4.3 or 4.4.4, we recommend validating all collections.

      If corruption is detected, data can be recovered from other nodes in the replica set. This may be operationally intensive. If an unaffected node cannot be readily identified, these scripts can assist with the remediation of this bug. Please use these scripts with care, and consult the README thoroughly before use.

            Assignee:
            haribabu.kommi@mongodb.com Haribabu Kommi
            Reporter:
            eric.sedor@mongodb.com Eric Sedor
            Votes:
            2
            Watchers:
            85
