WiredTiger / WT-1129

File close checkpoints can erase application checkpoints.

    • Type: Task
    • Resolution: Done
    • WT2.3.0
    • Affects Version/s: None
    • Component/s: None

      @michaelcahill, thinking more about WT-1126, I think I've introduced problems by no longer taking the checkpoint lock when checkpointing a file during close.

      1. If we have a modified file that's not open for exclusive use (that is, not a bulk-load file), we won't acquire the new WT_DATA_HANDLE spin lock, so the checkpoint's read-modify-write cycle of the file's metadata can race with the close's read-modify-write of the same metadata. There won't be corruption, but one or the other's checkpoint information might disappear from the metadata. It's pretty unlikely, but possible. (The most likely example would be a bulk-load file, and it can't happen there because the new locking will kick in.)
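To make the lost update concrete, here is a minimal single-threaded sketch that replays the two interleavings by hand. Everything in it is hypothetical: `struct file_meta`, the `meta_*` helpers, and the checkpoint names are illustration only and are not WiredTiger's metadata API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define META_MAX 128

/* Hypothetical stand-in for a file's metadata entry: a comma-separated checkpoint list. */
struct file_meta {
    char ckpts[META_MAX];
};

/* Read step: take a private copy of the current metadata value. */
static void meta_read(const struct file_meta *m, char *copy)
{
    snprintf(copy, META_MAX, "%s", m->ckpts);
}

/* Modify step: append one checkpoint name to the private copy. */
static void meta_append(char *copy, const char *name)
{
    size_t len = strlen(copy);

    snprintf(copy + len, META_MAX - len, "%s%s", len == 0 ? "" : ",", name);
}

/* Write step: replace the stored metadata value with the private copy. */
static void meta_write(struct file_meta *m, const char *copy)
{
    snprintf(m->ckpts, META_MAX, "%s", copy);
}

/* Unserialized: both writers read before either writes, so the first update is lost. */
static void run_interleaved(struct file_meta *m)
{
    char api[META_MAX], close_copy[META_MAX];

    meta_read(m, api);
    meta_read(m, close_copy);
    meta_append(api, "api_ckpt");
    meta_append(close_copy, "close_ckpt");
    meta_write(m, api);
    meta_write(m, close_copy); /* Overwrites the API checkpoint's update. */
}

/* Serialized: each read-modify-write cycle completes before the next one starts. */
static void run_serialized(struct file_meta *m)
{
    char copy[META_MAX];

    meta_read(m, copy);
    meta_append(copy, "api_ckpt");
    meta_write(m, copy);

    meta_read(m, copy);
    meta_append(copy, "close_ckpt");
    meta_write(m, copy);
}
```

In the interleaved run the metadata stays well-formed, which matches the "no corruption" point above, but only one of the two checkpoint entries survives.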

      I'm leaning toward some way to serialize checkpoints for a single file handle to fix this. We could use the new WT_DATA_HANDLE spin lock, but we'd have to move it somewhere else in the code – right now, its lock/unlock locations are narrowly targeted at avoiding closing an object when it's being used by a thread walking the open handle list, and it will be messy to use it more generally. Since we won't collide much, I don't think adding the new lock will be a big problem.
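A rough sketch of what a separate per-handle checkpoint lock could look like, assuming a C11 `atomic_flag` used as a spin lock. `struct handle`, `ckpt_lock`, and `handle_checkpoint` are made-up names for illustration; this is not the existing WT_DATA_HANDLE lock or its placement in the code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-handle state; not WiredTiger's WT_DATA_HANDLE. */
struct handle {
    atomic_flag ckpt_lock; /* Serializes checkpoints of this one handle. */
    int ckpt_count;        /* Stand-in for the handle's checkpoint metadata. */
};

static void handle_init(struct handle *h)
{
    atomic_flag_clear(&h->ckpt_lock);
    h->ckpt_count = 0;
}

/* Spin until the flag was previously clear, meaning we now own the lock. */
static void ckpt_lock(struct handle *h)
{
    while (atomic_flag_test_and_set_explicit(&h->ckpt_lock, memory_order_acquire))
        ;
}

static void ckpt_unlock(struct handle *h)
{
    atomic_flag_clear_explicit(&h->ckpt_lock, memory_order_release);
}

/*
 * One checkpoint path used by both the checkpoint API and close: the whole
 * read-modify-write cycle runs under the per-handle lock.
 */
static int handle_checkpoint(struct handle *h)
{
    int snapshot;

    ckpt_lock(h);
    snapshot = h->ckpt_count; /* read */
    snapshot++;               /* modify */
    h->ckpt_count = snapshot; /* write */
    ckpt_unlock(h);
    return snapshot;
}
```

Because the lock lives in the handle, checkpoints of different files never contend with each other; only two checkpoints of the same handle would spin, which should be rare.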

      2. Even if we serialize checkpoints on a single file handle, it's possible in the current code for a close checkpoint to discard a checkpoint created by the checkpoint API. (That's hard to parse. What I mean is: if the checkpoint API creates a physical checkpoint-on-disk "A", and then close creates a new physical checkpoint-on-disk "B", "A" will be deleted if "B" has the same checkpoint name as "A".) Since close always uses the standard checkpoint name, this only applies to standard checkpoints, not named checkpoints.

      I'm not sure if this is bad or not, and if bad, how bad – if this is a problem, I'm inclined to disallow dropping any old checkpoints when writing checkpoints outside of the standard checkpoint API.
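Item 2 and the possible restriction can be sketched together. The list type, `ckpt_add`, and the `drop_same_name` flag below are hypothetical and only model the behavior described here: a new checkpoint deletes older ones with the same name unless the caller is not allowed to drop. How the real code would keep two same-named checkpoints apart is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CKPT_MAX 8

/* Hypothetical checkpoint entry: a name plus an id for the on-disk checkpoint. */
struct ckpt {
    const char *name;
    int id;
};

struct ckpt_list {
    struct ckpt e[CKPT_MAX];
    size_t n;
};

/*
 * Add a checkpoint. When drop_same_name is true, older checkpoints with the
 * same name are deleted first. Returns the number of checkpoints dropped,
 * or -1 if the list is full.
 */
static int ckpt_add(struct ckpt_list *l, const char *name, int id, bool drop_same_name)
{
    size_t i, kept = 0;
    int dropped = 0;

    if (drop_same_name) {
        for (i = 0; i < l->n; i++) {
            if (strcmp(l->e[i].name, name) == 0)
                dropped++;
            else
                l->e[kept++] = l->e[i];
        }
        l->n = kept;
    }
    if (l->n == CKPT_MAX)
        return -1;
    l->e[l->n].name = name;
    l->e[l->n].id = id;
    l->n++;
    return dropped;
}

/* Report whether the on-disk checkpoint with this id is still in the list. */
static bool ckpt_has_id(const struct ckpt_list *l, int id)
{
    size_t i;

    for (i = 0; i < l->n; i++)
        if (l->e[i].id == id)
            return true;
    return false;
}
```

With `drop_same_name` set, a close checkpoint named like the API's standard checkpoint removes "A" when it writes "B". Passing `false` from the close path keeps "A", and named checkpoints are untouched either way.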

      Thoughts?

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            keith.bostic@mongodb.com Keith Bostic (Inactive)