The scenario we want to verify is that if we use an afterClusterTime that is too early, we retry on the SnapshotError. Even if we query after updateTS, we still need to wait long enough for the afterClusterTime we chose to have “aged out.” We can’t guarantee that “aged out” part, and that is what makes the test flaky.
By “aged out,” I mean that WT (WiredTiger) no longer allows us to query the data at that time because it’s too far in the past. The WT history is cleaned up asynchronously, but as soon as the time is advanced (for example, via a w-majority write), WT no longer lets you read that history, even if it hasn’t been deleted yet. The part we can’t guarantee is that when we do the w-majority write, the assigned optime is far enough past updateTS to age out the pre-updateTS history. One thing we could do is, after the explicit sleep(), keep doing reads until we see that the clusterTime returned by mongod has advanced far enough. That doesn’t feel much better than the PR, though. Some amount of waiting is unavoidable unless we can manually adjust the timestamps inside mongod.
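For reference, the "keep reading until the clusterTime has advanced far enough" idea could look roughly like the sketch below. It is only an illustration. `wait_for_cluster_time`, `read_cluster_time`, and `target` are hypothetical names and not part of any driver API. Timestamps are simplified to plain integers, and the caller is assumed to compute `target` as updateTS plus whatever history window applies.

```python
import time
from typing import Callable


def wait_for_cluster_time(
    read_cluster_time: Callable[[], int],
    target: int,
    timeout_s: float = 10.0,
    poll_interval_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the observed cluster time reaches `target`.

    `read_cluster_time` is a hypothetical callable that performs a read
    against mongod and returns the clusterTime from the reply, simplified
    here to an integer. Returns the first observed value >= target, or
    raises TimeoutError if that does not happen within `timeout_s`.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        observed = read_cluster_time()
        if observed >= target:
            return observed
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"cluster time {observed} did not reach {target} "
                f"within {timeout_s}s"
            )
        # Back off briefly before issuing the next read.
        sleep(poll_interval_s)
```

This replaces a fixed sleep with a bounded poll, so the test still waits, but it stops as soon as the condition holds and fails loudly on timeout instead of flaking on the SnapshotError assertion.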