In BF-24403, we observed a system failure during the rollback_to_stable call, with the following error message:
rollback_to_stable illegal with active transactions
The WT_CONNECTION::rollback_to_stable API requires that no transactions are running concurrently with the call. WiredTiger works to ensure that its internal operations don't trigger this failure mode, but the failure is transient provided the concurrent operation finishes, so retrying at least once seems reasonable.
SERVER-63989 proposes to retry the rollback_to_stable call until the system reaches a consistent state. Because the application is expected to handle the error and continue, it makes sense to change the return value in this scenario from EINVAL to EBUSY.
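The retry behaviour described above could look roughly like the following sketch. It is illustrative only and is not the SERVER-63989 implementation: rts_fn, rts_with_retry, and fake_rts are hypothetical names, and the callback stands in for a wrapper around conn->rollback_to_stable(conn, NULL) so that the example compiles without the WiredTiger headers. It assumes the call returns EBUSY while transactions are active, as proposed in this ticket.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the real call. In an application this would wrap
 * conn->rollback_to_stable(conn, NULL) and return its error code.
 */
typedef int (*rts_fn)(void *cookie);

/*
 * rts_with_retry --
 *     Call fn until it returns something other than EBUSY, or until
 *     max_attempts calls have been made. Returns the last result. A real
 *     caller would likely pause between attempts; that is omitted here.
 */
static int
rts_with_retry(rts_fn fn, void *cookie, int max_attempts)
{
    int attempt, ret;

    assert(fn != NULL && max_attempts > 0);

    ret = EBUSY;
    for (attempt = 0; attempt < max_attempts; attempt++) {
        ret = fn(cookie);
        /* Success and non-transient errors (e.g. EINVAL) both end the loop. */
        if (ret != EBUSY)
            break;
    }
    return (ret);
}

/*
 * fake_rts --
 *     Test double: reports EBUSY while the counter is positive, then succeeds.
 */
static int
fake_rts(void *cookie)
{
    int *remaining_busy = cookie;

    if (*remaining_busy > 0) {
        --*remaining_busy;
        return (EBUSY);
    }
    return (0);
}
```

Distinguishing EBUSY from EINVAL is what lets the loop retry only the transient case: any other error still propagates to the caller immediately.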
This ticket should be completed prior to or in parallel with SERVER-63989.
To be done:
- Change the return code in rollback_to_stable_check from EINVAL to EBUSY.
- Review existing rollback_to_stable tests that may be affected by this change. Run the full WiredTiger test suite prior to PR.
- A documentation update may be required.
Has to be done before: SERVER-63989 Retry rollback_to_stable until all concurrent operations finish (Closed)