Using TaskExecutor::sleepFor() has the downside that after a primary failover, a whole new minimumOperationDuration period must elapse before the ReshardingTxnCloners begin again. It would be more efficient to record the Date_t at which the ReshardingTxnCloners can begin as part of the transition to RecipientStateEnum::kCreatingCollection and use sleepUntil() instead. The Date_t can be calculated as TaskExecutor::now() + RecipientStateMachine::_minimumOperationDuration.
for (const auto& txnCloner : _txnCloners) {
    txnClonerFutures.emplace_back(
        executor->sleepFor(minimumOperationDuration, cancelToken)
            .then([executor, cleanupExecutor, cancelToken, txnCloner = txnCloner.get()] {
                return txnCloner->run(executor, cleanupExecutor, cancelToken);
            })
            .share());
}
Work on this ticket will also involve:
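- Adding a new optional field to the ReshardingRecipientDocument IDL struct, declared as {type: date, optional: true}. The field would presumably be named startConfigTxnCloneAt so that the generated getter is getStartConfigTxnCloneAt().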
- Adding a new boost::optional<Date_t> _startConfigTxnCloneAt member to RecipientStateMachine. It is initialized from ReshardingRecipientDocument::getStartConfigTxnCloneAt() in the constructor, and assigned the value TaskExecutor::now() + RecipientStateMachine::_minimumOperationDuration following the transition to RecipientStateEnum::kCreatingCollection.
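
To illustrate why a persisted deadline behaves better across failover than a relative sleep, here is a minimal standalone sketch. It does not use the server's types: RecipientDoc and RecipientSketch are hypothetical stand-ins for ReshardingRecipientDocument and RecipientStateMachine, std::optional and std::chrono replace boost::optional<Date_t>, and remainingWait() is an invented helper that stands in for the sleepUntil() call. The guard against overwriting an existing deadline is also an assumption, not something this ticket specifies.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using Clock = std::chrono::system_clock;
using TimePoint = Clock::time_point;
using Millis = std::chrono::milliseconds;

// Hypothetical stand-in for the persisted ReshardingRecipientDocument.
struct RecipientDoc {
    std::optional<TimePoint> startConfigTxnCloneAt;  // {type: date, optional: true}
};

// Hypothetical stand-in for RecipientStateMachine.
class RecipientSketch {
public:
    // The constructor picks up any deadline already stored in the document,
    // which is what lets a newly elected primary resume the original wait.
    RecipientSketch(const RecipientDoc& doc, Millis minimumOperationDuration)
        : _minimumOperationDuration(minimumOperationDuration),
          _startConfigTxnCloneAt(doc.startConfigTxnCloneAt) {
        assert(minimumOperationDuration >= Millis{0});
    }

    // Models the transition to kCreatingCollection: compute the deadline once
    // and write it to the document (assumption: never overwrite an existing one).
    void onCreatingCollection(TimePoint now, RecipientDoc& doc) {
        if (!_startConfigTxnCloneAt) {
            _startConfigTxnCloneAt = now + _minimumOperationDuration;
            doc.startConfigTxnCloneAt = _startConfigTxnCloneAt;
        }
    }

    // How long the txn cloners still have to wait at time `now`.
    // A sleepUntil(deadline) call would wait exactly this long.
    Millis remainingWait(TimePoint now) const {
        if (!_startConfigTxnCloneAt || *_startConfigTxnCloneAt <= now) {
            return Millis{0};
        }
        return std::chrono::duration_cast<Millis>(*_startConfigTxnCloneAt - now);
    }

private:
    Millis _minimumOperationDuration;
    std::optional<TimePoint> _startConfigTxnCloneAt;
};
```

With a 5 second minimumOperationDuration and a failover 3 seconds after the transition, a state machine rebuilt from the persisted document waits only the remaining 2 seconds, whereas the sleepFor() approach would wait a full 5 seconds again.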