Type: Bug
Resolution: Won't Fix
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None
Assigned Teams: Cluster Scalability
Operating System: ALL
Sprint: Cluster Scalability 2024-4-1, Cluster Scalability 2024-4-15
Linked BF Score: 8

By default, the resharding transaction cloner persists its progress only once every 1000 entries. In stepdown suites, a failover is triggered every 8 seconds. In very slow variants (e.g. TSAN debug), the cloner may not process enough entries to reach a progress checkpoint before the next failover occurs, so it resumes from the same point after each failover and never makes progress (see BF-32013 for an example of this in practice).

We should reduce the batch size to 1 in stepdown suites so that the cloner is guaranteed to make progress even when the system is very slow.
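
The failure mode can be illustrated with a toy model. This is only a sketch under stated assumptions, not the cloner's actual implementation: the function name, the fixed per-interval throughput, and the rule that finishing all entries counts as persisted are all hypothetical. It assumes progress is checkpointed every `batch_size` entries and that each failover discards any work done since the last checkpoint.

```python
def entries_persisted(total_entries, batch_size, entries_per_interval, max_intervals):
    """Return how many entries are durably recorded after up to max_intervals failovers.

    Hypothetical model: progress is persisted every `batch_size` entries, and each
    failover throws away everything processed since the last persisted checkpoint.
    """
    persisted = 0
    for _ in range(max_intervals):
        if persisted >= total_entries:
            break
        # Work done in this interval, resuming from the last persisted checkpoint.
        reached = min(persisted + entries_per_interval, total_entries)
        if reached == total_entries:
            # Assumption: completing the whole job is itself persisted.
            persisted = reached
        else:
            # Only whole batches survive the failover at the end of the interval.
            persisted += ((reached - persisted) // batch_size) * batch_size
    return persisted


# A slow variant that manages 800 entries between failovers never reaches the
# 1000-entry checkpoint, so nothing is ever persisted.
print(entries_persisted(5000, batch_size=1000, entries_per_interval=800, max_intervals=100))

# With a batch size of 1, every processed entry is persisted and the job completes.
print(entries_persisted(5000, batch_size=1, entries_per_interval=800, max_intervals=100))
```

In this model, any batch size larger than the number of entries processed between two failovers stalls indefinitely, while a batch size of 1 makes progress whenever at least one entry is processed per interval.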

Assignee: Brett Nawrocki

Reporter: Brett Nawrocki

Participants: Brett Nawrocki

Votes: 0

Watchers: 2

Created: Mar 26 2024 07:55:50 PM UTC

Updated: Apr 11 2024 08:34:35 PM UTC

Resolved: Apr 11 2024 08:34:34 PM UTC
