Type: Task
Resolution: Done
Priority: Major - P3
Affects Version/s: 5.1.0
Component/s: Sharding
Sprint: Sharding 2021-12-13, Sharding 2021-12-27
As part of PM-2423 we have been measuring the performance of the migration protocol. We saw some unexpected numbers related to the cloning of sessions that would be interesting to understand.
Environment
3-shard sharded cluster running 5.1 binaries.
Experiment
We create a sharded collection with an initial pre-split of 1K chunks. The shard key is a hashed random number. After that we bulk insert 1K documents using retryable writes.
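For illustration, the setup looks roughly like the following pymongo sketch. This is not the actual Genny workload; the connection string, the "test.coll" namespace, and the shard key field "k" are assumptions made for the example.

```python
# Minimal sketch of the collection setup, assuming a mongos at localhost:27017
# and a hypothetical namespace "test.coll" with shard key field "k".
import random
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?retryWrites=true")  # mongos
admin = client.admin

admin.command("enableSharding", "test")
# Pre-split into 1K chunks on a hashed shard key.
admin.command({
    "shardCollection": "test.coll",
    "key": {"k": "hashed"},
    "numInitialChunks": 1000,
})

# Bulk insert 1K documents with random shard key values. With retryWrites=true
# these inserts run as retryable writes, which is what produces the per-session
# history that the migration protocol later has to clone.
docs = [{"k": random.randint(0, 2**31)} for _ in range(1000)]
client.test.coll.insert_many(docs)
```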
After that we execute a few thousand random migrations. There are no CRUD operations during this phase.
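The random-migration phase could be driven with something like the sketch below. Again this is hypothetical rather than the Genny workload itself; it assumes the same mongos and namespace as above, and the 5.0+ config.chunks format where chunks are keyed by collection uuid.

```python
# Sketch of the random moveChunk phase against an assumed "test.coll" namespace.
import random
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # mongos
admin, config = client.admin, client.config

# On 5.0+ config.chunks references the collection by uuid rather than namespace.
coll_entry = config.collections.find_one({"_id": "test.coll"})
shard_ids = [s["_id"] for s in config.shards.find()]

for _ in range(4000):  # a few thousand random migrations
    # Pick a random chunk, re-reading config.chunks so the owning shard is current.
    chunk = next(config.chunks.aggregate([
        {"$match": {"uuid": coll_entry["uuid"]}},
        {"$sample": {"size": 1}},
    ]))
    destination = random.choice([s for s in shard_ids if s != chunk["shard"]])
    admin.command({
        "moveChunk": "test.coll",
        "bounds": [chunk["min"], chunk["max"]],
        "to": destination,
    })
```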
You can check the Genny workload we executed here.
Results
You can find our results here.
We are plotting two different variables:
- the total execution time spent holding the critical section during the migration (the catch-up phase of the migration).
- the total execution time spent holding the critical section while blocking reads and writes (the commit phase of the migration).
The interesting time is the first one; the second one is more or less constant. We can see a 30x slowdown between the first moveChunks and the last ones on my machine, after 4K moveChunks. We also got some numbers on EVG; you can see them on the different tabs.
Is related to:
- SERVER-62233 Make SessionCatalogMigrationSource handleWriteHistory filter out oplogs outside of the chunkRange with OpType 'n' (Closed)