Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None
Assigned Teams: Sharding NYC
Operating System: ALL
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

While investigating https://jira.mongodb.org/browse/PERF-3818, I identified a regression under mongo::OpObserverRegistry::onUpdate: it accounts for 10.17% of the profile in 7.0 but only 5.11% in 6.0 (with the 6.0 profile containing fewer samples). The regression is partly due to a new observer (QueryAnalysisOpObserver), partly due to more work being done in ClusterServerParameterOpObserver and OpObserverImpl, and partly due to increased overhead across all observers.

Should look into:

What is the shared overhead? (Could it be due to replacing the cached nss with a virtual coll->ns() call?)
Could any of the observers be trimmed or skipped in a more efficient way for updateMany?
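
To make the second question concrete, here is a minimal sketch of one way observers could be skipped for a multi-document update: decide once per batch which observers care about the namespace, then run only those in the per-document loop. All names here (FilterableObserver, caresAbout, FilteringRegistry, updateMany) are hypothetical and are not the actual mongo::OpObserver API. The sketch also assumes that an observer's interest depends only on the namespace, which may not hold for the real observers.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical observer interface; NOT the real mongo::OpObserver.
class FilterableObserver {
public:
    virtual ~FilterableObserver() = default;
    // Hypothetical hook: does this observer need to see updates to `ns`?
    virtual bool caresAbout(const std::string& ns) const = 0;
    virtual void onUpdate(const std::string& ns, int docId) = 0;
};

// Illustrative observer that only reacts to a single namespace.
class SingleNsObserver : public FilterableObserver {
public:
    SingleNsObserver(std::string ns, int& calls) : _ns(std::move(ns)), _calls(calls) {}
    bool caresAbout(const std::string& ns) const override { return ns == _ns; }
    void onUpdate(const std::string&, int) override { ++_calls; }

private:
    std::string _ns;
    int& _calls;  // counts how many updates reached this observer
};

class FilteringRegistry {
public:
    void add(std::unique_ptr<FilterableObserver> o) { _observers.push_back(std::move(o)); }

    // Evaluate the namespace guard once and return only the interested observers.
    std::vector<FilterableObserver*> interestedIn(const std::string& ns) const {
        std::vector<FilterableObserver*> active;
        for (const auto& o : _observers) {
            if (o->caresAbout(ns)) active.push_back(o.get());
        }
        return active;
    }

    // The filter is hoisted out of the per-document loop, so uninterested
    // observers cost nothing per updated document.
    void updateMany(const std::string& ns, const std::vector<int>& docIds) {
        const auto active = interestedIn(ns);
        for (int id : docIds) {
            for (auto* o : active) o->onUpdate(ns, id);
        }
    }

private:
    std::vector<std::unique_ptr<FilterableObserver>> _observers;
};
```

With this shape, an observer whose guard fails is consulted once per updateMany instead of once per document. Whether that hoisting is safe depends on each observer's real semantics, which this sketch does not model.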

I've added local std::chrono-based logging in the loop of OpObserverRegistry::onUpdate and saw the following:

v6.0 (ns/doc)	Observer (v6.0)	v7.0 (ns/doc)	Observer (v7.0)	Comment
100.64	AuthOpObserver	173.72	AuthOpObserver	Calls 'AuthorizationManagerImpl::logOp' and 'audit::logUpdateOperation'
53.1	ClusterServerParameterOpObserver	105.66	ClusterServerParameterOpObserver	Moved updateParameter to recoveryUnit
53.86	FcvOpObserver	64.9	FcvOpObserver	No changes between releases. Does little.
447.16	OpObserverImpl	618.46	OpObserverImpl	Multiple changes
55.2	PrimaryOnlyServiceOpObserver	47.24	PrimaryOnlyServiceOpObserver	Does nothing on update in both releases
56.4	ShardSplitDonorOpObserver	111.78	QueryAnalysisOpObserver	New in 7.0. The design seems wrong for updateMulti
59.38	TenantMigrationDonorOpObserver	70.24	TenantMigrationDonorOpObserver	Checks whether nss is the one it cares about and does nothing if not
58.66	TenantMigrationRecipientOpObserver	56.66	TenantMigrationRecipientOpObserver	Checks whether nss is the one it cares about and does nothing if not
93.78	UserWriteBlockModeOpObserver	79.92	UserWriteBlockModeOpObserver	Calls '_checkWriteAllowed()' then checks whether nss is the one it cares about and does nothing if not
978.18	Total	1328.58	Total	

The numbers are nanoseconds per updated document, averaged over an updateMany that touched 50K documents. The last row is the sum across all observers. In my local tests the sum for 6.0 fluctuated between 900 and 1000. Note that even observers that haven't changed and do very little in this benchmark scenario (essentially a guard check or two before bailing out), such as TenantMigrationDonorOpObserver, are still generally a little slower in 7.0. Could this be due to more contention on the instruction cache, or something similar?
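
For reference, the kind of std::chrono-based measurement described above can be sketched roughly as follows. This is not the actual patch used for these numbers: SketchObserver, CountingObserver and TimedRegistry are hypothetical stand-ins for the real observer types, and the real onUpdate takes different arguments.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an op observer; NOT the real mongo::OpObserver interface.
class SketchObserver {
public:
    virtual ~SketchObserver() = default;
    virtual std::string name() const = 0;
    virtual void onUpdate(int docId) = 0;
};

// Trivial observer used for illustration: it only counts calls.
class CountingObserver : public SketchObserver {
public:
    CountingObserver(std::string name, int& calls) : _name(std::move(name)), _calls(calls) {}
    std::string name() const override { return _name; }
    void onUpdate(int) override { ++_calls; }

private:
    std::string _name;
    int& _calls;
};

// Wraps each observer call in the loop with steady_clock timestamps and
// accumulates the elapsed nanoseconds per observer name.
class TimedRegistry {
public:
    void add(std::unique_ptr<SketchObserver> o) { _observers.push_back(std::move(o)); }

    void onUpdate(int docId) {
        using Clock = std::chrono::steady_clock;
        for (auto& o : _observers) {
            const auto start = Clock::now();
            o->onUpdate(docId);
            const auto elapsed = Clock::now() - start;
            _totalNs[o->name()] +=
                std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
        }
        ++_docs;
    }

    // Average nanoseconds per updated document for one observer (0 if unknown).
    double avgNsPerDoc(const std::string& name) const {
        const auto it = _totalNs.find(name);
        if (it == _totalNs.end() || _docs == 0) return 0.0;
        return static_cast<double>(it->second) / static_cast<double>(_docs);
    }

    std::int64_t docs() const { return _docs; }

private:
    std::vector<std::unique_ptr<SketchObserver>> _observers;
    std::map<std::string, std::int64_t> _totalNs;
    std::int64_t _docs = 0;
};
```

One caveat with this style of measurement: the two clock reads are not free, so for observers that do almost nothing the reported per-document figure likely includes a noticeable share of timer overhead. The relative comparison between releases is more meaningful than the absolute values.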

depends on

SERVER-77758 Break down query analysis op observer depending on cluster role. (Closed)
SERVER-77364 Speed up OpObservers with a filter framework in OpObserverRegistry (Closed)

related to

SERVER-77373 Use OpStateAccumulator's to cache common state (Backlog)
SERVER-77364 Speed up OpObservers with a filter framework in OpObserverRegistry (Closed)

Assignee: Kshitij Gupta (Inactive)
Reporter: Irina Yatsenko (Inactive)
Participants: Irina Yatsenko, Kshitij Gupta, Matt Kneiser
Votes: 0
Watchers: 18

Created: May 11 2023 09:56:01 PM UTC
Updated: Jul 27 2023 08:29:02 PM UTC
Resolved: Jul 27 2023 08:29:02 PM UTC