- Type: Bug
- Resolution: Works as Designed
- Priority: Major - P3
- None
- Affects Version/s: 7.3.0-rc0
- Component/s: None
- Catalog and Routing
- ALL
- CAR Team 2024-02-19, CAR Team 2024-03-04
Context:
- A patch fails consistently on Evergreen (the failure is not present in a released version of the server or on the main branch).
- The server is tested in multi-tenant mode.
The patch content:
- The server is changed to acquire a stronger tenant-level lock when the change stream pre-images collection is created or dropped.
- The test verifies that the lock is actually taken and that an insert operation on the change stream pre-images collection blocks.
- There are no changes elsewhere except some instrumentation to help the investigation.
Defect symptoms:
- The test jstests/serverless/change_stream_pre_images_collection_concurrency.js in the patch triggers a failure on the secondary node (https://parsley.mongodb.com/resmoke/bb0cdbc06d7a6dd9c7e43d3585b6338f/test/17ab7bfbfdf347d24d5f30a1a41ed5fa?bookmarks=0%2C321%2C12276%2C15034&filters=100d25791&selectedLineRange=L12253-L12276&shareLine=12253) at the point where an update to a test collection is applied, which entails writing to the change stream pre-images collection.
- The defect is not reliably reproducible on a workstation, but it seems to fail consistently on Evergreen, so it looks like a race condition. (I was able to reproduce it on one instance of my workstation but not on a new one, which is why I am passing the investigation to the owners of CollectionCatalog.)
- It seems that the CollectionCatalog state changes between https://github.com/10gen/mongo/blob/e627a7d75870a18ed4dea1f6b7d874597d45d5ed/src/mongo/db/change_stream_pre_images_collection_manager.cpp#L229-L235 and https://github.com/10gen/mongo/blob/e627a7d75870a18ed4dea1f6b7d874597d45d5ed/src/mongo/db/change_stream_pre_images_collection_manager.cpp#L242-L244: the change stream pre-images collection does not exist at https://github.com/10gen/mongo/blob/e627a7d75870a18ed4dea1f6b7d874597d45d5ed/src/mongo/db/change_stream_pre_images_collection_manager.cpp#L229-L235, but then appears at https://github.com/10gen/mongo/blob/e627a7d75870a18ed4dea1f6b7d874597d45d5ed/src/mongo/db/change_stream_pre_images_collection_manager.cpp#L237-L239. However, the change stream pre-images collection should be present in the CollectionCatalog at all times, since it has been replicated previously. Thus there are two problems: the creation of the change stream pre-images collection is not visible, and the CollectionCatalog state seems to change when it should not. Note that change_stream_serverless_helpers::isChangeStreamEnabled() queries the CollectionCatalog to check if the change stream pre-images collection exists.
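To make the suspected symptom concrete, here is a minimal single-threaded sketch of how two existence checks can disagree when each one re-reads the latest catalog version, and how they agree when both read one pinned snapshot. It is an illustration only: ToyCatalog, Observation, checkTwiceFromLatest, checkTwiceFromOneSnapshot and the namespace string are hypothetical names, not MongoDB's CollectionCatalog API, and the sketch does not claim to show what the server actually does or what the root cause is. The concurrent writer is simulated by a callback that runs between the two checks.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a versioned catalog.
// This is NOT MongoDB's CollectionCatalog; it only models "immutable
// snapshots replaced on every write".
class ToyCatalog {
public:
    using Snapshot = std::shared_ptr<const std::set<std::string>>;

    ToyCatalog() : _latest(std::make_shared<const std::set<std::string>>()) {}

    // Returns the immutable snapshot that is current right now.
    Snapshot latest() const { return _latest; }

    // Publishes a new version that additionally contains `ns`.
    // Snapshots handed out earlier are left unchanged.
    void create(const std::string& ns) {
        auto next = std::make_shared<std::set<std::string>>(*_latest);
        next->insert(ns);
        _latest = std::move(next);
    }

private:
    Snapshot _latest;
};

struct Observation {
    bool firstCheck;   // did the namespace exist at the first check?
    bool secondCheck;  // did the namespace exist at the second check?
};

// Each check re-reads the latest version, so a write that lands in
// between (simulated by `between`) makes the two checks disagree.
template <typename F>
Observation checkTwiceFromLatest(ToyCatalog& catalog, const std::string& ns, F between) {
    bool first = catalog.latest()->count(ns) > 0;
    between();
    bool second = catalog.latest()->count(ns) > 0;
    return {first, second};
}

// Both checks read the same pinned snapshot, so they always agree,
// even if a newer version is published in between.
template <typename F>
Observation checkTwiceFromOneSnapshot(ToyCatalog& catalog, const std::string& ns, F between) {
    ToyCatalog::Snapshot snapshot = catalog.latest();
    bool first = snapshot->count(ns) > 0;
    between();
    bool second = snapshot->count(ns) > 0;
    return {first, second};
}
```

In this toy model, calling checkTwiceFromLatest with a callback that creates the namespace yields "absent, then present", which mirrors the observation reported above; checkTwiceFromOneSnapshot yields "absent, absent" for the same callback. Whether the real code paths read from one catalog instance or from two is exactly what the investigation would need to establish.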
Hypotheses rejected:
- Application of oplog entries seems to be correct: creation of the change stream pre-images collection happens in its own batch before the update operation is applied.
is depended on by:
- SERVER-78440 Acquire tenant lock in Exclusive (X) mode when creating/dropping change stream pre-images collection (Open)