Type: Bug
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: 8.1.0-rc0, 5.0.29, 6.0.18, 7.0.14, 8.0.0
Component/s: None
Query Execution
ALL
v8.0, v7.0, v6.0, v5.0
The agg_merge_when_not_matched_insert.js test performs moveChunk operations concurrently with document inserts on the same collection. All of the documents are inserted into the same chunk, so we have seen a few failures where range deletions exceed the migration timeout.
More detailed explanation
The concurrency test starts with a sharded collection named 'agg_merge_when_not_matched_insert' with shard key {_id: 1}. The collection is initially partitioned into two ranges (one per thread): [Min, 50) and [50, Max).
There are two states: moveChunk and aggregate. The first moves a chunk from one shard to another, and the second inserts 100 new documents into the 'agg_merge_when_not_matched_insert' collection through a $merge aggregation.
These inserts set the _id field to an object, so all of the inserted documents fall into the [50, Max) chunk, because in the BSON comparison order a number always sorts before an object.
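The cross-type routing can be illustrated with a small self-contained sketch. This is plain JavaScript, not MongoDB's actual BSON comparator; the type ranks below are a simplified subset of the documented BSON comparison order:

```javascript
// Simplified subset of MongoDB's BSON type comparison order:
// numbers sort before strings, which sort before objects. So any
// object-valued _id compares greater than any numeric split point.
function bsonTypeRank(v) {
  if (typeof v === "number") return 1; // all numeric types share one rank
  if (typeof v === "string") return 2;
  if (typeof v === "object" && v !== null) return 3;
  throw new Error("type not modeled in this sketch");
}

// Compare two shard key values the way chunk routing would (sketch):
// first by type rank, then by value within the same type.
function compareKeys(a, b) {
  const ra = bsonTypeRank(a), rb = bsonTypeRank(b);
  if (ra !== rb) return ra - rb;
  return a < b ? -1 : a > b ? 1 : 0;
}

// An object _id, like those produced by the test's $merge stage, always
// lands above the numeric split point 50, i.e. in the [50, Max) chunk.
const objectId = { tid: 0, count: 42 }; // illustrative field names
console.log(compareKeys(objectId, 50) > 0); // true
```

Every document therefore routes to the [50, Max) chunk regardless of the numeric values inside the object, while [Min, 50) only ever accumulates deletions to clean up.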
Proposal
To avoid exceeding the migration timeout due to slow range deletions on the range [Min, 50), I suggest:
- Modify the $merge operation to set the _id field directly to $_id, and store the other variables in a different field.
- Either reduce the number of iterations or reduce the partition size to decrease the number of documents in the collection.
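The first suggestion could look roughly like the pipeline below, written here as a plain JavaScript object for illustration. The field names (`tid`, `count`, `meta`) and variable references are hypothetical stand-ins for the test's actual bookkeeping values, not its real identifiers:

```javascript
// Hypothetical $merge pipeline for the proposed fix: keep the numeric
// _id as the shard key value, and move the per-thread bookkeeping into
// a separate subdocument instead of embedding it in _id.
const proposedPipeline = [
  {
    $project: {
      _id: "$_id", // preserve the numeric _id so documents spread across chunks
      meta: { tid: "$$tid", count: "$$count" }, // illustrative extra fields
    },
  },
  {
    $merge: {
      into: "agg_merge_when_not_matched_insert",
      whenMatched: "replace",
      whenNotMatched: "insert",
    },
  },
];
console.log(proposedPipeline.length); // 2
```

With a numeric _id, inserts are split between [Min, 50) and [50, Max), so neither range-deletion side falls far behind a migration.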