- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Catalog and Routing
On secondaries, some collections (replicated collections and the oplog) should be read at the last applied timestamp, whereas others (non-replicated collections) should be read at latest. We have logic in both the shard role API and the auto getters to change the read source if needed.
However, the behavior appears to differ between the two paths. In the auto getters, we only consider the primary namespace when deciding whether to change the read source, whereas in the shard role we check all namespaces, with an early exit if any namespace requires kLastApplied. It is unclear to me which of these behaviors (if either) is desired. I think answering this requires first deciding what read source we should use for an operation that spans both a replicated (kLastApplied) and a non-replicated (kNoTimestamp) collection. For example, because we use the shard role for aggregations, the following aggregation currently uses kNoTimestamp even though one of the collections (the oplog) requires reading at last applied. I would expect the same to be true in reverse for the opposite lookup.
assert.commandWorked(
    secondary.getDB("local").getCollection("foo").insertMany([{"op": 'c'}, {"op": "n"}]));
let res = secondary.getDB("local").getCollection("foo").aggregate([
    {$lookup: {"from": "oplog.rs", localField: "op", foreignField: 'op', as: "entries"}}
]);
Additionally, it is not clear to me whether it is safe that the auto getters do not check whether the read source needs to be changed for sub-operations. I guess this is in line with the fact that they only check the primary namespace, but I am not sure that is correct. I think the equivalent question for the shard role is whether it is correct to not check the read source if the catalog snapshot has already been acquired.