Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Query Integration
Currently, queries on views are handled in four separate ways:
- "Normal" queries on views: runAggregateOnView is called. It resolves the view namespace to the underlying namespace and appends the request pipeline to the end of the view pipeline, then calls runAggregate with the expanded aggregate request and the resolved namespace.
- $collStats queries on non-timeseries collections: runAggregateOnView is never called, because the $collStats stage supports running on a view namespace, e.g. via this helper.
- $collStats queries on timeseries collections: runAggregateOnView is called (expanding the request and resolving the namespace), because we need the underlying namespace to get the underlying bucket associated with the timeseries collection.
- $search/$vectorSearch/$searchMeta queries on views: runAggregateOnView is called to resolve the underlying namespace, but the request is not expanded to include the view pipeline, because _idLookup applies the view pipeline.
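The four flows above can be summarized as a small decision table. The following is only a toy sketch for discussion, not server code: `QueryKind`, `ViewHandling`, `classify` and `effectivePipeline` are hypothetical names invented here, and pipelines are modeled as plain strings.

```cpp
#include <string>
#include <vector>

// Hypothetical toy model of the four flows; none of these types exist in the server.
enum class QueryKind { Normal, CollStats, Search };

struct ViewHandling {
    bool callsRunAggregateOnView;  // is the view-specific entry point used?
    bool resolvesNamespace;        // is the underlying namespace substituted?
    bool prependsViewPipeline;     // is the request expanded with the view pipeline?
};

// Maps a query kind (and whether the collection is timeseries) to its handling,
// following the four bullets in the ticket description.
ViewHandling classify(QueryKind kind, bool isTimeseries) {
    switch (kind) {
        case QueryKind::Normal:
            return {true, true, true};
        case QueryKind::CollStats:
            // Only timeseries collections need the underlying namespace.
            return isTimeseries ? ViewHandling{true, true, true}
                                : ViewHandling{false, false, false};
        case QueryKind::Search:
            // Namespace is resolved, but the view pipeline is left to _idLookup.
            return {true, true, false};
    }
    return {true, true, true};  // unreachable for valid enum values
}

// Builds the pipeline that would actually run: the request pipeline is appended
// to the end of the view pipeline only when the flow expands the request.
std::vector<std::string> effectivePipeline(const std::vector<std::string>& viewPipeline,
                                           const std::vector<std::string>& requestPipeline,
                                           const ViewHandling& handling) {
    std::vector<std::string> result;
    if (handling.prependsViewPipeline) {
        result = viewPipeline;
    }
    result.insert(result.end(), requestPipeline.begin(), requestPipeline.end());
    return result;
}
```

In this model, only the search flow calls the view entry point without expanding the request, which is the asymmetry this ticket proposes to investigate.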
To reduce the number of execution flows for queries on views, we should investigate not calling runAggregateOnView for search queries on views. For this approach, getInvolvedNamespaces() would need to always return the current operation namespace (in addition to the referenced foreign namespaces). This way, the view nss + {view pipeline, underlying collection, UUID} would automatically be saved to expCtx's _resolvedNamespaces map (instead of being handled separately in search_helpers). The UUID could then always be pulled from this map for search queries on views in DocumentSourceInternalMongotRemote. Otherwise, the collection ptr for MainCollection can be null for a view, and we will never enter this if block (i.e. never set the UUID on the expCtx).
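The proposal can be illustrated with a toy model of the resolved-namespaces lookup. This is a sketch under stated assumptions, not the server's implementation: `ResolvedNamespace`, `NamespaceMap`, `involvedNamespaces`, `buildResolvedNamespaces` and `uuidForSearch` are hypothetical stand-ins, and namespaces and UUIDs are modeled as plain strings.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for what would be stored per namespace in the map.
struct ResolvedNamespace {
    std::string underlyingNss;              // underlying collection of the view
    std::vector<std::string> viewPipeline;  // view pipeline stages, as strings
    std::optional<std::string> uuid;        // UUID of the underlying collection
};

using NamespaceMap = std::map<std::string, ResolvedNamespace>;

// Toy version of the involved-namespaces computation. With includeCurrent set,
// the current operation namespace is returned alongside the foreign ones,
// which is the behavior change the ticket proposes.
std::set<std::string> involvedNamespaces(const std::string& currentNss,
                                         const std::set<std::string>& foreignNss,
                                         bool includeCurrent) {
    std::set<std::string> result = foreignNss;
    if (includeCurrent) {
        result.insert(currentNss);
    }
    return result;
}

// Resolves every involved namespace known to the (toy) catalog.
NamespaceMap buildResolvedNamespaces(const NamespaceMap& catalog,
                                     const std::set<std::string>& involved) {
    NamespaceMap resolved;
    for (const auto& nss : involved) {
        auto it = catalog.find(nss);
        if (it != catalog.end()) {
            resolved.emplace(nss, it->second);
        }
    }
    return resolved;
}

// What a search stage would do under the proposal: read the UUID from the map
// instead of relying on a collection pointer that can be null for a view.
std::optional<std::string> uuidForSearch(const NamespaceMap& resolved,
                                         const std::string& viewNss) {
    auto it = resolved.find(viewNss);
    if (it == resolved.end()) {
        return std::nullopt;
    }
    return it->second.uuid;
}
```

In this model, the view's UUID is only available to `uuidForSearch` when the current operation namespace is part of the involved set; otherwise the lookup returns an empty optional, which mirrors the "UUID never set" case described above.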