- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Fully Compatible
- 135
Ad-hoc tests suggest that we can improve performance of last point queries by changing our query rewrite.
Consider the following user-specified query:
db.telemetry.aggregate([
  {$sort: {"metadata.sensorId": 1, "timestamp": 1}},
  {$group: {
    _id: "$metadata.sensorId",
    ts: {$last: "$timestamp"},
    temp: {$last: "$temp"}
  }}
]);
which we usually would rewrite to:
db.system.buckets.telemetry.aggregate([
  {$sort: {"meta.sensorId": 1, "control.max.timestamp": -1}},
  {$group: {
    _id: "$meta.sensorId",
    bucket: {$first: "$_id"},
  }},
  {$lookup: {
    from: "system.buckets.telemetry",
    foreignField: "_id",
    localField: "bucket",
    as: "bucket_data",
    pipeline: [
      {$_internalUnpackBucket: {
        timeField: "timestamp",
        metaField: "tags",
        bucketMaxSpanSeconds: NumberInt("60")
      }},
      {$sort: {"timestamp": -1}},
      {$limit: 1}
    ]
  }},
  {$unwind: "$bucket_data"},
  {$replaceWith: {
    _id: "$_id",
    ts: "$bucket_data.timestamp",
    temp: "$bucket_data.temp"
  }}
]);
We actually can get the same results with slightly better runtime by avoiding the $lookup using the following alternative rewrite:
db.system.buckets.telemetry.aggregate([
  {$sort: {"meta.sensorId": 1, "control.max.timestamp": -1}},
  {$group: {
    _id: "$meta.sensorId",
    bucket: {$first: "$_id"},
    control: {$first: "$control"},
    meta: {$first: "$meta"},
    data: {$first: "$data"}
  }},
  {$_internalUnpackBucket: {
    timeField: "timestamp",
    metaField: "meta",
    bucketMaxSpanSeconds: NumberInt("60")
  }},
  {$sort: {"meta.sensorId": 1, "timestamp": -1}},
  {$group: {
    _id: "$meta.sensorId",
    ts: {$first: "$timestamp"},
    temp: {$first: "$temp"}
  }}
]);
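To illustrate why selecting one bucket per sensor before unpacking can give the same answer as unpacking everything, here is a minimal plain-JavaScript sketch. It does not call any MongoDB API. The in-memory bucket layout, the numeric timestamps, and the helper names (unpack, lastPointNaive, lastPointViaBuckets, toResults) are hypothetical and only loosely mirror the fields used in the pipelines above.

```javascript
// Hypothetical in-memory stand-ins for bucket documents (not real server data).
// Timestamps are plain numbers to keep the sketch short.
const buckets = [
  { _id: "b1", meta: { sensorId: "A" },
    control: { max: { timestamp: 20 } },
    data: { timestamp: [10, 20], temp: [70, 71] } },
  { _id: "b2", meta: { sensorId: "A" },
    control: { max: { timestamp: 40 } },
    data: { timestamp: [40, 30], temp: [73, 72] } },
  { _id: "b3", meta: { sensorId: "B" },
    control: { max: { timestamp: 15 } },
    data: { timestamp: [5, 15], temp: [60, 61] } },
];

// Rough stand-in for $_internalUnpackBucket: one measurement per array index.
function unpack(bucket) {
  return bucket.data.timestamp.map((timestamp, i) => ({
    meta: bucket.meta,
    timestamp,
    temp: bucket.data.temp[i],
  }));
}

// Shape the per-sensor winners like the final $group output, ordered by _id.
function toResults(latest) {
  return [...latest.entries()]
    .map(([id, m]) => ({ _id: id, ts: m.timestamp, temp: m.temp }))
    .sort((a, b) => a._id.localeCompare(b._id));
}

// Reference behaviour: unpack every bucket, keep the latest point per sensor.
function lastPointNaive(bucketList) {
  const latest = new Map();
  for (const m of bucketList.flatMap(unpack)) {
    const cur = latest.get(m.meta.sensorId);
    if (!cur || m.timestamp > cur.timestamp) latest.set(m.meta.sensorId, m);
  }
  return toResults(latest);
}

// Rewrite idea: per sensor, keep only the bucket with the largest
// control.max.timestamp, then unpack just those buckets.
function lastPointViaBuckets(bucketList) {
  const best = new Map();
  for (const b of bucketList) {
    const cur = best.get(b.meta.sensorId);
    if (!cur || b.control.max.timestamp > cur.control.max.timestamp) {
      best.set(b.meta.sensorId, b);
    }
  }
  return lastPointNaive([...best.values()]);
}
```

In this toy data set both functions return the same per-sensor results, while the second unpacks only one bucket per sensor. The sketch says nothing about actual server performance; it only models the selection logic.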
This optimization was suggested by a comment from david.percy on the tech design for PM-2330:
The tweak described above improved runtime from 210ms to 140ms in my tests with a debug build on the following data set: https://gist.github.com/starzia/9d1f8a25a2e2e2124b78e2da71159602
However, genny tests showed even larger latency improvements: more than a 7x speedup.
- is duplicated by
  - SERVER-61659 TS Last Point opt: final test plan review (Closed)