Here are the investigation results for BF-23718:
- They are the same issue, which is related to the new SBE spilling behavior.
- The pipeline in both issues has a $group stage whose group-by key is $obj.str. All 200 documents have different values for that key, so there are 200 groups. The HashAgg checks once every 100 groups whether the estimated memory usage exceeds the allowed maximum. So, at the 200th group, the HashAgg finds that the memory usage is over the configured value and errors out when allowDiskUse is off.
- Then why doesn’t mongod-5.0 fail? Because mongod-5.0 takes the classic engine’s DocumentSourceGroup code path, where the default maximum memory usage is 100 * 1024 * 1024 (100 MB). In the SBE HashAggStage, the default maximum memory usage is only 1024 * 1024 (1 MB).
DocumentSourceGroup’s default maximum memory usage:
https://github.com/mongodb/mongo/blob/master/src/mongo/db/query/query_knobs.idl#L330-L338
HashAggStage’s default maximum memory usage:
https://github.com/mongodb/mongo/blob/master/src/mongo/db/query/query_knobs.idl#L527-L536
So, my recommendation is to update HashAggStage’s default maximum memory usage to 100 * 1024 * 1024 (100 MB), the same as DocumentSourceGroup’s default.
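
To make the arithmetic concrete, here is a rough Python sketch of the periodic check described above. It is not the actual HashAggStage code: the function name, the exception, and especially the 6 KiB per-group size are assumptions chosen only so that the estimate stays under 1 MB at the 100th group and crosses it at the 200th. The real stage estimates memory differently.

```python
CHECK_INTERVAL = 100                       # groups between memory checks (from the analysis above)
SBE_DEFAULT_LIMIT = 1024 * 1024            # 1 MB, HashAggStage default
CLASSIC_DEFAULT_LIMIT = 100 * 1024 * 1024  # 100 MB, DocumentSourceGroup default


class MemoryLimitExceeded(Exception):
    """Hypothetical error standing in for the failure seen with allowDiskUse off."""


def simulate_group(num_groups, bytes_per_group, limit, allow_disk_use=False):
    """Return the group count at which spilling would start, or None if never.

    The estimate (groups * bytes_per_group) is a simplification; it is only
    compared against the limit once every CHECK_INTERVAL groups.
    """
    for groups in range(1, num_groups + 1):
        if groups % CHECK_INTERVAL != 0:
            continue  # no check between intervals
        estimated = groups * bytes_per_group
        if estimated > limit:
            if not allow_disk_use:
                raise MemoryLimitExceeded(
                    f"{estimated} bytes > {limit} bytes at group {groups}"
                )
            return groups  # spilling would begin here
    return None


ASSUMED_BYTES_PER_GROUP = 6 * 1024  # hypothetical, not measured

# Under the 100 MB default, 200 groups stay within the limit.
print(simulate_group(200, ASSUMED_BYTES_PER_GROUP, CLASSIC_DEFAULT_LIMIT))

# Under the 1 MB default, the check at the 200th group fails.
try:
    simulate_group(200, ASSUMED_BYTES_PER_GROUP, SBE_DEFAULT_LIMIT)
except MemoryLimitExceeded as exc:
    print("failed:", exc)
```

Under these assumed numbers, the same 200-group input passes with the 100 MB limit and fails with the 1 MB limit, which matches the difference seen between mongod-5.0 and the SBE path.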