A CPU profile collected while investigating a slow $geoIntersects query, run against GeoJSON documents containing polygons with thousands of edges, showed that we spend nearly 87.5% of the CPU time validating that each polygon is closed and that its inner loops represent "holes" (see attached flame chart). Most of the profiled time is spent in S2Loop::Contains().
The goal of this ticket is to implement the skipValidation flag so that it bypasses geometry validation when we execute $geoIntersects queries and there is a 2dsphere index on the stored geometries. Work was done under SERVER-15204 to skip validation, but it did not cover this case. Why is this safe? We already call GeometryContainer::parseFromStorage when generating S2 index keys, so the validation has already been performed by the time the geometry is indexed.
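The reasoning above can be illustrated with a toy model. This is only a sketch in shell-style JavaScript, not the server's C++ code: parsePolygon, isClosedRing, generateIndexKeys, and parseForQuery are hypothetical names, and the only check modelled is ring closure.

```javascript
// Toy model of "validate when indexing, skip when reading".
// All function names here are hypothetical; this is not the server's C++ code.

// A ring is closed when it has at least 4 positions and first === last.
function isClosedRing(ring) {
  const first = ring[0];
  const last = ring[ring.length - 1];
  return ring.length >= 4 && first[0] === last[0] && first[1] === last[1];
}

// Parse a GeoJSON polygon; validation is bypassed when skipValidation is true.
function parsePolygon(geometry, { skipValidation = false } = {}) {
  if (!skipValidation) {
    for (const ring of geometry.coordinates) {
      if (!isClosedRing(ring)) {
        throw new Error("polygon ring is not closed");
      }
    }
  }
  return { type: "Polygon", rings: geometry.coordinates };
}

// Index-key generation always validates, so an invalid geometry never
// makes it into the index.
function generateIndexKeys(doc) {
  const parsed = parsePolygon(doc.geometry);
  return parsed.rings.map((ring, i) => `ring-${i}`); // placeholder keys
}

// Read path: skip validation only when the geometry is known to be indexed.
function parseForQuery(storedGeometry, hasSphereIndex) {
  return parsePolygon(storedGeometry, { skipValidation: hasSphereIndex });
}
```

In this model the expensive check runs once, when index keys are generated, and is not repeated for every document fetched by a query.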
The performance issue can be reproduced by downloading the attached dump.tgz, restoring it into a local mongod with mongorestore, and running the following queries.
use BTGST
db.Germany_bbox.find({'bbox.0': {$lt: 6.7767088}, 'bbox.1': {$lt: 51.2217392}, 'bbox.2': {$gt: 6.7767088}, 'bbox.3': {$gt: 51.2217392}, 'geometry': {'$geoIntersects': {'$geometry': {'type': 'Point', 'coordinates': [6.7767088, 51.2217392]}}}}, {_id: 0, 'properties.TARIFF_LAYER_ID': 1}).explain("executionStats")
Preliminary hacking showed that the latency of the FETCH stage went from ~4s to ~2s if we skipped validation during reads.
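To compare FETCH stage latency before and after a change, the explain output can be searched for the stage of interest. The helper below is a minimal sketch: findStages is a hypothetical name, and sampleExplain is a made-up, trimmed-down plan used only to show the shape of the data (real output contains many more fields).

```javascript
// Recursively collect all stages with the given name from an
// executionStats stage tree (stages nest via inputStage / inputStages).
function findStages(stage, name, found = []) {
  if (!stage) return found;
  if (stage.stage === name) found.push(stage);
  if (stage.inputStage) findStages(stage.inputStage, name, found);
  for (const child of stage.inputStages || []) {
    findStages(child, name, found);
  }
  return found;
}

// Hypothetical sample shaped like explain("executionStats") output.
// The numbers are illustrative, not measurements.
const sampleExplain = {
  executionStats: {
    executionStages: {
      stage: "PROJECTION",
      executionTimeMillisEstimate: 4010,
      inputStage: {
        stage: "FETCH",
        executionTimeMillisEstimate: 4000,
        inputStage: { stage: "IXSCAN", executionTimeMillisEstimate: 10 },
      },
    },
  },
};

const fetchStages = findStages(sampleExplain.executionStats.executionStages, "FETCH");
for (const s of fetchStages) {
  console.log(`${s.stage}: ~${s.executionTimeMillisEstimate} ms`);
}
```

In the shell, the same helper could be applied to the result of the explain("executionStats") call from the reproduction steps, passing its executionStats.executionStages field.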
is related to
- SERVER-20843 $geoIntersects performs poorly on polygons with lots of coordinates (Backlog)
- SERVER-15204 Skip validation for stored geometry if a 2dsphere index exists (Closed)