- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- Fix Version/s: None
- Affects Version/s: 3.0.5
- Component/s: Querying
- Labels: None
- Operating System: ALL
Querying one hour's worth of data (about 10,000 documents) takes more than 20 minutes, and these queries can use up the server's 100 GB of memory.
Details:
I set up a server with the WiredTiger engine, configured like this:

wiredTiger:
  engineConfig:
    cacheSizeGB: 100
    statisticsLogDelaySecs: 0
    journalCompressor: snappy
    directoryForIndexes: true
  collectionConfig:
    blockCompressor: snappy
  indexConfig:
    prefixCompression: true
I have a collection named 'order'. The documents look like:

{
    order_id: 485548,
    order_status: [
        { update_time: ISODate("2015-10-28T02:45:00Z"), status: 1 },
        { update_time: ISODate("2015-10-28T03:18:00Z"), status: 2 }
    ]
}
I built an index:
"key" : {"order_status.update_time" : -1, "order_status.status" : -1}
Then I run this query:

db.order.find({
    "order_status": {
        "$elemMatch": {
            "update_time": {
                "$gte": ISODate("2015-10-28T02:00:00Z"),
                "$lt": ISODate("2015-10-28T03:00:00Z")
            },
            "status": 2
        }
    }
}).explain();
I get:

"indexBounds" : {
    "order_status.update_time" : [ "[new Date(1446001200000), true)" ],
    "order_status.status" : [ "[2.0, 2.0]" ]
}
I guess this means it scans all data before '2015-10-28T03:00:00Z', which is more than 40 GB.
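One way to confirm how much is actually scanned (a sketch using the standard explain output fields) is to rerun the query with executionStats verbosity:

var pred = { "order_status": { "$elemMatch": {
    "update_time": { "$gte": ISODate("2015-10-28T02:00:00Z"),
                     "$lt": ISODate("2015-10-28T03:00:00Z") },
    "status": 2 } } };
db.order.find(pred).explain("executionStats")
// A large gap between executionStats.totalKeysExamined /
// totalDocsExamined and executionStats.nReturned would confirm
// that far more than the matching hour of data is being scanned.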
After I change the query to:

db.order.find({
    "order_status": {
        "$elemMatch": {
            "update_time": { "$gte": ISODate("2015-10-28T02:00:00Z") },
            "status": 2
        }
    }
}).explain();
I get:

"indexBounds" : {
    "order_status.update_time" : [ "[new Date(9223372036854775807), new Date(1445997600000)]" ],
    "order_status.status" : [ "[2.0, 2.0]" ]
}
This looks better and runs much faster (it finishes within a second), but how can I get my original query to work?
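If the slowness comes from plan selection (per the SERVER-20616 duplicate below) rather than from the bounds themselves, pinning the query to the index with hint() might be worth trying; a sketch, not a confirmed fix:

db.order.find({
    "order_status": {
        "$elemMatch": {
            "update_time": {
                "$gte": ISODate("2015-10-28T02:00:00Z"),
                "$lt": ISODate("2015-10-28T03:00:00Z")
            },
            "status": 2
        }
    }
}).hint({ "order_status.update_time": -1, "order_status.status": -1 })
// hint() only forces which index is used; it does not tighten the
// index bounds reported in the first explain() output above.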
Issue Links:
- duplicates: SERVER-20616 Plan ranker sampling from the beginning of a query's execution can result in poor plan selection (Backlog)
- related to: SERVER-15086 Allow for efficient range queries over non-array fields in multikey indices (Closed)