In some cases a null index key is generated for a document that a general query for null should not match. This is done specifically so that SERVER-3377 can be implemented efficiently.
Because of these null keys, the fast count optimization must be disabled when matching against null.
Test forthcoming.
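A minimal sketch of the discrepancy, assuming fast count works by counting documents that have an index key in the queried range without running the matcher on each document. The keys are copied from the Background section of this ticket, and the match result for the mixed document follows the analogous { a:[ { b:2 }, { } ] } example there. The names docs, fastCount and exactCount are hypothetical, not server identifiers.

```javascript
// Index keys for { 'a.b': 1 } and match results for query { 'a.b': null },
// taken from the examples in this ticket (not computed by a real server).
const docs = [
  { doc: {},                    keys: [null],    matchesNull: true  },
  { doc: { a: [{ b: 5 }] },     keys: [5],       matchesNull: false },
  { doc: { a: [{ b: 5 }, {}] }, keys: [5, null], matchesNull: false },
];

// Assumed fast count behaviour: count every document with a null key.
const fastCount = docs.filter((d) => d.keys.includes(null)).length;

// Exact count: count only documents the matcher accepts.
const exactCount = docs.filter((d) => d.matchesNull).length;

console.log({ fastCount, exactCount });
```

The third document carries a null key but does not match the query, so under these assumptions the key-based count is one higher than the exact count.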
Background
Index Key Extraction
In many cases, indexing a nested field within an array uses an existing value from the document. E.g., for index
{ 'a.b':1 }
document { a:[ { b:5 } ] } -> produces index key 'a.b':5
{ a:[ { b:5 }, { b:6 } ] } -> two keys 'a.b':5, 'a.b':6
If the nested field is missing, however, a null value is stored in the index:
document { } (no 'a' field present) -> produces index key 'a.b':null
{ a:[ ] } -> 'a.b':null
{ a:[ {} ] } -> 'a.b':null
{ a:[ { x:1 } ] } -> 'a.b':null
{ a:[ 7 ] } -> 'a.b':null
In the case where some array values have an 'a.b' field and some do not, a mixture of null and non null index keys is produced:
{ a:[ { b:5 }, {} ] } -> 'a.b':5, 'a.b':null
{ a:[ { b:5 }, { b:6 }, 99 ] } -> 'a.b':5, 'a.b':6, 'a.b':null
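The extraction rules above can be modelled roughly as follows. This is a simplified sketch, not the server's key generator: it handles only the fixed path 'a.b', does not expand the case where 'b' is itself an array, and returns keys in document order with null last. The names isPlainObject and extractKeys are hypothetical.

```javascript
// Simplified model of key extraction for index { 'a.b': 1 }.
function isPlainObject(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v);
}

function extractKeys(doc) {
  // Treat a non-array (or absent) 'a' as a single element.
  const elements = Array.isArray(doc.a) ? doc.a : [doc.a];
  const keys = [];
  // An empty array has no element that could supply 'b', so it gets a null key.
  let needsNullKey = elements.length === 0;
  for (const el of elements) {
    if (isPlainObject(el) && 'b' in el) {
      if (!keys.includes(el.b)) keys.push(el.b); // existing value, deduplicated
    } else {
      needsNullKey = true; // 'b' is missing for this element
    }
  }
  if (needsNullKey && !keys.includes(null)) keys.push(null);
  return keys;
}

// Keys 5, 6 and null, as in the last example above.
console.log(extractKeys({ a: [{ b: 5 }, { b: 6 }, 99] }));
```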
Query Matching Semantics
For a simple query, a request for null will match missing values only if there are no non-missing values for the key. A request for null will, however, always match an explicit null value. For query
{ 'a.b':null }
{ } (empty document) matches
{ a:[ ] } matches
{ a:[ 88 ] } matches
{ a:[ { b:2 } ] } does not match because there is an existing value of 'a.b'
{ a:[ { b:2 }, { } ] } does not match because there is an existing value of 'a.b', even though there is also a missing 'b' within another array element
{ a:[ { b:2 }, { b:null } ] } does match because there is an explicit null value of 'a.b'.
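These semantics can be expressed as a small predicate. This is a sketch under the rules stated above for the fixed path 'a.b' only, not the server's matcher. The name matchesDottedNull is hypothetical.

```javascript
// Model of query { 'a.b': null } against a single document.
function matchesDottedNull(doc) {
  const elements = Array.isArray(doc.a) ? doc.a : [doc.a];
  let sawValue = false;
  for (const el of elements) {
    const hasB =
      el !== null && typeof el === 'object' && !Array.isArray(el) && 'b' in el;
    if (hasB) {
      if (el.b === null) return true; // an explicit null always matches
      sawValue = true;                // an existing non-null value of 'a.b'
    }
  }
  // Missing values match only when no existing value was found.
  return !sawValue;
}

console.log(matchesDottedNull({ a: [{ b: 2 }, {}] }));          // false
console.log(matchesDottedNull({ a: [{ b: 2 }, { b: null }] })); // true
```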
However, the $elemMatch operator will restrict matching to individual array elements. If a single array element has a missing 'b', the document will match null. For query
{ a:{ $elemMatch:{ b:null } } }
{ a:[ { b:2 } ] } does not match because the only value of b is 2
{ a:[ { b:2 }, { } ] } matches because there is a missing value of 'b' in the second array entry (note difference from non elemMatch query example above)
{ a:[ { b:2 }, { b:null } ] } does match because there is an explicit null value of 'a.b'.
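The per-element behavior of $elemMatch can be sketched in the same way. This assumes that only array fields are eligible and that only object elements are tested against the nested query; those two points are assumptions of the sketch, not something stated in this ticket. The name matchesElemMatchNull is hypothetical.

```javascript
// Model of query { a: { $elemMatch: { b: null } } } against a single document.
function matchesElemMatchNull(doc) {
  if (!Array.isArray(doc.a)) return false; // assumption: 'a' must be an array
  // Each element is judged on its own: a missing or null 'b' is enough.
  return doc.a.some(
    (el) =>
      el !== null &&
      typeof el === 'object' &&
      !Array.isArray(el) &&
      (!('b' in el) || el.b === null)
  );
}

console.log(matchesElemMatchNull({ a: [{ b: 2 }] }));     // false
console.log(matchesElemMatchNull({ a: [{ b: 2 }, {}] })); // true
```

Note that { a:[ { b:2 }, { } ] } matches here but not under the plain { 'a.b':null } query, which is why a null key has to exist for that document.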
This behavior was requested in SERVER-3377.
- is duplicated by: SERVER-4491 count() on a find() returns scanned count instead of result count (Closed)
- is related to: SERVER-6293 Index only query fills in missing values with null (Closed)
- related to: SERVER-4717 consider removing fast count mode, otherwise refactor it (Closed)