- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Optimization
I was investigating SERVER-93369 and was shocked to find a lock acquisition at this point in the code. It looks like the parser for $text wants the index specification of any applicable text index in order to (1) validate its presence, (2) find out the index version, and (3) find out the index's default language. Looking a little further on, all of this appears to serve one purpose: choosing a tokenizer so the parser can work out which terms to search for.
This feels like a layering violation. Parsing shouldn't really care about these things, or at least not in this way. I think there are a couple of fixes worth investigating:
- Steps in the right direction, probably easier:
  - Pass in a reference to a collection pointer. This at least makes it more obvious to callers that the lookup is happening and needed. In the main command path cases, I would expect that we already hold a collection lock that we can thread through. I've attempted this in a branch here, but it got pretty large, and some callsites in unit tests and C++ benchmarks did not have a collection. Those were probably not concerned with $text and will need a workaround.
  - Resolve the FTS indexes before parsing the query, and pass the parser a resolved tokenizer. Maybe we could even put a tokenizer on the ExpressionContext, since it seems much like a collation?
- And one different suggestion, less well thought out:
  - Don't try to tokenize or understand the query directly at parse time. Instead, defer tokenization of the query text until we later find the index and bind to it. I would be interested to compare the validation steps here to the geo index validations.
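To make the second suggestion concrete, here is a minimal sketch of what "resolve first, then hand the parser a tokenizer" could look like. Every name in it (`TextIndexInfo`, `Tokenizer`, `ParseContext`, `makeParseContext`, `parseTextQuery`) is a hypothetical stand-in invented for illustration and is not the server's actual API; the whitespace tokenizer is a placeholder for the real FTS tokenizer, which would depend on the index version and language.

```cpp
#include <cctype>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: the three facts the parser currently reads from the catalog.
struct TextIndexInfo {
    int version;
    std::string defaultLanguage;
};

// Hypothetical tokenizer. It only lowercases and splits on whitespace.
class Tokenizer {
public:
    explicit Tokenizer(TextIndexInfo info) : _info(std::move(info)) {}

    std::vector<std::string> tokenize(const std::string& text) const {
        std::vector<std::string> terms;
        std::istringstream in(text);
        std::string word;
        while (in >> word) {
            for (char& c : word) {
                c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
            }
            terms.push_back(word);
        }
        return terms;
    }

    const std::string& language() const { return _info.defaultLanguage; }

private:
    TextIndexInfo _info;
};

// Hypothetical slice of an ExpressionContext-like object: the tokenizer
// rides along with the context, much as a collator would.
struct ParseContext {
    std::optional<Tokenizer> textTokenizer;
};

struct ParsedTextQuery {
    std::vector<std::string> terms;
    std::string language;
};

// Assumed to run in the command path, where the collection lock is already
// held. This is the only place that looks at index metadata.
ParseContext makeParseContext(const std::optional<TextIndexInfo>& textIndex) {
    ParseContext ctx;
    if (textIndex) {
        ctx.textTokenizer.emplace(*textIndex);
    }
    return ctx;
}

// The parser itself takes no lock and never touches the catalog.
ParsedTextQuery parseTextQuery(const ParseContext& ctx, const std::string& search) {
    if (!ctx.textTokenizer) {
        throw std::invalid_argument("no text index resolved for $text query");
    }
    return {ctx.textTokenizer->tokenize(search), ctx.textTokenizer->language()};
}
```

In this shape, callers that never use $text (such as the unit tests and benchmarks mentioned above) would build a context with no tokenizer and only hit the error if they actually parse a $text predicate.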
- is related to
  - SERVER-93369 Fix lock ordering in TextMatchExpression (Closed)