- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: 2.4.4
- Component/s: Text Search
- Query Integration
Hi all,
I'm using MongoDB text search and would like to give some feedback. I'm not sure what the best channel for that is, so I've filed this report. If there is a preferred way, please let me know so I can use it in the future.
Based on this document: http://docs.mongodb.org/manual/tutorial/create-text-index-on-multi-language-collection/, I've put together a test case, and I don't understand the results.
This is my test data:
{ "_id" : 1, "language" : "portuguese", "quote" : "A sorte protege os audazes" }
{ "_id" : 2, "language" : "spanish", "quote" : "Nada hay más surreal que la realidad." }
{ "_id" : 3, "language" : "english", "quote" : "is this a dagger which I see before me" }
{ "_id" : 4, "language" : "dutch", "quote" : "is dit een dolk die ik voor mij zie" }
{ "_id" : 5, "language" : "dutch", "quote" : "vol verbijstering zaten de dames naar de twee honden te kijken" }
Right now I'm most interested in finding the Dutch results.
It seems like the stemmer is not working for some words:
> db.quotes.runCommand( "text", { search: "honden", language:"dutch" } )
Correct result: 1 (queryDebugString: 'hond')
> db.quotes.runCommand( "text", { search: "hond", language:"dutch" } )
Correct result: 1 (queryDebugString: 'hond')
> db.quotes.runCommand( "text", { search: "dames", language:"dutch" } )
Correct result: 1 (queryDebugString: 'dames')
> db.quotes.runCommand( "text", { search: "dame", language:"dutch" } )
Incorrect result: 0 (queryDebugString: 'dam')
Note that the plural of hond ('dog') is honden ('dogs').
The plural of dame ('lady') is dames ('ladies').
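However, MongoDB text search doesn't seem to understand the dame/dames pair: the query 'dame' is stemmed to 'dam', while 'dames' stays 'dames', so the two never match and the search returns nothing. This looks like a bug to me.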
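To make the mismatch concrete, here is a minimal sketch in plain JavaScript. It does not run the actual stemmer; the stem table simply copies the queryDebugString values reported above. The helper name stemsMatch is hypothetical, and the sketch assumes that an indexed word is stemmed the same way as a query term.

```javascript
// Stems copied from the queryDebugString output in this report.
// They are observed values, not computed by this snippet.
const observedStems = {
  honden: "hond",
  hond: "hond",
  dames: "dames",
  dame: "dam",
};

// Hypothetical helper: assumes a query term can only match an indexed
// word when both reduce to the same stem.
function stemsMatch(queryTerm, indexedWord) {
  return observedStems[queryTerm] === observedStems[indexedWord];
}

console.log(stemsMatch("hond", "honden")); // true: both stem to 'hond'
console.log(stemsMatch("dame", "dames"));  // false: 'dam' vs 'dames'
```

Under that assumption, the singular and plural forms of 'hond' agree on a stem, while 'dame' and 'dames' do not, which is consistent with the zero results seen for the 'dame' query.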
- is related to: SERVER-9537 Full text search in Dutch does incorrect stemming for words that end with "sen" (Closed)