Core Server / SERVER-29466

Cannot query over fields containing invalid UTF-8

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: 3.4.4
    • Component/s: None
    • Query
    • ALL
    • Query 2017-10-02

      A string search or comparison query like the following fails to return any documents that contain special (unprintable?) characters:

      Comparison: { "string": "substring ���콻�� substring" } 
      Search: { "string": /.*substring.*/i } 
      

      Example document:

      { 
          "_id" : ObjectId("593708460121722b9463f7a1"), 
          "string" : "substring ���콻�� substring"
      }
      

      This makes it impossible to directly find such documents in my database, which is very annoying. In case you wonder why they exist: such special characters occasionally occur in user-generated data in my DB. Their presence is not a bug; it is allowed by design.

      The original document was saved using the C driver. If you copy and paste it from here to insert into a database, the bug will not show, because the unprintable characters seem to become valid UTF-8 characters, which are fine (e.g. there are no issues when searching a string containing only Chinese characters).
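
The copy-and-paste effect described above can be illustrated with a minimal Node.js sketch. It assumes a hypothetical byte sequence (not the bytes of the reporter's actual document) and uses only the standard `TextDecoder` and `Buffer` APIs. It shows that decoding invalid UTF-8 for display replaces each bad byte with U+FFFD, so the re-encoded text is valid UTF-8 but no longer byte-identical to what was stored.

```javascript
// Hypothetical stored value: "substring " + invalid UTF-8 bytes + " substring".
// 0xFF never appears in well-formed UTF-8, and 0xC0 0x80 is an overlong encoding.
const original = Buffer.concat([
  Buffer.from("substring ", "utf8"),
  Buffer.from([0xff, 0xc0, 0x80]),
  Buffer.from(" substring", "utf8"),
]);

// Returns true when the bytes are well-formed UTF-8.
function isValidUtf8(bytes) {
  try {
    // With fatal: true the decoder throws instead of substituting U+FFFD.
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

// Simulates displaying the value and pasting it back: lossy decode, then re-encode.
function roundTrip(bytes) {
  const text = new TextDecoder("utf-8").decode(bytes); // invalid bytes become U+FFFD
  return Buffer.from(text, "utf8");
}

const pasted = roundTrip(original);

console.log(isValidUtf8(original));   // false
console.log(isValidUtf8(pasted));     // true
console.log(original.equals(pasted)); // false
```

Because the pasted value differs from the stored bytes, a document inserted from the text shown on this page would not reproduce the problem, which is consistent with the reporter's observation.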

            Assignee: Backlog - Query Team (Inactive)
            Reporter: Bastian Suter
            Votes: 0
            Watchers: 12
