  1. Spark Connector
  2. SPARK-421

Review and merge PR for "Issue with isJsonObjectOrArray method" fix from customer

    • Type: Task
    • Resolution: Fixed
    • Priority: Unknown
    • 10.3.0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Java Drivers
    • Not Needed

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?


      TLDR: A user provided a fix for an issue they discovered. Review, test, and merge the code if valid.

      USER REPORTED ISSUE:

      The MongoDB Spark Connector has a bug in the method isJsonObjectOrArray.
      File: RowToBsonDocumentConverter.java
      Method: isJsonObjectOrArray (Line: 221)
      Ref: https://github.com/mongodb/mongo-spark/blob/main/src/main/java/com/mongodb/spark/sql/connector/schema/RowToBsonDocumentConverter.java?#L221

      Issue: The code always assumes the string is not empty and accesses its characters by index (0/1). When the data contains an empty string '', the conversion fails with the error: Can not convert to Bson, Index out of range
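
      The failure described above can be reproduced in isolation. The following is a minimal sketch, not the connector's actual source: the class EmptyStringRepro and the method unguardedCheck are hypothetical names, and the constant values '{' and '}' are assumed. It only shows that an indexed character read on an empty string throws before any comparison happens.

```java
final class EmptyStringRepro {
    // Assumed values for illustration; the real constants live in the connector.
    static final char JSON_OBJECT_START = '{';
    static final char JSON_OBJECT_END = '}';

    // Hypothetical stand-in for a check that reads characters without a length guard.
    static boolean unguardedCheck(final String data) {
        char firstChar = data.charAt(0); // throws StringIndexOutOfBoundsException on ""
        char lastChar = data.charAt(data.length() - 1);
        return firstChar == JSON_OBJECT_START && lastChar == JSON_OBJECT_END;
    }

    // Returns true when the unguarded check throws for an empty string.
    static boolean throwsOnEmpty() {
        try {
            unguardedCheck("");
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("non-empty object: " + unguardedCheck("{\"a\": 1}"));
        System.out.println("empty string throws: " + throwsOnEmpty());
    }
}
```

      In the connector, an exception of this kind would presumably surface wrapped in the conversion error the user quotes.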

      Suggested Fix: It's just a boolean method that checks whether the value should be converted to BSON. Would simply returning false on empty strings be enough?

      Alternatively, provide an easier way to convert ONLY a specified column to BSON. We only need this for one column (Id to ObjectId), but because that is not supported, we have to use the top-level option that applies to all objects/arrays.

       

      USER PROVIDED FIX
      I tried the fix below on a cloned repo and tested it. It seems to fix this issue.
      I added the first 3 lines below to the method isJsonObjectOrArray.

      Is it possible to get this fix applied to the mongo-spark connector repo and get an updated .jar file?

       

      private static boolean isJsonObjectOrArray(final String data) {
        if (data == null || data.isEmpty() || data.length() < 2) {
          return false;
        }
        char firstChar = data.charAt(0);
        char lastChar = data.charAt(data.length() - 1);
        return (firstChar == JSON_OBJECT_START && lastChar == JSON_OBJECT_END)
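
      The snippet above is cut off in the ticket, so here is a self-contained sketch of how a guarded check along these lines might look. It is an illustration under assumptions, not the merged fix. The class name JsonShapeCheck is hypothetical, as are the JSON_ARRAY_START and JSON_ARRAY_END constants and all four constant values. The array branch is inferred only from the method name. The guard uses data.length() < 2 alone, since that already covers the empty string.

```java
final class JsonShapeCheck {
    // Assumed values; the array constants are hypothetical names for this sketch.
    private static final char JSON_OBJECT_START = '{';
    private static final char JSON_OBJECT_END = '}';
    private static final char JSON_ARRAY_START = '[';
    private static final char JSON_ARRAY_END = ']';

    static boolean isJsonObjectOrArray(final String data) {
        // Strings shorter than 2 characters cannot be "{}" or "[]";
        // this also rejects null and the empty string before any charAt call.
        if (data == null || data.length() < 2) {
            return false;
        }
        char firstChar = data.charAt(0);
        char lastChar = data.charAt(data.length() - 1);
        return (firstChar == JSON_OBJECT_START && lastChar == JSON_OBJECT_END)
            || (firstChar == JSON_ARRAY_START && lastChar == JSON_ARRAY_END);
    }

    public static void main(String[] args) {
        String[] samples = {"", "{", "{}", "[1, 2]", "plain text"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isJsonObjectOrArray(s));
        }
    }
}
```

      Note that a check like this only looks at the first and last characters. It does not validate that the string is well-formed JSON.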
      
         

       
       
      Reference 
       

            Assignee:
            ross@mongodb.com Ross Lawley
            Reporter:
            prakul.agarwal@mongodb.com Prakul Agarwal
            Votes:
            0
            Watchers:
            3
