Spark Connector / SPARK-352

ObjectIds are not supported BsonTypes and can lead to duplications

    • Type: New Feature
    • Resolution: Duplicate
    • Priority: Critical - P2
    • Affects Version/s: 10.0.2
    • Component/s: Schema

      When documents whose _id is an ObjectId are read and then written back to update them, they are duplicated instead.

      This is caused by the lack of support for the BsonObjectId type, which is read as a String. As a result, the written documents no longer match the stored ones by _id (String != ObjectId), so they are inserted as new documents rather than updating the existing ones.

      Example:

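      // Read the collection: ObjectId _id values are loaded as String.
      Dataset<Row> data = sparkSession.read().format("mongodb").option("connection.uri", mongoUri).load();
      // Write the same rows back: the String _id values no longer match the stored ObjectIds.
      data.write().format("mongodb").option("connection.uri", mongoUri).mode(SaveMode.Append).save();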
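
The mismatch described above can be illustrated without a MongoDB instance. The following is a minimal sketch, not connector code: `FakeObjectId`, `collection` and `sizeAfterWrite` are hypothetical names, and a plain `Map` stands in for a collection keyed by `_id`. It only shows why a String key carrying the same hex value does not match a typed id, so the write adds a second entry instead of replacing the first.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

public class ObjectIdMismatchSketch {
    // Hypothetical stand-in for an ObjectId: a typed wrapper around a hex string.
    static final class FakeObjectId {
        private final String hex;

        FakeObjectId(String hex) {
            this.hex = hex;
        }

        @Override
        public boolean equals(Object other) {
            // Equal only to another FakeObjectId with the same hex, never to a String.
            return other instanceof FakeObjectId && ((FakeObjectId) other).hex.equals(hex);
        }

        @Override
        public int hashCode() {
            return Objects.hash(hex);
        }
    }

    // Hypothetical stand-in for a collection holding one document keyed by a typed _id.
    static Map<Object, String> collection() {
        Map<Object, String> docs = new LinkedHashMap<>();
        docs.put(new FakeObjectId("507f1f77bcf86cd799439011"), "original");
        return docs;
    }

    // Writes one document by _id and returns the resulting document count:
    // an equal key replaces the existing entry, any other key adds a new one.
    static int sizeAfterWrite(Object id) {
        Map<Object, String> docs = collection();
        docs.put(id, "rewritten");
        return docs.size();
    }

    public static void main(String[] args) {
        String hex = "507f1f77bcf86cd799439011";
        // Same type and value: the existing document is replaced.
        System.out.println(sizeAfterWrite(new FakeObjectId(hex))); // prints 1
        // Same hex value but as a String: no match, so a duplicate is added.
        System.out.println(sizeAfterWrite(hex)); // prints 2
    }
}
```

In this simplified model, writing with the typed id leaves one document, while writing with the String form leaves two, which mirrors the duplication reported here.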

            Assignee: Ross Lawley (ross@mongodb.com)
            Reporter: Cedric van Eetvelde (cedric.vaneetvelde@soprabanking.com)
