Type: Bug
Resolution: Fixed
Priority: Unknown
Affects Version/s: None
Component/s: None
What did I use
- Databricks Runtime Version 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)
- org.mongodb.spark:mongo-spark-connector:10.0.1
- MongoDB 5.0
What did I do
I tried to load the following documents from MongoDB into Databricks:
[{ "_id": { "$oid": "6289e26430540f2e5db55f3c" }, "username": "tmp_user_3", "attributes": [] },{ "_id": { "$oid": "6289e26430540f2e5db55f3f" }, "username": "tmp_user_4", "attributes": null },{ "_id": { "$oid": "6289e26430540f2e5db55f3d" }, "username": "tmp_user_2", "attributes": [ { "key": "c", "value": 3 } ] },{ "_id": { "$oid": "6289e26430540f2e5db55f3e" }, "username": "tmp_user_1", "attributes": [ { "key": "a", "value": 1 }, { "key": "b", "value": 2 } ] }]
(
    spark.read.format("mongodb")
    .option("database", database)
    .option("collection", collection)
    .option("connection.uri", connection_uri)
    .load()
    .display()
)
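
For anyone reproducing this locally, the collection can be seeded with pymongo along these lines (a sketch; the connection URI, database, and collection names here are assumptions, not from the original report):

# Seed the test collection with the four documents above,
# including one with an empty array and one with a null array.
from pymongo import MongoClient
from bson import ObjectId

client = MongoClient("mongodb://localhost:27017")  # assumed URI
coll = client["test_db"]["users"]                  # assumed names

coll.insert_many([
    {"_id": ObjectId("6289e26430540f2e5db55f3c"), "username": "tmp_user_3",
     "attributes": []},
    {"_id": ObjectId("6289e26430540f2e5db55f3f"), "username": "tmp_user_4",
     "attributes": None},
    {"_id": ObjectId("6289e26430540f2e5db55f3d"), "username": "tmp_user_2",
     "attributes": [{"key": "c", "value": 3}]},
    {"_id": ObjectId("6289e26430540f2e5db55f3e"), "username": "tmp_user_1",
     "attributes": [{"key": "a", "value": 1}, {"key": "b", "value": 2}]},
])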
What did I get
The data can't be read from MongoDB; the job fails with:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 79.0 failed 1 times, most recent failure: Lost task 0.0 in stage 79.0 (TID 304) (ip-10-172-164-192.us-west-2.compute.internal executor driver): com.mongodb.spark.sql.connector.exceptions.DataException: Invalid field: 'attributes'. The dataType 'array' is invalid for 'BsonNull'.
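
The failure is triggered by the tmp_user_4 document, whose attributes field is BsonNull while the inferred schema expects an array. One possible workaround until the fix is available (a sketch, not verified here; it assumes the 10.x connector's aggregation.pipeline read option) is to coalesce null arrays to empty arrays server-side before Spark's converter sees them:

# Sketch of a workaround: rewrite null 'attributes' to [] in MongoDB via
# $ifNull so the connector never encounters BsonNull for the array field.
# 'database', 'collection', and 'connection_uri' are placeholders as in
# the original snippet.
pipeline = '[{"$set": {"attributes": {"$ifNull": ["$attributes", []]}}}]'

(
    spark.read.format("mongodb")
    .option("database", database)
    .option("collection", collection)
    .option("connection.uri", connection_uri)
    .option("aggregation.pipeline", pipeline)
    .load()
    .display()
)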
What do I expect
The DataFrame is displayed, including the rows whose attributes field is null or empty.
Related to: SPARK-351 "array field with null value is not written to mongodb" (Closed)