I need to store documents in MongoDB that contain lists with null values, like:
[
  {"id": 1, "type": "good record", "attributes": {"params": "anyparam", "liste": ["a", "b", "c"]}},
  {"id": 2, "type": "bad record", "attributes": {"params": "anyparam", "liste": ["d", null, null, "e"]}}
]
When I try to save them, I get the following exception:
Caused by: com.mongodb.spark.exceptions.MongoTypeConversionException: Cannot cast null into a StringType
    at com.mongodb.spark.sql.MapFunctions$$anonfun$com$mongodb$spark$sql$MapFunctions$$wrappedDataTypeToBsonValueMapper$1.apply(MapFunctions.scala:87)
    at com.mongodb.spark.sql.MapFunctions$$anonfun$com$mongodb$spark$sql$MapFunctions$$wrappedDataTypeToBsonValueMapper$1.apply(MapFunctions.scala:83)
    at com.mongodb.spark.sql.MapFunctions$$anonfun$12$$anonfun$apply$9.apply(MapFunctions.scala:158)
    at com.mongodb.spark.sql.MapFunctions$$anonfun$12$$anonfun$apply$9.apply(MapFunctions.scala:158)
I start with a .map() that applies json.dumps to convert the data to valid JSON. At this step, the values in my RDD are correct, but the MongoDB connector needs a DataFrame to write to the database.
Moving the data to a DataFrame gives me a compliant schema, and the JSON null values are converted back to Python None values.
Finally, when I attempt to write the data, the driver seems to misinterpret this dataset and fails on the null elements inside the list.
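
To make the data shape concrete, here is a minimal sketch of the serialization steps described above, using only the standard json module. The Spark and MongoDB parts are left out, and the helper name has_null_in_list is hypothetical, added only to point at the records that carry null list elements.

```python
import json

# Sample records from above; Python None stands in for JSON null.
records = [
    {"id": 1, "type": "good record",
     "attributes": {"params": "anyparam", "liste": ["a", "b", "c"]}},
    {"id": 2, "type": "bad record",
     "attributes": {"params": "anyparam", "liste": ["d", None, None, "e"]}},
]

# Step 1: serialize each record, as the .map() with json.dumps does on the RDD.
json_lines = [json.dumps(r) for r in records]

# Step 2: parsing the JSON again turns every JSON null back into None,
# which mirrors what I see once the data is in a DataFrame.
parsed = [json.loads(line) for line in json_lines]


def has_null_in_list(value):
    """Hypothetical helper: True if any nested list contains a None element."""
    if isinstance(value, dict):
        return any(has_null_in_list(v) for v in value.values())
    if isinstance(value, list):
        return any(v is None or has_null_in_list(v) for v in value)
    return False


# Print which records contain null list elements.
for rec in parsed:
    print(rec["id"], has_null_in_list(rec))
```

Only the record with id 2 is flagged, which is the one whose "liste" array holds the null values named in the exception.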