When applying a filter on a nested field (sub-column) whose name contains a dash, the filter wrongly returns no rows, regardless of the condition.
A similar problem existed before 10.2.0 for any field containing a dash. It was partially fixed in 10.2.0, but still occurs for fields that are not at the root level:
dataset.filter(col("some-field").lt(100));              // Success
dataset.filter(col("main.someField").lt(100));          // Success
dataset.filter(col("main.some-field").lt(100));         // Fails silently (returns an empty dataset even if there are matching rows)
dataset.cache().filter(col("main.some-field").lt(100)); // Success, since the filtering is done in Spark rather than in the connector
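The failure mode is consistent with the connector mishandling the dash when it translates the pushed-down filter into a MongoDB query, while Spark's own evaluation (after cache()) resolves the field correctly. A minimal plain-Java sketch of that distinction; the mangling rule, class name, and sample data are hypothetical illustrations, not the connector's actual code:

```java
import java.util.HashMap;
import java.util.Map;

public class DashFieldSketch {
    // Hypothetical stand-in for a loaded row: {"main": {"some-field": 42}}
    static Map<String, Object> doc = new HashMap<>();
    static {
        doc.put("main", Map.of("some-field", 42));
    }

    // Spark-side evaluation: resolve the nested path against the in-memory row.
    static Object resolve(Map<String, Object> d, String path) {
        Object cur = d;
        for (String part : path.split("\\.")) {
            if (!(cur instanceof Map)) return null;
            cur = ((Map<?, ?>) cur).get(part);
            if (cur == null) return null;
        }
        return cur;
    }

    // Hypothetical mangling of the pushed-down field name (the real connector
    // bug may differ): the dash is treated as a separator, so the generated
    // MongoDB query references a non-existent field and matches nothing.
    static Object resolveMangled(Map<String, Object> d, String path) {
        return resolve(d, path.split("-")[0]);   // "main.some-field" -> "main.some"
    }

    public static void main(String[] args) {
        // cache().filter(...): the predicate runs in Spark and sees the value.
        System.out.println(resolve(doc, "main.some-field"));        // 42
        // A pushed-down predicate with a mangled name matches no document,
        // so the dataset comes back empty even though rows match.
        System.out.println(resolveMangled(doc, "main.some-field")); // null
    }
}
```

This is also why the cache() workaround succeeds: caching materializes the rows in Spark first, so the predicate is evaluated by Spark instead of being pushed down to the connector.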
Is related to: SPARK-393 "Broken filter on column read from a MongoDB field with dash characters" (Closed)