- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 2.3.1
- Component/s: Configuration
- Environment: Mongo 3.6, Spark 2.3.1
Loading a DataFrame with an externally configured MongoClientFactory still uses the DefaultMongoClientFactory for parts of the operation, e.g.
val ms = MongoSpark.builder()
  .sparkSession(sparkSession)
  .connector(new MongoConnector(ExternalMongoClientFactory))
  .readConfig(SomeReadConfigIncludingTheURI)
  .build()
seems to use the ExternalMongoClientFactory for schema inference (as expected), but on an actual load with .toDF() the internals appear to set up a new connector with new MongoConnector(DefaultMongoClientFactory(options)) (not expected). The DefaultMongoClientFactory does not appear to support all of the MongoClient options.
This becomes especially apparent when using an ExternalMongoClientFactory that creates MongoClients with client-specific settings, such as a socketFactory setting up TLS with a PKI/CA shared between mongo-spark and MongoDB.
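For reference, a minimal sketch of what such an external factory could look like; the object name, host, and TLS initialisation below are assumptions for illustration, not the actual implementation:

import javax.net.ssl.SSLContext
import com.mongodb.{MongoClient, MongoClientOptions, ServerAddress}
import com.mongodb.spark.MongoClientFactory

// Hypothetical external factory: builds MongoClients with TLS backed by the shared PKI/CA.
object ExternalMongoClientFactory extends MongoClientFactory {
  override def create(): MongoClient = {
    // Placeholder: in practice this would be an SSLContext initialised from the shared PKI/CA.
    val sslContext: SSLContext = SSLContext.getDefault
    val options = MongoClientOptions.builder()
      .sslEnabled(true)
      .sslContext(sslContext)
      .build()
    new MongoClient(new ServerAddress("mongodb-host", 27017), options)
  }
}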
The expected behaviour is to be able to use all of the MongoClient options in MongoSpark as well.
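Concretely, with the builder shown above, the expectation is that the subsequent load also goes through the externally configured factory:

// Expected: uses ExternalMongoClientFactory; observed: falls back to DefaultMongoClientFactory(options).
val df = ms.toDF()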