Spark Connector / SPARK-241

Ignore duplicates on Save

    • Type: Task
    • Resolution: Done
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: Writes

      I have a CSV file that I need to save to MongoDB. My collection already has some data (a couple million documents), and I need to save only the new records to the database, ignoring the ones that are already in the collection.

      How can I do that? I already have the code below:

       

      // Write overrides: target collection; replaceDocument=false updates only
      // the fields present in the Dataset instead of replacing whole documents;
      // ordered=false lets the batch continue past individual write failures.
      Map<String, String> writeOverrides = new HashMap<String, String>();
      writeOverrides.put("collection", this.collection);
      writeOverrides.put("replaceDocument", "false");
      writeOverrides.put("ordered", "false");
      WriteConfig writeConfig = WriteConfig.create(getJavaSparkContext()).withOptions(writeOverrides);
      MongoSpark.save(ds.write().mode(SaveMode.Ignore), writeConfig);
      

       

       

      I've already tried all the SaveModes and none of them worked the way I need.

       

      PS: I'm only using _id as the index.
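      One common workaround (a sketch, not from this ticket): read the _ids that already exist in the collection, subtract them from the incoming batch, and save only the remainder; in Spark terms this is a left-anti join on _id before calling MongoSpark.save. The minimal self-contained Java illustration below shows just the diff step, with a HashSet standing in for the collection's existing _ids; the class name DedupSketch and method filterNew are hypothetical, for illustration only.

      ```java
      import java.util.*;

      public class DedupSketch {
          // Return only the candidate _ids that are not already in the collection.
          // In a real job, existingIds would be loaded from MongoDB (e.g. a
          // projection on _id); here a plain HashSet stands in for it.
          static List<String> filterNew(Set<String> existingIds, List<String> candidateIds) {
              List<String> fresh = new ArrayList<>();
              for (String id : candidateIds) {
                  if (!existingIds.contains(id)) {
                      fresh.add(id);
                  }
              }
              return fresh;
          }

          public static void main(String[] args) {
              Set<String> existing = new HashSet<>(Arrays.asList("a", "b"));
              List<String> incoming = Arrays.asList("a", "c", "d");
              System.out.println(filterNew(existing, incoming)); // prints [c, d]
          }
      }
      ```

      The same idea scales in Spark: join the CSV Dataset against the collection's _ids with a "left_anti" join, then save only the surviving rows, so existing documents are never touched.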

            Assignee:
            ross@mongodb.com Ross Lawley
            Reporter:
            pedro.dib Pedro Dib
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: