Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: 2.2.29
Component/s: MongoDB 3.4
Environment: Mongo server v3.4.5, Lib mongodb v2.2.29, Node v8.1.2
When trying to insert millions of entries into a collection, the JavaScript heap runs out of memory.
I have an array of ~3,500,000 similar mongo objects to insert.
My initial attempt was to do the following:
await myCollection.insertMany(objectsToCreate);
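(For reference, a chunked variant of this call, which is the kind of "smaller batches" approach discussed further down, would look roughly like the sketch below; the batch size of 10000 and the { ordered: false } option are arbitrary choices on my part, not something required by the driver:)
// Sketch only: insert the array in fixed-size slices instead of one giant call
const BATCH_SIZE = 10000; // arbitrary
for (let start = 0; start < objectsToCreate.length; start += BATCH_SIZE) {
  const batch = objectsToCreate.slice(start, start + BATCH_SIZE);
  await myCollection.insertMany(batch, { ordered: false });
}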
The plain insertMany call took so long that I went away and tried the bulk operator instead:
const bulk = myCollection.initializeOrderedBulkOp();
for (let i = 0, len = objectsToCreate.length; i < len; i++) {
  bulk.insert(objectsToCreate[i]);
}
await bulk.execute();
That hit the same problem as before. Since I had read that splitting the work into smaller batches could solve the problem, I tried that as well, without success.
Here is my final attempt with its output below:
let bulk;
let timer = (new Date()).getTime() / 1000;
for (let i = 0, len = objectsToCreate.length; i < len; i++) {
  // Every 10000 documents, execute the previous batch and start a new unordered bulk
  if (i % 10000 === 0) {
    if (i !== 0) {
      await bulk.execute({w: 0});
    }
    bulk = myCollection.initializeUnorderedBulkOp();
  }
  // Log how long each block of 100000 inserts took
  if (i % 100000 === 0 && i !== 0) {
    console.log(i + ' took: ' + ((new Date()).getTime() / 1000 - timer));
    timer = (new Date()).getTime() / 1000;
  }
  bulk.insert(objectsToCreate[i]);
}
// Flush the last (possibly partial) batch
await bulk.execute({w: 0});
100000 took: 3.563999891281128
200000 took: 2.8480000495910645
300000 took: 2.7950000762939453
400000 took: 2.691999912261963
500000 took: 3.384999990463257
600000 took: 2.8000001907348633
700000 took: 2.7659997940063477
800000 took: 3.4050002098083496
900000 took: 2.7119998931884766
1000000 took: 2.7929999828338623
1100000 took: 3.5450000762939453
1200000 took: 2.7890000343322754
1300000 took: 3.5269999504089355
1400000 took: 2.763000011444092
1500000 took: 3.674999952316284
1600000 took: 2.749000072479248
1700000 took: 3.615000009536743
1800000 took: 3.734999895095825
1900000 took: 3.746000051498413
2000000 took: 3.806999921798706
2100000 took: 3.8530001640319824
2200000 took: 3.806999921798706
2300000 took: 5.950000047683716
2400000 took: 24.04800009727478
2500000 took: 145.318999767303467
<--- Last few GCs --->
[13137:0x424d5c0] 176728 ms: Mark-sweep 1418.5 (1477.6) -> 1418.6 (1450.1) MB, 4820.6 / 0.0 ms (+ 0.0 ms in 0 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 4821 ms) last resort
[13137:0x424d5c0] 181517 ms: Mark-sweep 1418.6 (1450.1) -> 1418.3 (1450.1) MB, 4788.8 / 0.0 ms last resort
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x67acea9891 <JS Object>
1: serializeInto(aka serializeInto) [/home/admin/parser/node_modules/bson/lib/bson/parser/serializer.js:~574] [pc=0x292d9e25d587](this=0x276096182311 <undefined>,buffer=0x22925dad38e9 <an Uint8Array with map 0x1586b0f315e9>,object=0x4579b44f11 <an Object with map 0x38fe34692a59>,checkKeys=0x2760961823b1 <true>,startingIndex=64657,depth=2,serializeFunctions=0x276096182421 <false>,ignore...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: node::Abort() [node]
2: 0x13d443c [node]
3: v8::Utils::ReportOOMFailure(char const*, bool) [node]
4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [node]
5: v8::internal::Factory::NewFixedArray(int, v8::internal::PretenureFlag) [node]
6: v8::internal::LoadIC::LoadNonExistent(v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>) [node]
7: v8::internal::LoadIC::UpdateCaches(v8::internal::LookupIterator*) [node]
8: v8::internal::LoadIC::Load(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Name>) [node]
9: v8::internal::Runtime_LoadIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*) [node]
10: 0x292d9de8437d
Aborted
Note that this code was executed on a server with 8 GB of RAM, and the machine itself did not run out of memory during the process (there was still 3 GB available at the moment of the crash).
My mongo connection was initialized with all the default values, and the database runs on the same machine that executes the script.
During execution, the node process used more and more RAM (up to 2 GB), which I think should not happen since I don't store anything between iterations.
If you have any clue on how to solve this issue, it would really help. Note that I don't need the output of the insertion; I just need to populate the database...
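For what it's worth, this is the kind of constant-memory ingestion loop I would expect to be able to write. It is only a sketch: makeDocument() is a hypothetical stand-in for my own object creation, and the batch size is arbitrary; the point is just that each batch is built on the fly rather than held in one 3.5M-element array.
// Sketch only: generate each batch lazily so completed batches can be garbage-collected
const BATCH_SIZE = 10000;  // arbitrary
const TOTAL = 3500000;     // roughly the number of documents I need
for (let start = 0; start < TOTAL; start += BATCH_SIZE) {
  const batch = [];
  for (let i = start; i < Math.min(start + BATCH_SIZE, TOTAL); i++) {
    batch.push(makeDocument(i)); // hypothetical document factory
  }
  await myCollection.insertMany(batch, { ordered: false });
}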