-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Minor - P4
-
None
-
Affects Version/s: None
-
Component/s: JavaScript, Querying, Shell
-
None
-
Query Optimization
It's sometimes desirable to take the results of a find() (or anything else that returns a cursor, such as aggregation) and store the resulting documents somewhere, e.g. in another collection, or in a JSON or BSON file (à la SERVER-12624).
The idea is that while working interactively in the shell, once you find a query that works well, you can save the results (for use by some other tool) quickly and easily by appending .saveTo({ ns: "db.coll" }) or .saveTo({ file: "output.json" }) to the end of the line, e.g.:
db.foo.find( { something: "value" }, { something: 1, interesting: 1 } ).limit(5000).saveTo({ db: "some", collection: "where" })
db.foo.find( { something: "value" }, { something: 1, interesting: 1 } ).limit(5000).saveTo({ file: "sample.json" })
A naive client-side JS implementation might be something like:
DBQuery.prototype.saveTo = function(target) {
    if (target.db || target.collection || target.ns) {
        // Resolve the destination collection from whichever fields were given.
        var t;
        if (target.db && target.collection) {
            t = this._mongo.getDB(target.db).getCollection(target.collection);
        } else if (target.collection) {
            t = this._db.getCollection(target.collection);
        } else if (target.ns) {
            t = this._mongo.getCollection(target.ns);
        } else {
            throw Error("saveTo: target.db requires target.collection");
        }
        while (this.hasNext())
            t.insert(this.next(), target.options, target._allow_dot);
    } else if (target.file) {
        // Infer the output type from the file extension if not given.
        if (target.type === undefined) {
            if (target.file.endsWith(".json")) {
                target.type = "json";
            } else if (target.file.endsWith(".bson")) {
                target.type = "bson";
            }
        }
        if (target.type == "bson") {
            // SERVER-12624
            this.dump(target.file);
        } else if (target.type == "json") {
            // Left undefined unless requested, so tojson() uses its default.
            var oneline;
            if (target.pretty) {
                oneline = false;
            }
            if (target.oneline) {
                oneline = true;
            }
            // needs fprint() (SERVER-14880)
            while (this.hasNext())
                fprint(target.file, tojson(this.next(), "", oneline));
            fclose(target.file);
        }
    }
};
This might be good enough to start with. Doing this server-side (to eliminate the network traffic) would be possible by using db.eval with nolock.
Further improvements might include:
- using bulk inserts to insert a full cursor batch at a time
- server-side support for an $out parameter for finds (like $out for Map-Reduce and Aggregation). Instead of returning the cursor to the client, the server would internally iterate over the cursor, insert the results into the specified collection, and return the status of this procedure. In this case, the shell saveTo() implementation would reduce to calling _addSpecial, and it could appear anywhere in the function chain rather than only at the end. The file-based output would probably remain client-side, though.
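The two improvements above could be sketched roughly as follows. This is only an illustration, not a proposed patch: saveInBatches and findAsOutPipeline are hypothetical helper names, the default batch size of 1000 is an arbitrary assumption (it does not track the actual cursor batch), and the target is assumed to expose insertMany(), which the shell provides from 3.2 onward (the legacy insert() also accepts an array). The second helper only builds an aggregation pipeline that approximates find(...).limit(...) followed by $out, since find itself has no $out today.

```javascript
// Sketch only. `cursor` is anything with hasNext()/next() (e.g. a shell DBQuery);
// `target` is anything with insertMany(docs) (an assumption, see lead-in).
function saveInBatches(cursor, target, batchSize) {
    batchSize = batchSize || 1000; // hypothetical default, not from the ticket
    var batch = [];
    var total = 0;
    while (cursor.hasNext()) {
        batch.push(cursor.next());
        if (batch.length >= batchSize) {
            target.insertMany(batch); // one round trip per batch
            total += batch.length;
            batch = [];
        }
    }
    if (batch.length > 0) { // flush the final partial batch
        target.insertMany(batch);
        total += batch.length;
    }
    return total; // number of documents handed to insertMany
}

// Builds a pipeline roughly equivalent to
// find(filter, projection).limit(limit) with the results written by $out.
function findAsOutPipeline(filter, projection, limit, outCollection) {
    var pipeline = [{ $match: filter || {} }];
    if (limit) {
        pipeline.push({ $limit: limit });
    }
    if (projection) {
        pipeline.push({ $project: projection });
    }
    pipeline.push({ $out: outCollection }); // $out must be the last stage
    return pipeline;
}
```

In the shell this might be used as saveInBatches(db.foo.find({ something: "value" }).limit(5000), db.getSiblingDB("some").where), or as db.foo.aggregate(findAsOutPipeline({ something: "value" }, { something: 1, interesting: 1 }, 5000, "where")). Note that the aggregation $out stage replaces the contents of the output collection, whereas the insert-based approach appends to it.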
- duplicates
-
SERVER-14880 Ability to output to file from mongo shell
- Backlog
-
SERVER-12624 Support writing to (bson) files from shell
- Closed