- Type: Improvement
- Resolution: Done
- Priority: Major - P3
- None
- Affects Version/s: 1.6.3, 1.6.4, 1.6.5, 1.7.0, 1.7.1, 1.7.2, 1.7.3, 1.7.4, 1.7.5
- Component/s: Internal Client, Performance
- Fully Compatible
In the current C++ driver, calling 'insert' with a BSONObj copies the (already contiguous) data in the BSONObj into newly allocated temporary buffers and then discards those buffers. As far as I can see, this happens at least twice:
- Once when calling BSONObj::appendSelfToBufBuilder( b ) in DBClientBase::insert
- Once again when calling Message::setData(int operation, const char *msgdata, size_t len), also in DBClientBase::insert
The Message::setData operation always requires a dynamic allocation, followed by a free once the temporary Message object goes out of scope. The appendSelfToBufBuilder call also requires some number of dynamic allocations to resize the BufBuilder as data is copied in.
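The copy pattern described above can be sketched roughly as follows. This is not the driver's code: `Doc`, `CopyStats`, and `buildMessageByCopying` are hypothetical names, and `std::vector<char>` stands in for both the BufBuilder and the Message's internal buffer. The counters only track the body bytes and the two temporary buffers, so they are a lower bound on the real cost.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an already serialized, contiguous BSON document.
struct Doc {
    const char* data;
    std::size_t size;
};

// Hypothetical counters for the extra work done on the copy-heavy path.
struct CopyStats {
    std::size_t temporaryBuffers = 0;  // heap buffers created just for this send
    std::size_t bytesCopied = 0;       // body bytes copied (header not counted)
};

// Mimics the two-copy path: the body is appended to a growable builder,
// then the builder's contents are copied again into a message buffer.
std::vector<char> buildMessageByCopying(const Doc& doc,
                                        const char* header, std::size_t headerLen,
                                        CopyStats& stats) {
    assert(doc.data != nullptr && header != nullptr);

    // Copy 1: body into the builder (stand-in for appendSelfToBufBuilder).
    std::vector<char> builder;
    builder.insert(builder.end(), doc.data, doc.data + doc.size);
    stats.temporaryBuffers += 1;
    stats.bytesCopied += doc.size;

    // Copy 2: builder into a freshly allocated message (stand-in for setData).
    std::vector<char> message(headerLen + builder.size());
    stats.temporaryBuffers += 1;
    std::memcpy(message.data(), header, headerLen);
    std::memcpy(message.data() + headerLen, builder.data(), builder.size());
    stats.bytesCopied += builder.size();

    return message;
}
```

In this sketch every body byte is copied twice before anything reaches the socket, which is why the cost scales with the size of the document.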
For large BSONObj objects, this is not optimal. It would be better if the IO strategy here used multiple independent writes to the TCP stream: first the message header, then the BSONObj, and then any trailing data. Another possibility would be to use vector IO and write the header and BSONObj in one go as separate chunks; it looks like there is already some support in Message.cpp for this. For small BSONObjs this probably isn't a big deal, but for larger ones, the overhead of all of this allocating, copying, and freeing is noticeable during profiling.
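As a rough illustration of the vector IO idea, the sketch below uses the POSIX `writev` call to send a header and a body from their existing buffers without concatenating them first. It is not taken from Message.cpp: `writeHeaderAndBody` is a hypothetical helper, and the sketch assumes a blocking file descriptor and a POSIX platform.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <sys/uio.h>  // writev, struct iovec (POSIX)
#include <unistd.h>   // POSIX I/O declarations

// Hypothetical helper: writes header then body to fd as two chunks of a
// single gather write. Returns true once every byte has been written,
// false on a write error. Assumes fd is a blocking descriptor.
bool writeHeaderAndBody(int fd,
                        const void* header, std::size_t headerLen,
                        const void* body, std::size_t bodyLen) {
    assert(fd >= 0);

    iovec chunks[2];
    chunks[0].iov_base = const_cast<void*>(header);  // writev never modifies the data
    chunks[0].iov_len = headerLen;
    chunks[1].iov_base = const_cast<void*>(body);
    chunks[1].iov_len = bodyLen;

    iovec* cur = chunks;
    int remaining = 2;
    while (remaining > 0) {
        ssize_t n = ::writev(fd, cur, remaining);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal, retry
            return false;
        }
        std::size_t written = static_cast<std::size_t>(n);

        // Skip chunks that were written completely.
        while (remaining > 0 && written >= cur->iov_len) {
            written -= cur->iov_len;
            ++cur;
            --remaining;
        }
        // Handle a partial write that stopped inside the current chunk.
        if (remaining > 0) {
            cur->iov_base = static_cast<char*>(cur->iov_base) + written;
            cur->iov_len -= written;
        }
    }
    return true;
}
```

The body pointer could be the BSONObj's own contiguous data, so no temporary buffer is allocated and the document bytes are not copied in user space before the kernel takes them.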