I am implementing a client-side version of `mongoexport` in the R driver. It uses `bson_as_json` to convert BSON records to JSON lines, which are then streamed to a file or connection. It works, but the BSON-to-JSON conversion is suboptimal. The current output looks like this:
{ "_id" : { "$oid" : "5555102760f0cc03b65c8331" }, "Sepal.Length" : 5.100000, "Sepal.Width" : 3.500000, "Petal.Length" : 1.400000, "Petal.Width" : 0.200000, "Species" : "setosa" }
{ "_id" : { "$oid" : "5555102760f0cc03b65c8332" }, "Sepal.Length" : 4.900000, "Sepal.Width" : 3, "Petal.Length" : 1.400000, "Petal.Width" : 0.200000, "Species" : "setosa" }
{ "_id" : { "$oid" : "5555102760f0cc03b65c8333" }, "Sepal.Length" : 4.700000, "Sepal.Width" : 3.200000, "Petal.Length" : 1.300000, "Petal.Width" : 0.200000, "Species" : "setosa" }
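There are at least two issues. First, the output contains unnecessary whitespace around braces, colons, and commas, which inflates the size of the export. The bigger issue is the number formatting. It seems that libbson prints doubles with a fixed number of decimal digits (six in the output above), which results in trailing zeros for most values and loss of precision for small numbers.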
By comparison, the real `mongoexport` utility outputs this for the same data:
{"Petal.Length":1.4,"Petal.Width":0.2,"Sepal.Length":5.1,"Sepal.Width":3.5,"Species":"setosa","_id":{"$oid":"5555102760f0cc03b65c8331"}}
{"Petal.Length":1.4,"Petal.Width":0.2,"Sepal.Length":4.9,"Sepal.Width":NumberInt(3),"Species":"setosa","_id":{"$oid":"5555102760f0cc03b65c8332"}}
{"Petal.Length":1.3,"Petal.Width":0.2,"Sepal.Length":4.7,"Sepal.Width":3.2,"Species":"setosa","_id":{"$oid":"5555102760f0cc03b65c8333"}}
Ideally, the output of `bson_as_json` would be identical to that of `mongoexport`, but I understand yajl might have its limitations.
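To make the number-formatting request concrete, here is a minimal sketch of one possible approach: print each double with the fewest significant digits that still parse back to the same value. This is not libbson's or mongoexport's actual implementation; the helper name `format_double_shortest` is hypothetical and uses only standard C (`snprintf`, `strtod`).

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of libbson.
 * Writes `value` into `buf` using the smallest "%.<p>g" precision
 * (1..17) whose text parses back to exactly the same double.
 * 17 significant digits are always enough for an IEEE 754 double.
 * Returns the number of characters written, or -1 if `buf` is too small.
 * Non-finite values (NaN, Inf) are not valid JSON numbers and would
 * need separate handling by the caller.
 */
static int format_double_shortest(double value, char *buf, size_t len)
{
    int precision;
    int written = 0;

    for (precision = 1; precision <= 17; precision++) {
        written = snprintf(buf, len, "%.*g", precision, value);
        if (written < 0 || (size_t)written >= len) {
            return -1; /* encoding error or output truncated */
        }
        if (strtod(buf, NULL) == value) {
            break; /* shortest representation that round-trips */
        }
    }
    return written;
}
```

With this approach 5.1 would be written as 5.1 rather than 5.100000, and small values keep their significant digits instead of being cut off after a fixed number of decimals. One caveat: a whole-valued double such as 3.0 comes out as 3, so a real implementation would have to decide how to keep doubles distinguishable from integers.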
Related to:
- CDRIVER-2063 JSON export prints insignificant digits / noise (Closed)
- CDRIVER-3377 Double value retrieved from bson is different than expected (Closed)