Context
Currently (assuming this PR is merged as-is), users must decode Vector data (BSON Binary subtype 9) into the bson.Vector type. To manipulate the numerical data, they must then perform an extra step to extract it. We generally assume that users know which type of Vector data is stored, in which case the extra step is unnecessary. For "int8" and "float32" vectors, we should give users a shortcut to decode the numerical data directly into a []int8 or []float32.
For example, the following should work:
    var coll *mongo.Collection

    d := bson.D{{"vec", bson.NewVector([]int8{0, 1, 2, 3})}}
    coll.InsertOne(..., d)

    var res struct {
        Vec []int8
    }
    coll.FindOne(...).Decode(&res)
For the PACKED_BIT type, the data consists of both a bit array and a padding value, so bson.Vector is still the best type to decode into, despite the additional step required to get to the data. It's not yet clear which Vector types will be most widely used.
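To illustrate why a plain slice is not enough for PACKED_BIT, here is a minimal standard-library sketch. It assumes the padding value counts the trailing bits of the final byte that are to be ignored and must be between 0 and 7. The packedBitLen helper is hypothetical and not part of the driver.

```go
package main

import (
	"errors"
	"fmt"
)

// packedBitLen is a hypothetical helper (not a driver API). It returns the
// number of meaningful bits in a PACKED_BIT vector, assuming padding is the
// count of trailing bits in the final byte that must be ignored.
func packedBitLen(data []byte, padding uint8) (int, error) {
	if padding > 7 {
		return 0, errors.New("padding must be between 0 and 7")
	}
	if len(data) == 0 && padding != 0 {
		return 0, errors.New("empty bit array cannot have padding")
	}
	return len(data)*8 - int(padding), nil
}

func main() {
	// Two bytes with 4 bits of padding: the []byte alone cannot tell us
	// whether the vector holds 16 bits or fewer.
	n, err := packedBitLen([]byte{0xF0, 0xA0}, 4)
	fmt.Println(n, err)
}
```

The same two bytes describe a different vector for each padding value, which is why both fields have to be kept together.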
Definition of Done
- Users must be able to decode int8 Vector data into a struct field of type []int8.
- Users must be able to decode float32 Vector data into a struct field of type []float32.
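The two requirements above could be met by conversions along the following lines. This is a rough standard-library sketch rather than the driver's implementation. It assumes the subtype 9 payload starts with a dtype byte (0x03 for int8, 0x27 for float32) and a padding byte, followed by the element data, with float32 values stored little-endian. The helper names decodeInt8Vector and decodeFloat32Vector are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// dtype bytes assumed from the BSON Binary Vector (subtype 9) layout.
const (
	dtypeInt8    = 0x03
	dtypeFloat32 = 0x27
)

// checkHeader validates the assumed 2-byte header (dtype, padding) and
// returns the element data that follows it.
func checkHeader(payload []byte, dtype byte) ([]byte, error) {
	if len(payload) < 2 {
		return nil, errors.New("vector payload shorter than 2-byte header")
	}
	if payload[0] != dtype {
		return nil, fmt.Errorf("expected dtype 0x%02x, got 0x%02x", dtype, payload[0])
	}
	if payload[1] != 0 {
		return nil, errors.New("padding must be 0 for int8 and float32 vectors")
	}
	return payload[2:], nil
}

// decodeInt8Vector is a hypothetical helper (not a driver API) that turns a
// subtype 9 payload directly into a []int8.
func decodeInt8Vector(payload []byte) ([]int8, error) {
	data, err := checkHeader(payload, dtypeInt8)
	if err != nil {
		return nil, err
	}
	out := make([]int8, len(data))
	for i, b := range data {
		out[i] = int8(b) // reinterpret each byte as a signed value
	}
	return out, nil
}

// decodeFloat32Vector is a hypothetical helper (not a driver API) that turns
// a subtype 9 payload directly into a []float32.
func decodeFloat32Vector(payload []byte) ([]float32, error) {
	data, err := checkHeader(payload, dtypeFloat32)
	if err != nil {
		return nil, err
	}
	if len(data)%4 != 0 {
		return nil, errors.New("float32 vector data length is not a multiple of 4")
	}
	out := make([]float32, len(data)/4)
	for i := range out {
		bits := binary.LittleEndian.Uint32(data[4*i:])
		out[i] = math.Float32frombits(bits)
	}
	return out, nil
}

func main() {
	ints, err := decodeInt8Vector([]byte{0x03, 0x00, 0x00, 0x01, 0x02, 0xff})
	fmt.Println(ints, err)

	floats, err := decodeFloat32Vector([]byte{0x27, 0x00, 0x00, 0x00, 0x80, 0x3f})
	fmt.Println(floats, err)
}
```

Returning an error on a dtype mismatch, rather than coercing between element types, keeps the shortcut from silently losing data when a field's declared type does not match what is stored.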
Pitfalls
- Our understanding of how customers will use Vector data in Go applications is limited. This suggested improvement may seem more or less useful as our understanding increases.
- We currently assume that "int8" and "float32" vector data have at least as many use cases as PACKED_BIT data. However, it's possible that we're wrong and PACKED_BIT will get far more use, making this suggested improvement much less useful.
- The Vector data format may change in a way that makes []int8 and []float32 unable to represent the Vector without losing data.