- Type: New Feature
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Python Drivers
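PyMongoArrow's treatment of the "binary" data type is murky. Binary data is handled as a PyArrow ExtensionType (pymongoarrow.types.BinaryType), which is closest in behavior to PyArrow's FixedSizeBinaryType. This ticket is to add support for pyarrow.binary() and pyarrow.large_binary(). The following session gives a sense of the murkiness: BinaryType(10) takes a size-like argument, yet its repr reports the variable-length DataType(binary).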
pa.binary()
Out[12]: DataType(binary)
pa.binary(12)
Out[13]: FixedSizeBinaryType(fixed_size_binary[12])
pa.large_binary()
Out[14]: DataType(large_binary)
from pymongoarrow.types import BinaryType
BinaryType(10)
Out[16]: BinaryType(DataType(binary))
More concretely, the following attempt to write a pyarrow.Table with DataType(binary) fails.
import pyarrow as pa
from pymongoarrow.api import write
from pymongo import MongoClient

coll = MongoClient().db.coll
aschema = pa.schema([("Binary", pa.binary())])
table_in = pa.Table.from_pydict({"Binary": [b"1", b"one"]}, schema=aschema)
write(coll, table_in)
with the following traceback:
  File "/Users/casey.clements/src/mongo-arrow/bindings/python/pymongoarrow/api.py", line 432, in write
    _validate_schema(tabular.schema.types)
  File "/Users/casey.clements/src/mongo-arrow/bindings/python/pymongoarrow/types.py", line 324, in _validate_schema
    raise ValueError(msg)
ValueError: Unsupported data type "binary" in schema
- is related to
  INTPYTHON-251 Support fixed size binary data type (Backlog)
- related to
  INTPYTHON-52 Add support for BSON binary type (Closed)