mongo-arrow

MongoDB integrations for Apache Arrow. Export MongoDB documents to NumPy arrays, Parquet files, and pandas DataFrames in one line of code.

Results 26 mongo-arrow issues

### Function parameters example

```py
def write(collection, tabular, *, exclude_none: bool = False): ...
```

### Usage example

```py
write(collection, df, exclude_none=True)
```

### How

Replacing https://github.com/mongodb-labs/mongo-arrow/blob/main/bindings/python/pymongoarrow/api.py#L390 with

```py
if...
```

enhancement
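
The `exclude_none` request above is truncated in this listing, so the proposed change itself is not visible. As a rough illustration only, the sketch below assumes that `exclude_none=True` would mean omitting any field whose value is `None` from each document before it is written. The helpers `drop_none_fields` and `prepare_rows` are hypothetical names and are not part of pymongoarrow.

```python
from typing import Any, Dict, Iterable, List


def drop_none_fields(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc without None-valued keys (hypothetical helper).

    Nested dicts are cleaned recursively; lists are left untouched.
    """
    cleaned: Dict[str, Any] = {}
    for key, value in doc.items():
        if value is None:
            continue  # assumption: "exclude" means the key is not written at all
        if isinstance(value, dict):
            value = drop_none_fields(value)
        cleaned[key] = value
    return cleaned


def prepare_rows(
    rows: Iterable[Dict[str, Any]], *, exclude_none: bool = False
) -> List[Dict[str, Any]]:
    """Build the documents that would be inserted (hypothetical helper)."""
    return [drop_none_fields(row) if exclude_none else dict(row) for row in rows]


rows = [{"name": "Charlie", "email": None, "address": {"city": None, "zip": "123"}}]
with_nones = prepare_rows(rows)
without_nones = prepare_rows(rows, exclude_none=True)
```

With the default `exclude_none=False` the documents pass through unchanged, which would keep the existing behavior for current callers.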

Goal: Trying to read a mongo document with an embedded object containing an empty array to a pyarrow table, then write it out as a parquet file. Expected result: Parquet...

bug

I'm reproducing a bug in Airflow, using the docker-compose method to run Airflow 2.8.1 with Python 3.11 (https://airflow.apache.org/docs/apache-airflow/2.8.1/howto/docker-compose/index.html#fetching-docker-compose-yaml). I'm creating a requirements.txt with the following packages: ``` pymongo==4.6.1...

documentation

Hi, when I use pymongoarrow.api.aggregate_arrow_all() it seems to return Decimal128 as FixedSizeBinary when [context.finish()](https://github.com/mongodb-labs/mongo-arrow/blob/main/bindings/python/pymongoarrow/context.py#L114) is called. When looking at the code, my assumption is, it stems from [lib.pyx](https://github.com/mongodb-labs/mongo-arrow/blob/main/bindings/python/pymongoarrow/lib.pyx#L763) where `return...

duplicate

... or does zero copy apply only between `arrow->pandas`, and not here in `mongodb->arrow`? In other words, are Arrow data types used in MongoDB?

answered

I have a mongo document which has a list field containing child documents. Pandas data frames [can be nested](https://pandas.pydata.org/docs/user_guide/dsintro.html#dataframe). And PyArrow has `Table` and `RecordBatch` types. I would like to...

linked-to-jira

Hi, thanks again for fixing the bugs in version 1.0.2. Unfortunately, it seems that the new version loads data approximately four or more times slower when there are nested fields in...

linked-to-jira

Hi, I'm facing this issue when trying to load my MongoDB collection into a pandas DataFrame using the `find_pandas_all()` function: `authors_pyarrow = Schema({"_id": ObjectId, "first_name": pyarrow.string(), "last_name": pyarrow.string(), "date_of_birth": datetime})` df...

waiting-for-author

I was trying mongo-arrow to load a dataset from MongoDB. It loads only the selected columns, which saves space, but the DataFrame contains only NaT and None values....

Hi, when using `pymongoarrow.api.aggregate_arrow_all()` it seems to omit columns that would contain only null values.

#### Field "email" with None only

```python
data = [
    {"name": "Charlie", "email": None},
    {"name":...
```
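
The report above does not say why an all-null column goes missing. One plausible explanation, which is an assumption and not confirmed by the source, is that when no schema is supplied the column types have to be inferred from the values, and a field that only ever holds `None` offers no value to infer from. The sketch below shows that idea in plain Python. It is not pymongoarrow's code, `infer_column_types` is a hypothetical helper, and the second sample row is made up.

```python
from typing import Any, Dict, Iterable, Optional


def infer_column_types(docs: Iterable[Dict[str, Any]]) -> Dict[str, Optional[type]]:
    """Map each field to the type of its first non-None value (hypothetical helper).

    A field that is None in every document maps to None, i.e. "no type known".
    """
    inferred: Dict[str, Optional[type]] = {}
    for doc in docs:
        for key, value in doc.items():
            if inferred.get(key) is None and value is not None:
                inferred[key] = type(value)
            else:
                inferred.setdefault(key, None)
    return inferred


# Illustrative documents; the second row is invented for this sketch.
data = [
    {"name": "Charlie", "email": None},
    {"name": "Dana", "email": None},
]

types = infer_column_types(data)
untyped = [name for name, column_type in types.items() if column_type is None]
```

Under this assumption, `untyped` holds the fields a caller would need to declare explicitly, for example through a `Schema`, if they must appear in the result even when empty.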