mongo-arrow
MongoDB integrations for Apache Arrow. Export MongoDB documents to NumPy arrays, Parquet files, and pandas DataFrames in one line of code.
When using `find` methods with a defined schema, values in MongoDB documents that don’t match the specified schema types (e.g., strings in a field expected to be int64) are silently...
When fetching data with `find_polars_all`, `find_pandas_all`, or `find_arrow_all` from `pymongoarrow.api`, the schema is inferred from the first document. If the same key has different data types across documents, it is inferred as...
Hello guys, I am importing a big dataset from MongoDB: `pd_confirmacao_conversao = find_arrow_all(pd_confirmacao_conversao, {'estadoContabilizacaoEvento': { '$lt': 100}})` After that, I've just exported it to a pandas DataFrame `pd_confirmacao_conversao = pd_confirmacao_conversao.to_pandas()`...
Hello guys, I would like to discuss setting the Schema for `find_arrow_all` or `find_pandas_all`. I have a database with several columns, two of them are ObjectIds that are crashing...
We have three MongoDB collections that each hold a permission object, all three slightly different in structure. ``` "permissions": [ { "activity": "never" }, { "pushNotifications": "always", "location": "foreground" } ],...
INTPYTHON-807: reading large amounts of data is rather slow (due to single-threaded BSON decoding)
Hi, over the years we have become quite happy with mongo-arrow and have also contributed some bug fixes. However, there is still one topic left that we would be happy to get...
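Several of the reports above come down to the same situation: a field holds values of more than one type, so a declared or inferred schema does not fit every document. As a rough illustration only, the sketch below checks documents for mixed types before any schema is applied. It is plain Python and does not use pymongoarrow; the helper names `collect_field_types` and `find_mismatches` and the sample documents are hypothetical, and only top-level keys are inspected.

```python
from collections import defaultdict


def collect_field_types(docs):
    """Map each top-level key to the set of Python type names seen across docs."""
    seen = defaultdict(set)
    for doc in docs:
        for key, value in doc.items():
            seen[key].add(type(value).__name__)
    return dict(seen)


def find_mismatches(docs, expected):
    """Yield (index, key, value) for values whose type differs from `expected`.

    `expected` maps field names to Python types, e.g. {"qty": int}.
    Missing keys and None values are skipped rather than reported.
    """
    for index, doc in enumerate(docs):
        for key, expected_type in expected.items():
            value = doc.get(key)
            if value is not None and type(value) is not expected_type:
                yield index, key, value


# Hypothetical sample documents standing in for the result of a query.
docs = [
    {"_id": 1, "qty": 10},
    {"_id": 2, "qty": "ten"},  # a string where an integer is expected
    {"_id": 3, "qty": 7},
]

field_types = collect_field_types(docs)
mismatches = list(find_mismatches(docs, {"qty": int}))

print(field_types)
print(mismatches)
```

Running a check like this on a sample of a collection can show which keys would need an explicit schema entry or a cleanup pass; it says nothing about how the library itself handles such values.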