pinecone-datasets issues

[Bug] Unable to load yfcc-10M-filter-euclidean dataset

1

### Is this a new bug? - [X] I believe this is a new bug - [X] I have searched the existing issues, and I could not find an existing...

yudhiesh

bug

[Bug] HttpError : Invalid bucket name: 'wikipedia-simple-text-embedding-ada-002-100K', 400

7

### Is this a new bug? - [X] I believe this is a new bug - [X] I have searched the existing issues, and I could not find an existing...

David-GERARD

bug

Speedup list_datasets() by 2.5x

## Problem Construction of the Catalog object currently takes ~7.1s to complete. This is significant as both list_datasets() and load_dataset() require the construction of a Catalog object; so essentially _any_...

daverigby

Add dataset_validation tests

2

## Problem We have at least one dataset which has inconsistencies - `langchain-python-docs-text-embedding-ada-002` has an extra duplicated .parquet file which means the dataset ends up with 2x the number of...

daverigby

fix typo

2

## Problem fix typo ## Solution changed spelling ## Type of Change - [ ] Bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change...

junefish

[Feature] Add asyncio support

### Is this your first time submitting a feature request? - [X] I have searched the existing issues, and I could not find an existing issue for this feature -...

izellevy

enhancement

[Bug] CI workflow for PR branches always fails

### Is this a new bug? - [X] I believe this is a new bug - [X] I have searched the existing issues, and I could not find an existing...

igiloh-pinecone

bug

TESTING TESTING

Checking if PR CI workflow fails without any code changes

igiloh-pinecone

Bug fix - wrong behaviour for should_create_index

## Problem The `should_create_index` was implemented like `do_create_index` instead of behaving like `allow_existing_index`, which is the original purpose. When the user sets `should_create_index=False`, they don't necessarily mean "**don't** create...

igiloh-pinecone

to_index() should always use gRPC for bulk upserts

@miararoy my bad, I missed this in #22. This code should never have been merged - it breaks one of the key principles behind `pinecone-datasets` ## Problem One of...

igiloh-pinecone
