[Bug] Unable to run example notebook: pubmed-bm25.ipynb
Is this a new bug?
- [X] I believe this is a new bug
- [X] I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
0%| | 0/32 [00:00<?, ?it/s]
---------------------------------------------------------------------------
SparseValuesMissingKeysError Traceback (most recent call last)
<ipython-input-22-8f2be8886c89> in <cell line: 5>()
35 # new_vectors = { 'sparse_values': {'indices': indices, 'values': values}}
36 # index.upsert(vectors=new_vectors)
---> 37 index.upsert(vectors=vectors)
38
39 # show index description after uploading the documents
6 frames
/usr/local/lib/python3.10/dist-packages/pinecone/data/vector_factory.py in _dict_to_sparse_values(sparse_values_dict, check_type)
104 raise SparseValuesDictionaryExpectedError(sparse_values_dict)
105 if not {"indices", "values"}.issubset(sparse_values_dict):
--> 106 raise SparseValuesMissingKeysError(sparse_values_dict)
107
108 indices = convert_to_list(sparse_values_dict.get("indices"))
SparseValuesMissingKeysError: Missing required keys in data in column `sparse_values`. Expected format is `'sparse_values': {'indices': List[int], 'values': List[float]}`. Found keys [16984, 3526, 2331, 1006, 7473, 2094, 1007, 2003, 1996, 12222, 1997, 4442, 2306, 2019, 15923, 1012, 12922, 3269, 9706, 17175, 18150, 2239, 11934, 27806, 7137, 2566, 29278, 10708, 1999, 2049, 3727, 2083, 8676, 1037, 17779, 6198, 20134, 1998, 18323, 9607, 4372, 20464, 18606, 2024, 29111, 5158, 2012, 2415, 2122, 22901, 15436, 2015, 1010, 7458, 3155, 2274, 2013, 12436, 28817,
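The check at line 105 of the traceback requires each `sparse_values` dict to contain the keys `indices` and `values`. The integer "Found keys" in the message suggest the notebook passed a mapping keyed by token ID instead, although that is an assumption and not confirmed here. A minimal sketch of a workaround under that assumption follows; `to_sparse_values` is a hypothetical helper, not part of the Pinecone client:

```python
from typing import Dict, List, Mapping, Union


def to_sparse_values(sparse: Mapping) -> Dict[str, Union[List[int], List[float]]]:
    """Hypothetical helper: coerce a sparse vector into the shape named in the
    error message, {'indices': List[int], 'values': List[float]}.

    Assumes a non-conforming input is a {token_id: weight} mapping.
    """
    # Already in the expected shape: only normalize the element types.
    if {"indices", "values"}.issubset(sparse):
        return {
            "indices": [int(i) for i in sparse["indices"]],
            "values": [float(v) for v in sparse["values"]],
        }
    # Otherwise treat the keys as indices and the mapped numbers as values.
    return {
        "indices": [int(k) for k in sparse],
        "values": [float(v) for v in sparse.values()],
    }


# Example with two of the token IDs listed in the error message (weights made up).
print(to_sparse_values({16984: 2, 3526: 1}))
```

Applying such a conversion to each vector's `sparse_values` before calling `index.upsert(vectors=vectors)` should get past this particular check, provided the assumption about the input shape holds.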
Expected Behavior
The example notebooks should run without errors.
Steps To Reproduce
- Run https://github.com/pinecone-io/examples/blob/master/learn/search/hybrid-search/fast-intro/pubmed-bm25.ipynb in Colab.
- Execute the cells in order until the `index.upsert(vectors=vectors)` call raises the error.
Relevant log output
0%| | 0/32 [00:00<?, ?it/s]
---------------------------------------------------------------------------
SparseValuesMissingKeysError Traceback (most recent call last)
<ipython-input-22-8f2be8886c89> in <cell line: 5>()
35 # new_vectors = { 'sparse_values': {'indices': indices, 'values': values}}
36 # index.upsert(vectors=new_vectors)
---> 37 index.upsert(vectors=vectors)
38
39 # show index description after uploading the documents
6 frames
/usr/local/lib/python3.10/dist-packages/pinecone/data/vector_factory.py in _dict_to_sparse_values(sparse_values_dict, check_type)
104 raise SparseValuesDictionaryExpectedError(sparse_values_dict)
105 if not {"indices", "values"}.issubset(sparse_values_dict):
--> 106 raise SparseValuesMissingKeysError(sparse_values_dict)
107
108 indices = convert_to_list(sparse_values_dict.get("indices"))
SparseValuesMissingKeysError: Missing required keys in data in column `sparse_values`. Expected format is `'sparse_values': {'indices': List[int], 'values': List[float]}`. Found keys [16984, 3526, 2331, 1006, 7473, 2094, 1007, 2003, 1996, 12222, 1997, 4442, 2306, 2019, 15923, 1012, 12922, 3269, 9706, 17175, 18150, 2239, 11934, 27806, 7137, 2566, 29278, 10708, 1999, 2049, 3727, 2083, 8676, 1037, 17779, 6198, 20134,
Environment
- **OS**: Google Colab
- **Language version**: Python
- **Pinecone client version**: default
Additional Context
No response
Tried again in a local Python 3.11 venv and hit the same error:
0%| | 0/32 [00:01<?, ?it/s]
---------------------------------------------------------------------------
SparseValuesMissingKeysError Traceback (most recent call last)
Cell In[16], line 30
22 vectors.append({
23 'id': _id,
24 'sparse_values': sparse,
25 'values': dense,
26 'metadata': metadata
27 })
29 # upload the documents to the new hybrid index
---> 30 index.upsert(vectors=vectors)
32 # show index description after uploading the documents
33 index.describe_index_stats()
File ~/workspace/third-party/pinecone/examples/venv/lib/python3.11/site-packages/pinecone/utils/error_handling.py:10, in validate_and_convert_errors.<locals>.inner_func(*args, **kwargs)
7 @wraps(func)
8 def inner_func(*args, **kwargs):
9 try:
---> 10 return func(*args, **kwargs)
11 except MaxRetryError as e:
12 if isinstance(e.reason, ProtocolError):
File ~/workspace/third-party/pinecone/examples/venv/lib/python3.11/site-packages/pinecone/data/index.py:171, in Index.upsert(self, vectors, namespace, batch_size, show_progress, **kwargs)
164 raise ValueError(
165 "async_req is not supported when batch_size is provided."
166 "To upsert in parallel, please follow: "
167 "https://docs.pinecone.io/docs/insert-data#sending-upserts-in-parallel"
168 )
170 if batch_size is None:
--> 171 return self._upsert_batch(vectors, namespace, _check_type, **kwargs)
173 if not isinstance(batch_size, int) or batch_size <= 0:
174 raise ValueError("batch_size must be a positive integer")