bbknn
ValueError: No hyperplanes of adequate size were found! When not using annoy
Hi there,
I'm having an issue when I try to run BBKNN without annoy. I first hit this error in my existing setup, then freshly installed everything in a new conda environment, and I'm still getting the error, which is raised from pynndescent, when I run the code:
bbknn.bbknn(adata,batch_key='batch_name',use_annoy=False,metric='manhattan',neighbors_within_batch=3)
Thanks so much! This package works amazingly for correcting batch-driven compositional problems!!
Full error message below:
122 batch_list = adata.obs[batch_key].values
123 #call BBKNN proper
--> 124 bbknn_out = bbknn_matrix(pca=pca, batch_list=batch_list, approx=approx,
125 use_annoy=use_annoy, metric=params['metric'], **kwargs)
126 #store the parameters in .uns['neighbors']['params'], add use_rep and batch_key
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/bbknn/matrix.py in bbknn(pca, batch_list, neighbors_within_batch, n_pcs, trim, approx, annoy_n_trees, pynndescent_n_neighbors, pynndescent_random_state, use_annoy, use_faiss, metric, set_op_mix_ratio, local_connectivity)
312 params = check_knn_metric(params, counts)
313 #obtain the batch balanced KNN graph
--> 314 knn_distances, knn_indices = get_graph(pca=pca,batch_list=batch_list,params=params)
315 #sort the neighbours so that they're actually in order from closest to furthest
316 newidx = np.argsort(knn_distances,axis=1)
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/bbknn/matrix.py in get_graph(pca, batch_list, params)
173 ind_to = np.arange(len(batch_list))[mask_to]
174 #create the faiss/cKDTree/KDTree/annoy, depending on approx/metric
--> 175 ckd = create_tree(data=pca[mask_to,:params['n_pcs']], params=params)
176 for from_ind in range(len(batches)):
177 #this is the batch that will have its neighbours identified
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/bbknn/matrix.py in create_tree(data, params)
95 n_neighbors=params['pynndescent_n_neighbors'],
96 random_state=params['pynndescent_random_state'])
---> 97 ckd.prepare()
98 elif params['computation'] == 'faiss':
99 ckd = faiss.IndexFlatL2(data.shape[1])
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/pynndescent/pynndescent_.py in prepare(self)
1524 def prepare(self):
1525 if not hasattr(self, "_search_graph"):
-> 1526 self._init_search_graph()
1527 if not hasattr(self, "_search_function"):
1528 if self._is_sparse:
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/pynndescent/pynndescent_.py in _init_search_graph(self)
962 best_trees = [self._rp_forest[idx] for idx in best_tree_indices]
963 del self._rp_forest
--> 964 self._search_forest = [
965 convert_tree_format(tree, self._raw_data.shape[0])
966 for tree in best_trees
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/pynndescent/pynndescent_.py in <listcomp>(.0)
963 del self._rp_forest
964 self._search_forest = [
--> 965 convert_tree_format(tree, self._raw_data.shape[0])
966 for tree in best_trees
967 ]
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/pynndescent/rp_trees.py in convert_tree_format(tree, data_size)
1161 if tree.hyperplanes[0].ndim == 1:
1162 # dense hyperplanes
-> 1163 hyperplane_dim = dense_hyperplane_dim(tree.hyperplanes)
1164 hyperplanes = np.zeros((n_nodes, hyperplane_dim), dtype=np.float32)
1165 else:
~/utils/miniconda3/envs/scanpy/lib/python3.9/site-packages/pynndescent/rp_trees.py in dense_hyperplane_dim()
1143 return hyperplanes[i].shape[0]
1144
-> 1145 raise ValueError("No hyperplanes of adequate size were found!")
1146
1147
ValueError: No hyperplanes of adequate size were found!
I figured out that setting pynndescent_n_neighbors to a lower number solves this issue. Perhaps there should be an internal conditional for this!
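For anyone else hitting this, here is a minimal sketch of how one might pick that lower value from the batch sizes. The helper `pick_pynndescent_n_neighbors` is hypothetical and not part of bbknn, and the rule it uses (stay at least one below the smallest batch) is an assumption based on what was observed in this thread, not documented pynndescent behaviour.

```python
from collections import Counter


def pick_pynndescent_n_neighbors(batch_labels, default=30):
    """Hypothetical helper: cap the neighbour count below the smallest batch.

    Assumption: pynndescent needs fewer neighbours than observations,
    so the cap is (smallest batch size - 1), never above `default`.
    """
    smallest = min(Counter(batch_labels).values())
    return min(default, smallest - 1)


# Toy labels standing in for adata.obs['batch_name']
labels = ["a"] * 50 + ["b"] * 12
n = pick_pynndescent_n_neighbors(labels)
print(n)  # 11

# The value would then be passed through to bbknn (not executed here):
# bbknn.bbknn(adata, batch_key='batch_name', use_annoy=False,
#             metric='manhattan', neighbors_within_batch=3,
#             pynndescent_n_neighbors=n)
```

This only addresses the neighbour count; very small batches may still fail for other reasons.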
Thanks for the kind words.
Good catch - I've already got a second condition in place for pynndescent, as it seems unable to process 10 or fewer observations no matter how you tweak the parameterisation:
https://github.com/Teichlab/bbknn/blob/d2d5a65638008a4837261eaaa464223f8db36fef/bbknn/matrix.py#L220-L222
In testing, it appears that the default pynndescent neighbour count of 30 runs just fine on data with 31 observations. You've got some super tiny batches going on, is that intentional?
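A quick way to check for tiny batches before calling BBKNN might look like the sketch below. `find_small_batches` is a hypothetical helper, and the thresholds of 30 and 10 are simply the two numbers mentioned above (the default neighbour count and the size pynndescent reportedly cannot handle).

```python
from collections import Counter


def find_small_batches(batch_labels, max_cells=30):
    """Return {batch: cell count} for batches with at most `max_cells` cells."""
    counts = Counter(batch_labels)
    return {batch: n for batch, n in counts.items() if n <= max_cells}


# Toy labels; with an AnnData object one could pass
# list(adata.obs['batch_name']) instead.
labels = ["a"] * 200 + ["b"] * 25 + ["c"] * 8
print(find_small_batches(labels))      # {'b': 25, 'c': 8}
print(find_small_batches(labels, 10))  # {'c': 8}
```

Batches flagged at the lower threshold are candidates for dropping or merging before the run, since lowering the neighbour count is not expected to help there.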
Not intentional, but was working on a subsample of a huge dataset so it makes sense that a few of the batches ended up very small!
Reopening as I'll need to add a workaround into the code. This is not pressing, as batches of around 30 cells are not very common.
Oops sorry!