squidpy
added design matrix and plotting function
Description
- Added function to build design matrix containing distances to anchor points which can be used for plotting and model building
- Added plotting function to visualize gene expression by (normalized) distance to anchor points
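To illustrate the idea of a design matrix of distances to anchor points, here is a minimal, dependency-free sketch. The function name `distance_design_matrix`, the output keys, and the choice to normalize by the maximum distance are assumptions made for illustration; they are not taken from this PR, which may compute and scale distances differently.

```python
import math


def distance_design_matrix(coords, labels, anchor_label):
    """Return one row per observation with its distance to the nearest anchor.

    Hypothetical helper: assumes at least one observation carries ``anchor_label``.
    """
    # Anchor points are the coordinates of all observations in the chosen group.
    anchors = [xy for xy, lab in zip(coords, labels) if lab == anchor_label]

    # Brute-force nearest-anchor distance (a KD-tree would scale better).
    raw = [min(math.dist(xy, a) for a in anchors) for xy in coords]

    # Assumed normalization: divide by the largest distance so values lie in [0, 1].
    max_dist = max(raw)
    scaled = [d / max_dist if max_dist > 0 else 0.0 for d in raw]

    return [
        {"label": lab, "distance": d, "distance_norm": s}
        for lab, d, s in zip(labels, raw, scaled)
    ]


coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
labels = ["tumor", "T cell", "B cell"]
design = distance_design_matrix(coords, labels, anchor_label="tumor")
```

In this toy input the anchor observation itself gets distance 0, which is the property one would expect for every member of the anchor group.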
How has this been tested?
Tested on two data sets
Closes
Codecov Report
Merging #591 (d942632) into main (0cd835d) will increase coverage by 0.07%. The diff coverage is 80.38%.
Additional details and impacted files
@@ Coverage Diff @@
## main #591 +/- ##
==========================================
+ Coverage 78.56% 78.63% +0.07%
==========================================
Files 31 33 +2
Lines 4492 4699 +207
Branches 865 917 +52
==========================================
+ Hits 3529 3695 +166
- Misses 708 732 +24
- Partials 255 272 +17
Impacted Files | Coverage Δ | |
---|---|---|
squidpy/pl/_graph.py | 79.20% <33.33%> (-2.62%) | :arrow_down: |
squidpy/tl/_var_by_distance.py | 78.51% <78.51%> (ø) | |
squidpy/pl/_var_by_distance.py | 88.05% <88.05%> (ø) | |
squidpy/gr/_sepal.py | 52.63% <100.00%> (+0.35%) | :arrow_up: |
Hi @LLehner,
thanks a lot for this PR, it looks good! A couple of things before I code review it:
- Can you create a new module (folder) named tl?
- Can you add the _design_matrix.py file there, rename it to _exp_dist.py, and rename the function to exp_dist?
- Can you rename the file for plotting to _exp_dist.py?
Thank you! Looking forward to adding this to Squidpy!
@LLehner I added a couple of TODOs on the function and started a skeleton of the plotting test. A test for the function itself should also be added.
I think the imports are still wrong, see the error:
n:/home/runner/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
[2819] /home/runner/work/squidpy/squidpy$ /home/runner/work/squidpy/squidpy/.tox/py310-linux/bin/python -m pytest --cov --cov-append --cov-report=term-missing --cov-config=/home/runner/work/squidpy/squidpy/tox.ini --ignore docs/ -vv --test-napari
ImportError while loading conftest '/home/runner/work/squidpy/squidpy/tests/conftest.py'.
tests/conftest.py:23: in <module>
from squidpy.gr import spatial_neighbors
squidpy/__init__.py:1: in <module>
from squidpy import gr, im, pl, read, datasets
squidpy/pl/__init__.py:13: in <module>
from squidpy.pl._feature_by_dist import plot_gexp_dist
E ModuleNotFoundError: No module named 'squidpy.pl._feature_by_dist'
@giovp I did apply the changes, though there were some minor things I had to change. It's on the spatialde repo.
what type of changes?
This fails if I want to compute distances on a subset of adata only.
This is a spatial plot of the full data:
I then subset the data to a tenth of the whole data, retaining balanced cell type proportions.
Could this be related to _prune_anchor_tree, which has hard-coded parameters?
exp_dist(
    adata=adata_subset,
    groups="CK+ HR+ tumor cell",
    cluster_key="cell type",
    design_matrix_key="design_matrix",
    batch_key=None,
    covariates=None,
    metric="euclidean",
    copy=True,
)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In [35], line 5
3 print(adata[adata.obs.slide==slide,:])
4 display(adata[adata.obs.slide==slide,:].obs['cell type'].value_counts())
----> 5 exp_dist(adata=adata[adata.obs.slide==slide,:].copy(),
6 groups='CK+ HR+ tumor cell',
7 cluster_key='cell type',
8 design_matrix_key = "design_matrix",
9 batch_key = None,
10 covariates = None,
11 metric = "euclidean",
12 copy = True)
File ~/Documents/GitHub/spatial-de-2022/spatialde/functions/exp_dist.py:99, in exp_dist(adata, groups, cluster_key, design_matrix_key, batch_key, covariates, spatial_key, metric, copy)
95 anchor_coord, batch_coord = _get_coordinates(adata, anchor_var, cluster_key)
97 anchor_coord = _prune_anchor_tree(anchor_coord, 0.05, 4, metric)
---> 99 tree = KDTree(anchor_coord, metric=DistanceMetric.get_metric(metric))
100 mindist, _ = tree.query(batch_coord)
102 if isinstance(anchor_var, np.ndarray):
File sklearn/neighbors/_binary_tree.pxi:833, in sklearn.neighbors._kd_tree.BinaryTree.__init__()
File ~/opt/miniconda3/envs/spatial-de-2022/lib/python3.8/site-packages/sklearn/utils/validation.py:909, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
907 n_samples = _num_samples(array)
908 if n_samples < ensure_min_samples:
--> 909 raise ValueError(
910 "Found array with %d sample(s) (shape=%s) while a"
911 " minimum of %d is required%s."
912 % (n_samples, array.shape, ensure_min_samples, context)
913 )
915 if ensure_min_features > 0 and array.ndim == 2:
916 n_features = array.shape[1]
ValueError: Found array with 0 sample(s) (shape=(0, 2)) while a minimum of 1 is required.
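The traceback shows KDTree receiving an anchor array of shape (0, 2), so the error comes from an empty set of anchor points rather than from the tree itself. Whether the anchors are empty because of the subsetting or because of _prune_anchor_tree is not established in this thread. Either way, a guard before building the tree would surface the problem with a clearer message. The sketch below is a hypothetical helper (`ensure_anchors` and `drop_all` are names invented here, not part of the PR); it relies only on `len`, which also works on NumPy arrays.

```python
def ensure_anchors(anchor_coords, group, stage):
    """Return anchor_coords unchanged, or raise a descriptive error if it is empty.

    Hypothetical helper: ``stage`` is a free-text label such as "selection" or "pruning".
    """
    if len(anchor_coords) == 0:
        raise ValueError(
            f"No anchor points left for group {group!r} after {stage}; "
            "check that the group exists in this subset and that pruning "
            "did not remove every anchor."
        )
    return anchor_coords


def drop_all(coords):
    """Stand-in for a pruning step that happens to remove every point."""
    return []


# Non-empty input passes through untouched.
anchors = ensure_anchors([(1.0, 2.0), (3.0, 4.0)], "CK+ HR+ tumor cell", "selection")

# Empty input after the (stand-in) pruning step raises a readable error.
try:
    ensure_anchors(drop_all(anchors), "CK+ HR+ tumor cell", "pruning")
    message = None
except ValueError as err:
    message = str(err)
```

Calling such a check once after selecting the anchors and once after pruning would also tell which of the two steps emptied the array.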
The resulting distances dataframe also contains the reference cell types, and not all of those cells have a distance equal to 0.
ref_ct = "T cells"
distances = exp_dist(
    adata=adata,
    groups=ref_ct,
    cluster_key="cell type",
    design_matrix_key="design_matrix",
    batch_key=None,  # Currently not working on mock data
    covariates=None,
    metric="euclidean",
    copy=True,
)
# Select the reference cells once, then show those with a non-zero distance.
ref_cells = distances[distances["cell type"] == ref_ct]
print("N ref cells:", ref_cells.shape[0])
display(ref_cells[ref_cells[ref_ct] > 0])
N ref cells: 290
@LLehner I am fixing the pre-commits in #643, but the tests still fail.
@LLehner it seems the tests are failing because of:
.tox/py/lib/python3.9/site-packages/_pytest/assertion/rewrite.py:168: in exec_module
exec(co, module.__dict__)
tests/graph/test_design_matrix.py:9: in <module>
from squidpy.tl.exp_dist import exp_dist
E ModuleNotFoundError: No module named 'squidpy.tl.exp_dist'
This is because you need to export exp_dist from the tl/__init__.py file; see other modules for reference.
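To make the export pattern concrete, the following self-contained sketch builds a throwaway package on disk and imports from it. The package name `demo_pkg` is a hypothetical stand-in for squidpy, and the file layout (a private `_exp_dist.py` module re-exported from the subpackage's `__init__.py`) is assumed from the renaming requests earlier in this thread. It also shows that a path like `demo_pkg.tl.exp_dist` does not exist as a module, which suggests the test would import the function from the subpackage itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package that mirrors the assumed layout.
root = Path(tempfile.mkdtemp())
tl_dir = root / "demo_pkg" / "tl"
tl_dir.mkdir(parents=True)
(root / "demo_pkg" / "__init__.py").write_text("")

# Private implementation module, named with a leading underscore.
(tl_dir / "_exp_dist.py").write_text(
    "def exp_dist():\n    return 'called exp_dist'\n"
)

# The subpackage __init__.py re-exports the public function.
(tl_dir / "__init__.py").write_text(
    "from demo_pkg.tl._exp_dist import exp_dist\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_pkg.tl import exp_dist  # works because of the re-export

result = exp_dist()

# There is no module file called exp_dist.py, so this import path fails.
try:
    importlib.import_module("demo_pkg.tl.exp_dist")
    missing_module_error = None
except ModuleNotFoundError as err:
    missing_module_error = str(err)
```

With this layout the public entry point is `demo_pkg.tl.exp_dist` as a function, while the file name stays an implementation detail.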
Linting, on the other hand, is failing because of the pre-commits. I believe you'd have to rerun them, but ruff should modify most of the stuff in place.
@LLehner can you join Zulip (https://scverse.zulipchat.com/)? Helmholtz services are all down, I will explain there.
Also consider cleaning up the docstrings by using what we already have in docrep.