unsupervised_analysis

A general-purpose Snakemake workflow and MrBiomics module to perform unsupervised analyses (dimensionality reduction & cluster analysis) and visualizations of high-dimensional data.

Results 36 unsupervised_analysis issues

Test it for, e.g., pca.py. **Dask**: Dask is a parallel computing library that integrates with pandas, NumPy, and scikit-learn. It can handle larger-than-memory datasets and can distribute the computation across...

enhancement

Define "too large": e.g., >10,000 samples/cells? Ideas: for large data (define too large?), do not plot heatmaps showing features and data, but instead determine distance matrices and show those...

enhancement

- Idea: a barplot ordered by number of clusters within each clustering
- [ ] Research alternatives that are common in the field

enhancement

Determine metrics at every iteration and plot the time course at the end, at least for the stopping criterion max. edge weight, but maybe also for F1 score and accuracy, ...

enhancement

New mini release highlighting bug fixes and adaptation to large (120k x 28k) & complex (342 groups of interest/labels) data
- [ ] #36
- [ ] #37
- [...

documentation

Significance analysis for clustering with single-cell RNA-sequencing data https://www.nature.com/articles/s41592-023-01933-9

enhancement

- The current implementation (clusterCrit) is fast on its own but does not reuse distance matrices that could be determined only once.
- Only the Euclidean metric is supported, extension to support...

enhancement
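
One way to picture the distance-matrix reuse described in this issue is the sketch below. It swaps in scikit-learn for clusterCrit and is not the workflow's implementation: the distance matrix is computed once with `pairwise_distances` and then passed to `silhouette_score` with `metric="precomputed"` for every clustering, which also allows non-Euclidean metrics. The function name `silhouette_per_clustering` and the toy data are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import pairwise_distances, silhouette_score


def silhouette_per_clustering(X, clusterings, metric="euclidean"):
    """Score several clusterings of the same data with one shared distance matrix.

    X           : array of shape (n_samples, n_features)
    clusterings : dict mapping a clustering name to its label vector
    metric      : any metric accepted by sklearn.metrics.pairwise_distances
    """
    # Computed once, reused for every clustering below.
    distances = pairwise_distances(X, metric=metric)
    return {
        name: float(silhouette_score(distances, labels, metric="precomputed"))
        for name, labels in clusterings.items()
    }


# Toy data (hypothetical): two well-separated groups of 20 points each.
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(0.0, 0.1, (20, 3)), rng.normal(5.0, 0.1, (20, 3))])
toy_clusterings = {
    "matches_groups": np.array([0] * 20 + [1] * 20),
    "alternating": np.array([0, 1] * 20),
}
toy_scores = silhouette_per_clustering(X_toy, toy_clusterings)
```

The full distance matrix needs memory quadratic in the number of samples, so for very large data this sketch would have to be combined with subsampling or chunking.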

- consider Variation of Information (VI) and Split/Join: https://stats.stackexchange.com/questions/24961/comparing-clusterings-rand-index-vs-variation-of-information

enhancement
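
For reference, Variation of Information between two clusterings A and B of the same samples is VI(A, B) = H(A|B) + H(B|A). The following is a minimal pure-Python sketch of that definition, not code from the workflow; the function name `variation_of_information` is hypothetical and the result is in nats (natural logarithm).

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Return VI(A, B) = H(A|B) + H(B|A) in nats for two label sequences."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same samples")
    n = len(labels_a)
    if n == 0:
        raise ValueError("at least one sample is required")

    count_a = Counter(labels_a)                  # cluster sizes in A
    count_b = Counter(labels_b)                  # cluster sizes in B
    count_ab = Counter(zip(labels_a, labels_b))  # sizes of the overlaps

    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # -p(a,b) * [log p(a,b)/p(a) + log p(a,b)/p(b)]
        vi -= p_ab * (log(n_ab / count_a[a]) + log(n_ab / count_b[b]))
    return vi


# Identical partitions (up to renaming of labels) have VI = 0.
vi_same = variation_of_information([0, 0, 1, 1], ["x", "x", "y", "y"])
# Two independent, balanced binary partitions have VI = 2 * log(2).
vi_independent = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

VI is a true metric on partitions and is zero only when the two clusterings agree up to label names, which is one reason it is discussed as an alternative to the Rand index.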

Determine (if computationally feasible) the trustworthiness of every embedding and provide it in the results: https://scikit-learn.org/stable/modules/generated/sklearn.manifold.trustworthiness.html#sklearn.manifold.trustworthiness

enhancement
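
A minimal sketch of what this could look like, using the `sklearn.manifold.trustworthiness` function linked in the issue. The helper name `trustworthiness_per_embedding`, the random toy data, and the choice of PCA as the example embedding are assumptions for illustration and not part of the workflow.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness


def trustworthiness_per_embedding(X, embeddings, n_neighbors=5):
    """Return {embedding name: trustworthiness in [0, 1]} for embeddings of X.

    n_neighbors must be smaller than n_samples / 2 (scikit-learn requirement).
    """
    return {
        name: float(trustworthiness(X, embedded, n_neighbors=n_neighbors))
        for name, embedded in embeddings.items()
    }


# Toy data (hypothetical): 200 samples with 20 features.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 20))
toy_embeddings = {
    "pca_2d": PCA(n_components=2, random_state=0).fit_transform(X_toy),
    "pca_10d": PCA(n_components=10, random_state=0).fit_transform(X_toy),
}
toy_trust = trustworthiness_per_embedding(X_toy, toy_embeddings, n_neighbors=5)
```

The score compares neighbor ranks in the original and embedded spaces, which involves pairwise distances over all samples. Memory therefore grows quadratically with the number of samples, which is presumably why the issue adds the "if computationally feasible" caveat.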