unsupervised_analysis

clustification: Benchmark clf-based clustering approach

Open sreichl opened this issue 2 years ago • 5 comments

Look for clustering benchmark datasets (from various domains) to test the approach, and put the results into the documentation. → Clustering benchmark papers

sreichl, Aug 25 '23 15:08

  • Example data (requirement: high dimensional with more metadata)
    • digits toy data
    • PBMC3k preprocessed single-cell data with annotations from 10x Genomics
    • ...
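
A minimal sketch of how the digits toy data could serve as a first benchmark: cluster the data, then score the result against the known digit labels. This is not the clustification method itself. KMeans is only a stand-in for whatever clustering is under test, and the function name `benchmark_clustering`, the choice of ARI and NMI as metrics, and the scaling step are all assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score
from sklearn.preprocessing import StandardScaler


def benchmark_clustering(n_clusters=10, seed=0):
    """Cluster the digits data and compare against the known labels.

    Hypothetical helper: KMeans stands in for the approach being benchmarked.
    """
    # 1797 samples x 64 pixel features, with digit labels as "ground truth".
    X, y_true = load_digits(return_X_y=True)
    X_scaled = StandardScaler().fit_transform(X)

    # Stand-in clustering step; swap in the method under test here.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(X_scaled)

    # Both metrics are invariant to how cluster labels are numbered.
    return {
        "ari": float(adjusted_rand_score(y_true, labels)),
        "nmi": float(normalized_mutual_info_score(y_true, labels)),
        "n_clusters_found": len(set(labels.tolist())),
    }


scores = benchmark_clustering()
print(scores)
```

The same scoring step would apply to PBMC3k, with the provided cell annotations taking the place of the digit labels.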

sreichl, Aug 26 '23 12:08

Check whether the scRNA-seq data from the SCCAF (Teichmann) paper works: https://www.nature.com/articles/s41592-020-0825-9, specifically their benchmarking data: https://github.com/SCCAF/sccaf_example

sreichl, Sep 11 '23 10:09

cellxgene: https://cellxgene.cziscience.com/
HCA Data Portal: https://data.humancellatlas.org/

sreichl, Sep 14 '23 15:09

Start "easy" with a small and a large PBMC dataset, i.e., data with a very clearly defined "ground truth".

sreichl, Oct 03 '23 08:10

Compare to the scRNA-seq-specific clustering approach in this preprint (quite similar, i.e., iterative RFs): https://www.biorxiv.org/content/10.1101/2024.01.18.576317v1.full

sreichl, Jan 31 '24 10:01