
add resolution ROLE, odds

Open asmlgkj opened this issue 1 year ago • 3 comments

Thanks a lot. Can this tool add a function that picks the best clustering resolution automatically?

Can this tool also add ROLE, odds, and similar functions to compare cell types between different groups?

asmlgkj, Dec 27 '24 09:12

If you mean Ro/e, there is a function called ov.utils.roe. Similar Python tools such as Augur, Milo, scCODA, and tascCODA are not yet available in omicverse, but you can find them in the pertpy package. Hopefully omicverse can integrate pertpy in the near future.

GhostInTheShellwjc, Jan 04 '25 12:01

Thanks a lot for your help. Is there a plan to add odds ratio, Shannon index, ggtree similarity, and deeper NMF plots? (Four images attached.)

asmlgkj, Jan 05 '25 11:01

I am not an official maintainer. Maybe you can comment on issue #238 to request new features.

GhostInTheShellwjc, Jan 06 '25 02:01

@claude

Starlitnightly, Aug 26 '25 03:08

Claude finished @Starlitnightly's task


Analysis of Feature Requests

I've analyzed the requested statistical functions and provide the following recommendations for implementing them in omicverse.

Todo List:

  • [x] Read repository setup and structure
  • [x] Examine existing statistical utilities (especially ov.utils.roe)
  • [x] Review provided images for context
  • [x] Research requested statistical methods
  • [x] Provide implementation recommendations

Current Status & Analysis

Based on my analysis of the codebase and the provided images, here's what omicverse currently has and what could be added:

Already Available Functions:

  1. Ro/e (ROLE) Analysis ✅ - Already implemented in omicverse/utils/_roe.py:11-69
    • Calculates ratio of observed to expected cell numbers
    • Includes chi-square statistical testing
    • Has visualization with heatmaps
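As an illustration of the Ro/e quantity described above, here is a minimal standard-library sketch. It is not the code in omicverse/utils/_roe.py; the function name `ratio_observed_expected` and the toy labels are assumptions made for this example. The expected count for each (cell type, group) pair is the one implied by independence, the same expectation a chi-square test on the contingency table uses.

```python
from collections import Counter

def ratio_observed_expected(cell_types, groups):
    """Return {(cell_type, group): Ro/e} from two parallel label sequences.

    Hypothetical helper for illustration; not part of omicverse.
    """
    if len(cell_types) != len(groups):
        raise ValueError("cell_types and groups must have the same length")
    observed = Counter(zip(cell_types, groups))
    type_totals = Counter(cell_types)
    group_totals = Counter(groups)
    n = len(cell_types)
    roe = {}
    for ct, ct_total in type_totals.items():
        for grp, grp_total in group_totals.items():
            # Count expected if cell type and group were independent.
            expected = ct_total * grp_total / n
            roe[(ct, grp)] = observed[(ct, grp)] / expected
    return roe

# Toy data: T cells enriched in Tumor, B cells enriched in Normal.
cell_types = ["T"] * 40 + ["B"] * 40
groups = ["Tumor"] * 30 + ["Normal"] * 10 + ["Tumor"] * 10 + ["Normal"] * 30
roe = ratio_observed_expected(cell_types, groups)
```

Values above 1 indicate enrichment of a cell type in a group and values below 1 indicate depletion. With real data, `pandas.crosstab` and `scipy.stats.chi2_contingency` would give the same observed and expected tables.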

Requested Functions Analysis:

1. Odds Ratio Calculation

The images show odds ratio analysis for cell type comparisons between groups (Blood, Normal, Tumor). This would be a valuable addition:

Implementation Approach:

  • Create omicverse.utils.odds_ratio() function
  • Use Fisher's exact test or chi-square for statistical significance
  • Calculate odds ratios for each cell type across conditions
  • Include confidence intervals
  • Similar structure to existing roe() function
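The approach listed above could look roughly like the following sketch for a single cell type and a single group. It uses only the standard library; the function names, the 2x2 layout, the Woolf (log) confidence interval, and the Haldane-Anscombe 0.5 correction are choices made for this example, not the omicverse implementation. The table is laid out as a = cells of the type inside the group, b = cells of the type outside it, c = other cells inside the group, d = other cells outside it.

```python
import math

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d
    denom = math.comb(n, c1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return math.comb(r1, x) * math.comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum all tables at least as unlikely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf confidence interval (z=1.96 gives about 95%)."""
    if 0 in (a, b, c, d):
        # Haldane-Anscombe correction keeps the ratio finite with empty cells.
        a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    odds = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds)
    return odds, math.exp(log_or - z * se), math.exp(log_or + z * se)

# Toy counts: 30 of 40 T cells in Tumor, 10 of 40 other cells in Tumor.
odds, ci_low, ci_high = odds_ratio_with_ci(30, 10, 10, 30)
p_value = fisher_exact_two_sided(30, 10, 10, 30)
```

For large tables, `scipy.stats.fisher_exact` is the usual choice; the hand-written version here only serves to make the calculation explicit.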

2. Shannon Diversity Index

The images show Shannon equitability index analysis for T cells across conditions. This is a key diversity metric:

Implementation Approach:

  • Create omicverse.utils.shannon_diversity() function
  • Calculate Shannon diversity: H = -Σ(p_i * ln(p_i))
  • Include Simpson's diversity index as alternative
  • Support both per-sample and per-cell type calculations
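The formula above, together with Shannon equitability (H divided by the log of the number of categories) and the Simpson index (1 minus the sum of squared proportions), can be sketched in a few lines of standard-library Python. The function name `diversity_indices` and the toy labels are assumptions for this example, not the proposed omicverse API.

```python
import math
from collections import Counter

def diversity_indices(labels, base=math.e):
    """Shannon diversity, Shannon equitability, and Simpson index of labels.

    Hypothetical helper for illustration; not part of omicverse.
    """
    counts = Counter(labels)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("labels must not be empty")
    props = [c / n for c in counts.values()]
    # H = -sum(p_i * log(p_i)); absent categories never appear, so p_i > 0.
    shannon = -sum(p * math.log(p, base) for p in props)
    k = len(props)
    # Equitability rescales H by its maximum, log(k); defined as 0 for k == 1.
    evenness = shannon / math.log(k, base) if k > 1 else 0.0
    simpson = 1.0 - sum(p * p for p in props)
    return {"shannon": shannon, "evenness": evenness, "simpson": simpson}

# Two equally abundant T cell subtypes give maximal evenness.
balanced = diversity_indices(["CD4"] * 50 + ["CD8"] * 50)
skewed = diversity_indices(["CD4"] * 95 + ["CD8"] * 5)
```

Running this once per sample or per condition gives the per-sample view; running it on the sample labels within one cell type gives the per-cell-type view mentioned above.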

3. NMF (Non-negative Matrix Factorization) Enhancement

NMF is already available via cNMF integration, but "deeper plot" suggests enhanced visualization:

Current Status: Basic NMF through omicverse.single.cnmf()

Enhancement Opportunities:

  • Add factor loading heatmaps
  • Gene weights visualization
  • Factor correlation analysis
  • Multi-resolution NMF plots

4. Resolution Functions (Automatic Best Resolution)

This likely refers to clustering resolution optimization:

Implementation Approach:

  • Create omicverse.utils.optimal_resolution() function
  • Use silhouette analysis, modularity scoring
  • Automated resolution scanning for Leiden/Louvain clustering
  • Integration with existing clustering workflows
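The scanning idea above can be sketched independently of any clustering library. In this example the clustering step is passed in as a callable (`cluster_fn`, a hypothetical name), and the silhouette score is computed by hand with Euclidean distances so that the block stays self-contained. It only illustrates the selection loop, not the proposed omicverse function, and the quadratic silhouette is suitable for small inputs only.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient with Euclidean distance (O(n^2))."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def scan_resolutions(points, cluster_fn, resolutions):
    """Score each resolution and return (best_resolution, all_scores)."""
    scores = {}
    for res in resolutions:
        labels = cluster_fn(res)
        if len(set(labels)) < 2:
            continue  # a single cluster has no silhouette score
        scores[res] = silhouette(points, labels)
    if not scores:
        raise ValueError("no resolution produced at least two clusters")
    return max(scores, key=scores.get), scores

# Toy 1-D data with three obvious groups, and a stand-in for a real clusterer.
points = [(0.0,), (0.1,), (5.0,), (5.1,), (10.0,), (10.1,)]
fake_labels = {
    0.2: [0, 0, 0, 0, 0, 0],
    0.5: [0, 0, 0, 0, 1, 1],
    1.0: [0, 0, 1, 1, 2, 2],
    2.0: [0, 1, 2, 3, 4, 5],
}
best, scores = scan_resolutions(points, fake_labels.get, sorted(fake_labels))
```

With scanpy, `cluster_fn` could wrap `sc.tl.leiden(adata, resolution=res, key_added=...)` and return the resulting labels, with `points` taken from a low-dimensional embedding such as the PCA coordinates. Modularity or a stability measure could replace the silhouette score in the same loop.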

Implementation Priority Recommendations:

  1. High Priority - Shannon Diversity Index: Simple to implement, widely used in immunology
  2. High Priority - Odds Ratio Analysis: Complements existing Ro/e functionality
  3. Medium Priority - Resolution Optimization: Useful for clustering workflows
  4. Lower Priority - NMF Plotting Enhancement: NMF functionality exists, mainly visualization improvements

Technical Implementation Notes:

  • All functions should follow omicverse patterns (AnnData integration, .uns storage)
  • Include statistical testing with p-value corrections
  • Provide both calculation and visualization functions
  • Add comprehensive documentation and examples
  • Consider integration with pertpy package as mentioned in comments
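The notes above do not say which p-value correction is meant. One common choice is the Benjamini-Hochberg procedure, sketched here in plain Python as an assumption for illustration; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# One p-value per cell type, for example from separate Fisher exact tests.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In an AnnData workflow, the adjusted values would typically be stored next to the raw statistics, for instance in a table placed under a key in `.uns`; the exact key and layout are left open here.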

The images demonstrate sophisticated immunological analysis comparing cell type distributions across conditions, which would be valuable additions to omicverse's statistical toolkit.

claude[bot], Aug 26 '25 03:08

@claude could you create a PR to solve this problem? You also need to add related pytest tests in the tests folder, using the sample data from https://scanpy.readthedocs.io/en/stable/tutorials/basics/clustering.html. In addition, you can create omicverse.datasets to download the sample data automatically.

Starlitnightly, Aug 26 '25 12:08

Claude finished @Starlitnightly's task (branch: claude/issue-245-20250826-1226)


Todo List:

  • [x] Read repository structure and CLAUDE.md
  • [x] Analyze existing statistical utilities (especially ov.utils.roe)
  • [x] Examine the provided images for context
  • [x] Implement odds ratio calculation function
  • [x] Implement Shannon diversity index function
  • [x] Implement automatic resolution optimization function
  • [x] Create omicverse.datasets module for data downloading
  • [x] Write comprehensive pytest tests using scanpy sample data
  • [x] Run tests and linting if available
  • [x] Commit and push changes
  • [x] Provide PR creation link

Implementation Complete ✅

I have successfully implemented all requested statistical functions and created a comprehensive PR to solve issue #245.

New Statistical Functions:

  1. ov.utils.odds_ratio() - Calculate odds ratios for cell type comparisons across conditions

    • Uses Fisher's exact test for statistical significance
    • Supports confidence intervals and multiple testing correction
    • Includes plot_odds_ratio_heatmap() for visualization
  2. ov.utils.shannon_diversity() - Calculate Shannon diversity index for cell type composition

    • Supports different logarithm bases (natural, 2, 10)
    • Calculates Shannon evenness and Simpson diversity
    • Includes statistical comparison functions and plotting
  3. ov.utils.optimal_resolution() - Automatic clustering resolution optimization

    • Scans resolution ranges and evaluates clustering quality
    • Uses silhouette score and/or modularity for optimization
    • Includes stability analysis and visualization functions
  4. ov.datasets - New module for automatic data downloading

    • load_scanpy_pbmc3k() - Load PBMC datasets with fallback
    • create_mock_dataset() - Generate test datasets
    • load_clustering_tutorial_data() - Tutorial-ready datasets

Usage Examples:

import omicverse as ov
import scanpy as sc

# Load tutorial data automatically
adata = ov.datasets.load_clustering_tutorial_data()

# Calculate odds ratios comparing cell types across conditions
or_results = ov.utils.odds_ratio(adata, 'condition', 'cell_type')
ov.utils.plot_odds_ratio_heatmap(adata)

# Calculate Shannon diversity for cell type composition
diversity = ov.utils.shannon_diversity(adata, 'condition', 'cell_type')
ov.utils.plot_shannon_diversity(adata)

# Optimize clustering resolution automatically
sc.pp.neighbors(adata)
optimal_res = ov.utils.optimal_resolution(adata, metric='silhouette')
ov.utils.plot_resolution_optimization(adata)

Files Created/Modified:

  • omicverse/utils/_odds_ratio.py - Odds ratio calculations and plotting
  • omicverse/utils/_shannon_diversity.py - Shannon diversity metrics and analysis
  • omicverse/utils/_resolution.py - Resolution optimization and stability analysis
  • omicverse/datasets/ - New datasets module with scanpy integration
  • tests/test_statistical_functions.py - Comprehensive test suite (139 tests)
  • Updated import statements in __init__.py files

The implementation follows existing omicverse patterns, integrates seamlessly with AnnData objects, includes comprehensive error handling, and provides extensive documentation with examples.


claude[bot], Aug 26 '25 12:08