cNMF

no top_genes in outputs

Open DiegoSafian opened this issue 1 year ago • 4 comments

Hi,

I am running version 4.1 from the command line and I get all of the outputs except top_genes. Is there something I can do to obtain it? (Attachment: cnmf.txt)

Kind regards, Diego

DiegoSafian avatar Apr 12 '24 14:04 DiegoSafian

Hi Diego, currently top_genes is created on the fly with the cnmf_obj.load_results() function in the Python environment and isn't created in any of the functions run from the command line. I'll consider adding something like the top_genes output to the consensus step in the future.

dylkot avatar May 07 '24 19:05 dylkot

Hi Dylan, thank you for creating such a fantastic tool. If you do have time, adding top_genes output for the command line would be really helpful! Thank you so much!

blain1995 avatar Sep 05 '24 19:09 blain1995

Yes, hopefully I'll get to this soon. In the meantime, the Python code to get this is below. I also asked ChatGPT to convert it to R, and its output follows the Python code:

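import pandas as pd

# spectra_scores_file: path to the spectra scores file (tab-separated)
spectra_scores = pd.read_csv(spectra_scores_file, sep='\t', index_col=0)
# The loop below expects genes as rows and GEPs as columns; if the file has
# one GEP per row instead, append .T to the read_csv call above.
n_top_genes = 50
top_genes = []
for gep in spectra_scores.columns:
    # Gene names with the n_top_genes highest scores for this GEP
    top_genes.append(list(spectra_scores.sort_values(by=gep, ascending=False).index[:n_top_genes]))

# One column per GEP, one row per rank
top_genes = pd.DataFrame(top_genes, index=spectra_scores.columns).T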

# Load required libraries
library(readr)
library(dplyr)

# Read the spectra scores; spectra_scores_file is the path to the same file as above.
# The first column is assumed to hold the gene names, with one column per GEP after it.
n_top_genes <- 50
spectra_scores <- read_tsv(spectra_scores_file, col_names = TRUE)

# Initialize an empty list to store top genes
top_genes <- list()

# Loop through each GEP column, skipping the first column (gene names)
for (gep in colnames(spectra_scores)[-1]) {
  # Sort by this GEP's score in descending order and keep the top n gene names
  top_genes[[gep]] <- spectra_scores %>%
    arrange(desc(.data[[gep]])) %>%
    slice_head(n = n_top_genes) %>%
    pull(1)  # The first column holds the gene names
}

# Combine into a data frame with one column per GEP and one row per rank,
# matching the layout of the Python result (no extra transpose needed)
top_genes_df <- as.data.frame(do.call(cbind, top_genes))

dylkot avatar Sep 06 '24 00:09 dylkot

Thank you very much for your quick and detailed response!

blain1995 avatar Sep 06 '24 14:09 blain1995
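
For anyone who wants to reuse the Python snippet from this thread outside an interactive session, here is a minimal sketch that wraps the same sort-and-take-top-n logic in a function and runs it on a toy table. It is not part of the cNMF package. The function name `top_genes_table`, the toy gene and GEP labels, and the output filename in the comment are hypothetical, and the sketch assumes the scores table has genes as rows and GEPs as columns.

```python
import pandas as pd


def top_genes_table(spectra_scores: pd.DataFrame, n_top_genes: int = 50) -> pd.DataFrame:
    """Return a table with one column per GEP and one row per rank.

    Assumes genes are rows and GEPs are columns (transpose first if not).
    """
    ranked = {
        # Sort this GEP's scores from high to low and keep the top gene names
        gep: list(spectra_scores[gep].sort_values(ascending=False).index[:n_top_genes])
        for gep in spectra_scores.columns
    }
    return pd.DataFrame(ranked)


# Toy stand-in for a spectra scores table (hypothetical values)
toy = pd.DataFrame(
    {"GEP1": [0.1, 0.9, 0.5, 0.3], "GEP2": [0.8, 0.2, 0.4, 0.6]},
    index=["geneA", "geneB", "geneC", "geneD"],
)
top = top_genes_table(toy, n_top_genes=2)
print(top)

# To save the result next to the other outputs (hypothetical filename):
# top.to_csv("top_genes.txt", sep="\t")
```

With a real file, the table would come from `pd.read_csv(spectra_scores_file, sep='\t', index_col=0)` as in the snippet above, transposed if the file turns out to store one GEP per row.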