
How to adjust the convergence limit for the underlying sklearn NMF when using the command-line version of cNMF

Open erzakiev opened this issue 11 months ago • 0 comments

Hello Dylan, I was wondering how to adjust the convergence limit of the underlying NMF implementation from the sklearn package so that the following warning goes away:

sklearn/decomposition/_nmf.py:1641: ConvergenceWarning: Maximum number of iterations 1000 reached. Increase it to improve convergence.
  ConvergenceWarning,

Is it possible to do this from the command line, or can it only be done when using Python interactively?

The output of cnmf -h lists many options, but none of them seems to be related:

cnmf -h
usage: cnmf [-h] [--name [NAME]] [--output-dir [OUTPUT_DIR]] [-c COUNTS]
            [-k COMPONENTS [COMPONENTS ...]] [-n N_ITER]
            [--total-workers TOTAL_WORKERS] [--seed SEED]
            [--genes-file GENES_FILE] [--numgenes NUMGENES] [--tpm TPM]
            [--beta-loss {frobenius,kullback-leibler,itakura-saito}]
            [--init {random,nndsvd}] [--densify] [--worker-index WORKER_INDEX]
            [--local-density-threshold LOCAL_DENSITY_THRESHOLD]
            [--local-neighborhood-size LOCAL_NEIGHBORHOOD_SIZE]
            [--show-clustering]
            {prepare,factorize,combine,consensus,k_selection_plot}

positional arguments:
  {prepare,factorize,combine,consensus,k_selection_plot}

optional arguments:
  -h, --help            show this help message and exit
  --name [NAME]         [all] Name for analysis. All output will be placed in
                        [output-dir]/[name]/...
  --output-dir [OUTPUT_DIR]
                        [all] Output directory. All output will be placed in
                        [output-dir]/[name]/...
  -c COUNTS, --counts COUNTS
                        [prepare] Input (cell x gene) counts matrix as df.npz
                        or tab delimited text file
  -k COMPONENTS [COMPONENTS ...], --components COMPONENTS [COMPONENTS ...]
                        [prepare] Numper of components (k) for matrix
                        factorization. Several can be specified with "-k 8 9
                        10"
  -n N_ITER, --n-iter N_ITER
                        [prepare] Numper of factorization replicates
  --total-workers TOTAL_WORKERS
                        [all] Total number of workers to distribute jobs to
  --seed SEED           [prepare] Seed for pseudorandom number generation
  --genes-file GENES_FILE
                        [prepare] File containing a list of genes to include,
                        one gene per line. Must match column labels of counts
                        matrix.
  --numgenes NUMGENES   [prepare] Number of high variance genes to use for
                        matrix factorization.
  --tpm TPM             [prepare] Pre-computed (cell x gene) TPM values as
                        df.npz or tab separated txt file. If not provided TPM
                        will be calculated automatically
  --beta-loss {frobenius,kullback-leibler,itakura-saito}
                        [prepare] Loss function for NMF.
  --init {random,nndsvd}
                        [prepare] Initialization algorithm for NMF.
  --densify             [prepare] Treat the input data as non-sparse
  --worker-index WORKER_INDEX
                        [factorize] Index of current worker (the first worker
                        should have index 0)
  --local-density-threshold LOCAL_DENSITY_THRESHOLD
                        [consensus] Threshold for the local density filtering.
                        This string must convert to a float >0 and <=2
  --local-neighborhood-size LOCAL_NEIGHBORHOOD_SIZE
                        [consensus] Fraction of the number of replicates to
                        use as nearest neighbors for local density filtering
  --show-clustering     [consensus] Produce a clustergram figure summarizing
                        the spectra clustering
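
For reference, in plain scikit-learn the cap named in the warning is the `max_iter` argument of `sklearn.decomposition.NMF`, and `tol` is the stopping tolerance. The sketch below only illustrates how those two parameters relate to the `ConvergenceWarning`; it says nothing about how cNMF passes them internally, which is the open question here. The toy matrix, the `fit_and_check` helper, and the choice of the `mu` solver are assumptions made for illustration.

```python
import warnings

import numpy as np
from sklearn.decomposition import NMF
from sklearn.exceptions import ConvergenceWarning

# Toy non-negative matrix standing in for a (cell x gene) matrix (illustrative only).
rng = np.random.default_rng(0)
X = rng.random((60, 40))


def fit_and_check(max_iter, tol=1e-4):
    """Fit a plain sklearn NMF; return (iterations used, whether ConvergenceWarning fired).

    Hypothetical helper for this sketch, not part of cNMF or sklearn.
    """
    model = NMF(
        n_components=5,
        init="random",
        solver="mu",            # assumption: multiplicative-update solver
        beta_loss="frobenius",
        tol=tol,
        max_iter=max_iter,      # the limit reported in the warning message
        random_state=0,
    )
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model.fit(X)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    return model.n_iter_, warned


# A deliberately tiny cap: the solver stops at the cap and sklearn warns.
print(fit_and_check(max_iter=5))

# A much larger cap gives the solver room to meet the tolerance before the cap.
print(fit_and_check(max_iter=5000))
```

In plain sklearn, raising `max_iter` or loosening `tol` are the two ways to keep the solver from stopping at the cap; whether either can be set through the cnmf command line is not shown in the help output above.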

erzakiev, Mar 15 '24 15:03