cNMF
How to adjust the convergence limit for the underlying sklearn NMF when using the command-line version of cNMF
Hello Dylan, I was wondering how to adjust the convergence limit of the underlying NMF implementation from the sklearn package, so that the following warning goes away:
sklearn/decomposition/_nmf.py:1641: ConvergenceWarning: Maximum number of iterations 1000 reached. Increase it to improve convergence. ConvergenceWarning,
Is it possible to do this from the command line, or can it only be done when using Python interactively?
The output of cnmf -h lists many options, but none of them seems to be related:
cnmf -h
usage: cnmf [-h] [--name [NAME]] [--output-dir [OUTPUT_DIR]] [-c COUNTS]
            [-k COMPONENTS [COMPONENTS ...]] [-n N_ITER]
            [--total-workers TOTAL_WORKERS] [--seed SEED]
            [--genes-file GENES_FILE] [--numgenes NUMGENES] [--tpm TPM]
            [--beta-loss {frobenius,kullback-leibler,itakura-saito}]
            [--init {random,nndsvd}] [--densify] [--worker-index WORKER_INDEX]
            [--local-density-threshold LOCAL_DENSITY_THRESHOLD]
            [--local-neighborhood-size LOCAL_NEIGHBORHOOD_SIZE]
            [--show-clustering]
            {prepare,factorize,combine,consensus,k_selection_plot}

positional arguments:
  {prepare,factorize,combine,consensus,k_selection_plot}

optional arguments:
  -h, --help            show this help message and exit
  --name [NAME]         [all] Name for analysis. All output will be placed in
                        [output-dir]/[name]/...
  --output-dir [OUTPUT_DIR]
                        [all] Output directory. All output will be placed in
                        [output-dir]/[name]/...
  -c COUNTS, --counts COUNTS
                        [prepare] Input (cell x gene) counts matrix as df.npz
                        or tab delimited text file
  -k COMPONENTS [COMPONENTS ...], --components COMPONENTS [COMPONENTS ...]
                        [prepare] Numper of components (k) for matrix
                        factorization. Several can be specified with "-k 8 9
                        10"
  -n N_ITER, --n-iter N_ITER
                        [prepare] Numper of factorization replicates
  --total-workers TOTAL_WORKERS
                        [all] Total number of workers to distribute jobs to
  --seed SEED           [prepare] Seed for pseudorandom number generation
  --genes-file GENES_FILE
                        [prepare] File containing a list of genes to include,
                        one gene per line. Must match column labels of counts
                        matrix.
  --numgenes NUMGENES   [prepare] Number of high variance genes to use for
                        matrix factorization.
  --tpm TPM             [prepare] Pre-computed (cell x gene) TPM values as
                        df.npz or tab separated txt file. If not provided TPM
                        will be calculated automatically
  --beta-loss {frobenius,kullback-leibler,itakura-saito}
                        [prepare] Loss function for NMF.
  --init {random,nndsvd}
                        [prepare] Initialization algorithm for NMF.
  --densify             [prepare] Treat the input data as non-sparse
  --worker-index WORKER_INDEX
                        [factorize] Index of current worker (the first worker
                        should have index 0)
  --local-density-threshold LOCAL_DENSITY_THRESHOLD
                        [consensus] Threshold for the local density filtering.
                        This string must convert to a float >0 and <=2
  --local-neighborhood-size LOCAL_NEIGHBORHOOD_SIZE
                        [consensus] Fraction of the number of replicates to
                        use as nearest neighbors for local density filtering
  --show-clustering     [consensus] Produce a clustergram figure summarizing
                        the spectra clustering
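
For context, scikit-learn emits this warning when the NMF solver stops at max_iter before its tol criterion is met. The sketch below reproduces that with plain scikit-learn on synthetic data; it is not cNMF's code. The matrix, the number of components, and the helper name fit_and_report are assumptions made for illustration, and the sketch does not assume anything about whether or how cNMF forwards max_iter or tol to scikit-learn.

```python
import warnings

import numpy as np
from sklearn.decomposition import NMF
from sklearn.exceptions import ConvergenceWarning

# Synthetic non-negative matrix standing in for a real counts matrix (assumption).
rng = np.random.default_rng(0)
X = rng.random((50, 30))


def fit_and_report(max_iter, tol=1e-4):
    """Hypothetical helper: fit NMF, return (iterations run, warning raised?)."""
    model = NMF(
        n_components=5,
        init="random",
        solver="mu",
        beta_loss="frobenius",
        max_iter=max_iter,
        tol=tol,
        random_state=0,
    )
    # Record warnings instead of printing them so they can be inspected.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model.fit(X)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    return model.n_iter_, warned


# A tiny iteration budget is exhausted before the tolerance check can pass.
n_iter_short, warned_short = fit_and_report(max_iter=2)
# A generous budget lets the solver stop on the tolerance criterion instead.
n_iter_long, warned_long = fit_and_report(max_iter=5000)

print("max_iter=2:   ", n_iter_short, "iterations, warned:", warned_short)
print("max_iter=5000:", n_iter_long, "iterations, warned:", warned_long)
```

In plain scikit-learn, raising max_iter or loosening tol are the two knobs that affect this warning; warnings.filterwarnings("ignore", category=ConvergenceWarning) only hides the message and does not change the fit.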