Epoch error
I am currently fitting the Spectra model with use_cell_types=True. It runs successfully with 2 epochs, but the output is all NA once the number of epochs exceeds 20. I am not sure what is driving the error here.
# fit the model (We will run this with only 2 epochs to decrease runtime in this tutorial)
model = spc.est_spectra(
    adata=adata,
    gene_set_dictionary=annotations,
    use_highly_variable=True,
    cell_type_key="blueprint_labels_mapped",
    use_weights=True,
    lam=0.1,  # varies depending on data and gene sets, try between 0.5 and 0.001
    delta=0.001,
    kappa=None,
    rho=0.001,
    use_cell_types=True,
    n_top_vals=50,
    label_factors=True,
    overlap_threshold=0.2,
    clean_gs=True,
    min_gs_num=3,
    num_epochs=100,  # here running only 2 epochs for time reasons, we recommend 10,000 epochs for most datasets
)
Cell type labels in gene set annotation dictionary and AnnData object are identical
Your gene set annotation dictionary is now correctly formatted.
Hi, thank you for developing this tool.
I encountered the same problem: it works for 2 epochs, but the results are all NA for 10000 epochs.
Hello, I spent a really long time trying to figure this out. I believe this issue stems from improper preprocessing of the counts, and/or from something later in your workflow that alters the preprocessed counts. Either way, the issue is the counts (i.e., adata.X). One way to test or fix it:
1. Load up your original file (start from scratch).
2. Follow the steps from Scanpy or your own workflow. Just do your basic filtering, and don’t do anything that affects the raw counts.
3. When you get to the ‘Normalization’ step, run:
# Saving count data
adata.layers["counts"] = adata.X.copy()
# Normalizing to median total counts
sc.pp.normalize_total(adata)
# Logarithmize the data
sc.pp.log1p(adata)
This saves your original adata.X counts in adata.layers["counts"], then normalizes and log-transforms adata.X (which is what you want if you are not using Scran).
4. Determine the highly variable genes.
5. Run Spectra (assuming you have everything else). You may want to transfer over your cell types.
Overall, the counts in adata.X are likely messed up and need to be properly normalized and log-transformed. You may need to decide how you want to handle that in your own workflow, but that seems to be what is causing this issue.
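A quick way to test whether adata.X is in the expected state is to summarize its values before running Spectra. The sketch below is only an illustration: `summarize_matrix` is a hypothetical helper (not part of Scanpy or Spectra), and it assumes the matrix is either a NumPy array or a scipy sparse matrix.

```python
import numpy as np


def summarize_matrix(X):
    """Return simple diagnostics for a dense array or a scipy sparse matrix."""
    # scipy sparse matrices keep their stored entries in `.data`; implicit zeros
    # cannot be NaN, infinite, negative, or fractional, so the stored entries suffice.
    values = X.data if hasattr(X, "tocsr") else X
    values = np.asarray(values, dtype=float).ravel()
    finite = values[np.isfinite(values)]
    return {
        "n_nan": int(np.isnan(values).sum()),
        "n_inf": int(np.isinf(values).sum()),
        "n_negative": int((finite < 0).sum()),
        # Largest finite stored value (NaN if there are none)
        "max": float(finite.max()) if finite.size else float("nan"),
        # True when every finite value is a whole number, as raw counts should be
        "integer_valued": bool(np.all(finite == np.round(finite))),
    }


# Hypothetical usage on an AnnData object prepared as described above:
# print(summarize_matrix(adata.layers["counts"]))  # expect integer_valued=True
# print(summarize_matrix(adata.X))                 # expect integer_valued=False, no NaN/inf/negatives
```

As a rough guide, a "counts" layer that is not integer-valued, or an adata.X that contains NaN, negative values (typical after scaling), or a maximum in the thousands (typical when the log transform was never applied), would suggest that the preprocessing went off track somewhere.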
I would suggest borrowing and modifying the code from scBestPractices, where they plot the original counts alongside the ‘Shifted logarithm’, just to check that everything is as it should be.
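If you just want numbers rather than plots, the comparison can be approximated with NumPy alone. This is not the scBestPractices code: `shifted_log` and `compare_totals` are hypothetical helpers, and `shifted_log` only mimics, as far as I understand the defaults, what sc.pp.normalize_total followed by sc.pp.log1p does on a dense cells x genes array.

```python
import numpy as np


def shifted_log(counts):
    """Scale each cell to the median total count, then apply log1p."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1)
    # Median is taken over non-empty cells; empty cells are left untouched (assumption)
    target = np.median(totals[totals > 0])
    size_factors = np.where(totals > 0, totals / target, 1.0)
    return np.log1p(counts / size_factors[:, None])


def compare_totals(counts, transformed):
    """Per-cell totals before and after the transform, as (min, median, max)."""
    def summary(matrix):
        totals = np.asarray(matrix, dtype=float).sum(axis=1)
        return float(totals.min()), float(np.median(totals)), float(totals.max())

    return {"raw": summary(counts), "transformed": summary(transformed)}


# Toy example with three cells of very different sequencing depth
toy_counts = np.array([[10, 0, 10], [1, 1, 0], [50, 25, 25]])
toy_log = shifted_log(toy_counts)
print(compare_totals(toy_counts, toy_log))
```

After the transform the per-cell totals should sit in a much narrower range than the raw totals, and the result should contain no NaN or infinite values. The same per-cell totals can be passed to a histogram function if you prefer the visual check.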