ValueError with NaN tensors in gimVI training

Open sunericd opened this issue 1 year ago • 12 comments

When training the gimVI model, I run into a ValueError in the first epoch for some datasets but not others. In all cases, the inputs are AnnData objects with raw counts for both the RNAseq and the spatial data (dtype=float64). I have tried filtering out cells with zero counts and/or normalizing the RNAseq data, but I still get the same error. Strangely, gimVI trains successfully on one dataset, but when I remove 4 genes from its spatial data (32 -> 28 genes), training fails. Happy to share the data if that would help and if there are suggestions for how to do so (screenshots of basic data info below).

    import scvi
    from scvi.external import GIMVI
    
    # preprocessing of data
    spatial_adata = spatial_adata[:, spatial_adata.var_names.isin(RNAseq_adata.var_names)]
    
    # boolean masks keeping only cells whose total counts exceed 1
    filtered_cells_spatial = (spatial_adata.X.sum(axis=1) > 1)
    filtered_cells_RNAseq = (RNAseq_adata.X.sum(axis=1) > 1)
    
    # make copies of subsets
    spatial_adata = spatial_adata[filtered_cells_spatial,:].copy()
    RNAseq_adata = RNAseq_adata[filtered_cells_RNAseq,:].copy()
    
    # setup anndata for scvi
    GIMVI.setup_anndata(spatial_adata)
    GIMVI.setup_anndata(RNAseq_adata)
        
    # train gimVI model
    model = GIMVI(RNAseq_adata, spatial_adata, **kwargs)
    model.train(200)

    ValueError: Expected parameter loc (Tensor of shape (128, 10)) of distribution Normal(loc: torch.Size([128, 10]), scale: torch.Size([128, 10])) to satisfy the constraint Real(), but found invalid values:
    tensor([[nan, nan, nan,  ..., nan, nan, nan],
            [nan, nan, nan,  ..., nan, nan, nan],
            [nan, nan, nan,  ..., nan, nan, nan],
            ...,
            [nan, nan, nan,  ..., nan, nan, nan],
            [nan, nan, nan,  ..., nan, nan, nan],
            [nan, nan, nan,  ..., nan, nan, nan]], device='cuda:0')
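
For reference, here is a minimal sketch of an input check that could be run on each count matrix before training. `summarize_counts` is a hypothetical helper, not part of scvi-tools. It uses only NumPy and assumes the matrix is either dense or exposes `.toarray()`, as SciPy sparse matrices do. It does not identify the cause of the NaNs; it only rules out a few input problems (non-finite or negative values, non-integer counts, all-zero cells or genes).

```python
import numpy as np


def summarize_counts(X):
    """Return basic sanity statistics for a cells x genes count matrix.

    Hypothetical helper for illustration; not an scvi-tools function.
    """
    # Accept sparse matrices that expose .toarray(), or anything array-like.
    dense = X.toarray() if hasattr(X, "toarray") else np.asarray(X)
    dense = dense.astype(np.float64)

    # Replace non-finite entries with zero so the row/column sums stay defined.
    safe = np.where(np.isfinite(dense), dense, 0.0)

    return {
        "n_nan": int(np.isnan(dense).sum()),
        "n_inf": int(np.isinf(dense).sum()),
        "n_negative": int((safe < 0).sum()),
        "n_non_integer": int((safe != np.round(safe)).sum()),
        "n_zero_cells": int((safe.sum(axis=1) == 0).sum()),
        "n_zero_genes": int((safe.sum(axis=0) == 0).sum()),
    }


# Toy example: 3 cells x 2 genes; the last cell has no counts at all.
toy = np.array([[3.0, 0.0], [1.0, 2.0], [0.0, 0.0]])
report = summarize_counts(toy)
print(report)
```

With the objects from the snippet above, this would be called as `summarize_counts(spatial_adata.X)` and `summarize_counts(RNAseq_adata.X)` after the gene subsetting and cell filtering, for both the 32-gene and the 28-gene versions of the spatial data.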

Versions:

0.20.3

[Screenshot: adata32]

[Screenshot: adata28]

sunericd • Jul 28 '23 07:07