scvi-tools icon indicating copy to clipboard operation
scvi-tools copied to clipboard

Splitterdefault

Open mkarikom opened this issue 3 years ago • 1 comments

This allows DataSplitter and SemiSupervisedDataSplitter to pass kwargs to the DataLoader constructor in keeping with the DataSplitter API which states "**kwargs: Keyword args for data loader".

The original implementation didn't completely observe the above design, since passing drop_last as a kwarg to the Splitter caused an error due to the the hard-coded default of drop_last=3 in data_loader_kwargs.

In addition to allowing default values for all data_loader_kwargs to be defined under defaultKwargs, the main benefit of this refactoring is that DataSplitter is now transparent wrt the existing Lightning Data Loader API (like drop_last=True).

Because this refactoring does not affect supervised training, new tests added to tests/models/test_models.py::test_data_splitter run directly on DataSplitter, but not on SemiSupervisedDataSplitter (the implementation is identical)

mkarikom avatar Sep 21 '22 22:09 mkarikom

Codecov Report

Base: 90.46% // Head: 90.46% // Increases project coverage by +0.00% :tada:

Coverage data is based on head (76e5d8d) compared to base (9b8eb8a). Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1706   +/-   ##
=======================================
  Coverage   90.46%   90.46%           
=======================================
  Files         130      130           
  Lines       10337    10339    +2     
=======================================
+ Hits         9351     9353    +2     
  Misses        986      986           
Impacted Files Coverage Δ
scvi/dataloaders/_data_splitting.py 97.02% <100.00%> (+0.03%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov[bot] avatar Sep 21 '22 23:09 codecov[bot]

@adamgayoso I'm happy to add more unit tests [for semi] if required, just lmk and thanks for looking!

mkarikom avatar Oct 06 '22 18:10 mkarikom

Hi @adamgayoso , I'm not sure what is going on with pytest throwing tons of errors. As a sanity check I've created a vanilla cuda dev container which replicates all pytest errors in my environment in the following steps:

  1. clone https://github.com/mkarikom/cuda_devcon
  2. open the cuda_devcon folder in vscode
  3. confirm "open folder in container"
  4. wait for build
  5. python -m pip install pytest && git clone --recurse-submodules [email protected]:scverse/scvi-tools.git && cd scvi-tools && pytest tests/models/test_models.py

Right now my project space is so heterogeneous that I rely heavily on containerized environments.

Do you have a similarly-portable testing environment that I could spin up to see what you are seeing wrt pytest? I tried (but failed) get the python_poetry environment working per the scvi-tools contributor instructions, so I still just use pip (in a docker container)

mkarikom avatar Oct 12 '22 14:10 mkarikom

I hit a run-time bug with the proposed default_data_loader_kwargs.update(kwargs) syntax so reverted to the previous splatting syntax

mkarikom avatar Oct 12 '22 16:10 mkarikom

Hey, @adamgayoso!
I think this is now current and all checks passing!

Sorry about the delayed changes!

A style issues in the pre-commit.ci that I can't see at all with my flake8/pylint plugins. The only other style issue was "leading _" but adding those led to undefined errors in the tests so I've ignored it.

Hope your last few weeks have been as fun as ours ;)

mkarikom avatar Dec 05 '22 23:12 mkarikom