pytorch-lightning
make test failing
Bug description
Hi Team,
`make test` is failing and throwing the following errors on multiple tests:
Successfully installed pytorch-lightning-2.2.0rc0 torch-2.1.2
# run tests with coverage
python -m coverage run --source src/lightning/pytorch -m pytest src/lightning/pytorch tests/tests_pytorch -v
/Users/abhinav.singh/anaconda3/envs/lightning/lib/python3.10/site-packages/_pytest/config/__init__.py:328: A plugin raised an exception during an old-style hookwrapper teardown.
Plugin: helpconfig, Hook: pytest_cmdline_parse
ConftestImportFailure: ValueError: Could not find the operator torchvision::nms. Please make sure you have already registered the operator and (if registered from C++) loaded it via torch.ops.load_library. (from /Users/abhinav.singh/Documents/pytorch-lightning/tests/tests_pytorch/conftest.py)
For more information see https://pluggy.readthedocs.io/en/stable/api_reference.html#pluggy.PluggyTeardownRaisedWarning
ImportError while loading conftest '/Users/abhinav.singh/Documents/pytorch-lightning/tests/tests_pytorch/conftest.py'.
tests/tests_pytorch/conftest.py:24: in <module>
import lightning.fabric
src/lightning/__init__.py:20: in <module>
from lightning.pytorch.callbacks import Callback # noqa: E402
src/lightning/pytorch/__init__.py:27: in <module>
from lightning.pytorch.callbacks import Callback # noqa: E402
src/lightning/pytorch/callbacks/__init__.py:14: in <module>
from lightning.pytorch.callbacks.batch_size_finder import BatchSizeFinder
src/lightning/pytorch/callbacks/batch_size_finder.py:26: in <module>
from lightning.pytorch.callbacks.callback import Callback
src/lightning/pytorch/callbacks/callback.py:22: in <module>
from lightning.pytorch.utilities.types import STEP_OUTPUT
src/lightning/pytorch/utilities/types.py:40: in <module>
from torchmetrics import Metric
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchmetrics/__init__.py:22: in <module>
from torchmetrics import functional # noqa: E402
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchmetrics/functional/__init__.py:14: in <module>
from torchmetrics.functional.audio._deprecated import _permutation_invariant_training as permutation_invariant_training
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchmetrics/functional/audio/__init__.py:14: in <module>
from torchmetrics.functional.audio.pit import permutation_invariant_training, pit_permutate
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchmetrics/functional/audio/pit.py:22: in <module>
from torchmetrics.utilities import rank_zero_warn
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchmetrics/utilities/__init__.py:14: in <module>
from torchmetrics.utilities.checks import check_forward_full_state_property
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchmetrics/utilities/checks.py:25: in <module>
from torchmetrics.metric import Metric
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchmetrics/metric.py:30: in <module>
from torchmetrics.utilities.data import (
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchmetrics/utilities/data.py:22: in <module>
from torchmetrics.utilities.imports import _TORCH_GREATER_EQUAL_1_12, _XLA_AVAILABLE
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchmetrics/utilities/imports.py:45: in <module>
_TORCHVISION_GREATER_EQUAL_0_8: Optional[bool] = compare_version("torchvision", operator.ge, "0.8.0")
../../anaconda3/envs/lightning/lib/python3.10/site-packages/lightning_utilities/core/imports.py:73: in compare_version
pkg = importlib.import_module(package)
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchvision/__init__.py:6: in <module>
from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torchvision/_meta_registrations.py:164: in <module>
def meta_nms(dets, scores, iou_threshold):
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torch/_custom_ops.py:253: in inner
custom_op = _find_custom_op(qualname, also_check_torch_library=True)
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torch/_custom_op/impl.py:1076: in _find_custom_op
overload = get_op(qualname)
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torch/_custom_op/impl.py:1062: in get_op
error_not_found()
../../anaconda3/envs/lightning/lib/python3.10/site-packages/torch/_custom_op/impl.py:1052: in error_not_found
raise ValueError(
E ValueError: Could not find the operator torchvision::nms. Please make sure you have already registered the operator and (if registered from C++) loaded it via torch.ops.load_library.
The following is my system/environment configuration:
- Python version [3.10]
- OS [macOS Ventura 13.2.1]
- hardware [Mac M2 Pro, Apple silicon]
I am not sure whether I am doing something wrong.
What version are you seeing the problem on?
master
How to reproduce the bug
test
Error messages and logs
FAILED tests/tests_pytorch/test_cli.py::test_lightning_cli_config_with_subcommand - ModuleNotFoundError: DistributionNotFound: The 'jsonargparse[signatures]>=4.26.1' distribution was not found and is required by the application. HINT:...
FAILED tests/tests_pytorch/checkpointing/test_legacy_checkpoints.py::test_legacy_ckpt_threading[1.2.10] - AssertionError: No checkpoints found in folder "/Users/abhinav.singh/Documents/pytorch-lightning/tests/legacy/checkpoints/1.2.10"
FAILED tests/tests_pytorch/loops/test_training_loop.py::test_fit_loop_done_log_messages - AssertionError: assert 'should_stop` was set' in ''
FAILED tests/tests_pytorch/loops/test_training_loop.py::test_should_stop_early_stopping_conditions_met[4-10-4-True-True-True] - AssertionError: assert ('`Trainer.fit` stopped: `trainer.should_stop` was set.' in 'INFO pytorch_lightning.utilities.rank_zero:rank_zero.py:53 GPU...
FAILED tests/tests_pytorch/models/test_restore.py::test_load_model_from_checkpoint[ValTestLossBoringModel] - RuntimeError: Expected all tensors to be on the same device, but found at least two devices, mps:0 and cpu!
FAILED tests/tests_pytorch/models/test_hparams.py::test_hparams_save_yaml - NameError: name 'DictConfig' is not defined
and this is env/system config.
float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
FAILED tests/tests_pytorch/plugins/precision/test_double.py::test_double_precision[DoublePrecisionBoringModelNoForward] - TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
FAILED tests/tests_pytorch/plugins/precision/test_double.py::test_double_precision[DoublePrecisionBoringModelComplexBuffer] - TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
FAILED tests/tests_pytorch/serve/test_servable_module_validator.py::test_servable_module_validator_with_trainer - ValueError: You set `strategy=ddp_spawn` but strategies from the DDP family are not supported on the MPS accelerator. Either explicitly set `accelerat...
FAILED tests/tests_pytorch/strategies/launchers/test_multiprocessing.py::test_fit_twice_raises - ValueError: You set `strategy=ddp_spawn` but strategies from the DDP family are not supported on the MPS accelerator. Either explicitly set `accelerat...
FAILED tests/tests_pytorch/trainer/flags/test_env_vars.py::test_passing_env_variables_devices - lightning.fabric.utilities.exceptions.MisconfigurationException: You requested gpu: [0, 1]
FAILED tests/tests_pytorch/utilities/migration/test_utils.py::test_patch_legacy_imports_unified[local] - AssertionError: Should not import standalone package, all imports should be redirected to the unified package;
Environment
Current environment
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 2.0):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):
More info
No response
@asingh9530 Importing torchmetrics, and subsequently torchvision, fails. Can you maybe reinstall these packages and make sure you can import them?
The next torchmetrics release should avoid this problem thanks to https://github.com/Lightning-AI/torchmetrics/pull/2316
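One way to run the import check suggested above is a small script that tries each package in turn and reports the exception type if the import fails. This is only a sketch: the helper name `try_import` is hypothetical, and the package list is taken from the traceback in the report. It catches `Exception` rather than only `ImportError` because the failure in the report surfaces as a `ValueError`.

```python
import importlib


def try_import(name):
    """Try to import `name`; return (True, version) or (False, error summary)."""
    try:
        module = importlib.import_module(name)
    except Exception as exc:  # the reported failure is a ValueError, not an ImportError
        return False, f"{type(exc).__name__}: {exc}"
    # Not every module exposes __version__, so fall back to a placeholder.
    return True, getattr(module, "__version__", "unknown")


# Packages that appear in the traceback of the report.
for pkg in ("torch", "torchvision", "torchmetrics"):
    ok, detail = try_import(pkg)
    print(f"{pkg}: {'OK' if ok else 'FAILED'} ({detail})")
```

Running this inside the same conda environment that `make test` uses shows which of the three imports fails first, independently of the Lightning test suite.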
@awaelchli It is still failing even after manual installation.
@carmocca It's still failing after pulling the latest changes.
So you are saying that the problem persists with torchmetrics manually installed from master? What if you uninstall torchvision, does the issue go away?
@awaelchli In both cases, it still persists.
But how is that possible? You must be mixing something up there. If you actually fully uninstall torchvision, then the above code path from the error you posted would not even trigger. If you look at the error closely, you see that the error appears inside the torchvision library on import.
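To confirm whether torchvision is really gone from the active environment, one option is to ask the import system where it would load the package from, without actually executing it. The sketch below assumes the helper name `installed_info` (hypothetical) and uses only the standard library. For a top-level package, `importlib.util.find_spec` locates the package without running its `__init__.py`, so it does not trigger the `torchvision::nms` error itself.

```python
import importlib.util
from importlib import metadata


def installed_info(name):
    """Return location and version of `name` if it is importable, else None."""
    spec = importlib.util.find_spec(name)  # locates the package without importing it
    if spec is None:
        return None
    try:
        # Assumes the distribution name matches the import name (true for torchvision).
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = "unknown"
    return {"location": spec.origin, "version": version}


for pkg in ("torch", "torchvision", "torchmetrics"):
    print(pkg, "->", installed_info(pkg))
```

If this still prints a location for torchvision after uninstalling it, a second copy is being picked up from somewhere else on the path. The printed torch and torchvision versions can also be compared, since this kind of missing-operator error often points to a mismatched pair of builds; that is a common cause rather than something confirmed in this thread.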