
Add PyKEEN to ecosystem-ci

mberr opened this pull request · 7 comments

Before submitting

  • [ ] Was this discussed/approved via a GitHub issue? (not required for typo or docs fixes)
  • [x] Did you create/update your configuration file?
  • [x] Did you set runtimes in config for GitHub action integration?
  • [x] Did you add your config to CI in Azure pipeline (only projects with 100+ GitHub stars)?
  • [ ] Are all integration tests passing?

What does this PR do? [optional]

Project: https://github.com/pykeen/pykeen

PyTorch Lightning integration via

  • https://github.com/pykeen/pykeen/pull/905
  • https://github.com/pykeen/pykeen/pull/917
  • https://github.com/pykeen/pykeen/pull/930

Did you have fun?

Make sure you had fun coding 🙃

mberr · May 19 '22 13:05

https://dev.azure.com/PytorchLightning/compatibility/_build/results?buildId=72651&view=logs&j=fb683405-d979-52da-6de9-2541dff429a6&t=bdee9137-b6d6-59ea-6392-0d699b7aef3e&l=12676

The errors seem to originate from a tqdm call in Lightning-only code 🤔

torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/spawn.py", line 101, in _wrapping_function
    results = function(*args, **kwargs)
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 809, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1234, in _run
    results = self._run_stage()
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1321, in _run_stage
    return self._run_train()
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1343, in _run_train
    self._run_sanity_check()
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1411, in _run_sanity_check
    val_loop.run()
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 154, in advance
    dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 122, in advance
    self._on_evaluation_batch_start(**kwargs)
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 249, in _on_evaluation_batch_start
    self.trainer._call_callback_hooks(hook_name, *kwargs.values())
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1634, in _call_callback_hooks
    fn(self, self.lightning_module, *args, **kwargs)
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/pytorch_lightning/callbacks/progress/tqdm_progress.py", line 291, in on_validation_batch_start
    self.val_progress_bar.reset(convert_inf(self.total_val_batches_current_dataloader))
  File "/opt/conda/lib/python3.8/site-packages/tqdm/std.py", line 1408, in reset
    self.last_print_t = self.start_t = self._time()
AttributeError: 'Tqdm' object has no attribute '_time'

@aniketmaurya any idea how to fix / investigate this issue?

mberr · May 24 '22 10:05

Seems like TQDM version compatibility issue. Cc: @Borda
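A minimal, stdlib-only sketch of the suspected mismatch (an assumption based on the traceback above, where `reset()` in tqdm's `std.py` evaluates `self._time()`): `OldBar` and `NewBar` below are hypothetical stand-ins for an older and a newer tqdm release, not real tqdm classes.

```python
import time

class OldBar:
    """Hypothetical stand-in for an old tqdm release: never sets `_time`."""
    def __init__(self, total):
        self.total = total

class NewBar(OldBar):
    """Hypothetical stand-in for a recent tqdm: `reset()` relies on `_time`."""
    def __init__(self, total):
        super().__init__(total)
        self._time = time.time

    def reset(self, total=None):
        # Mirrors the failing line from the traceback:
        # `self.last_print_t = self.start_t = self._time()`
        self.last_print_t = self.start_t = self._time()
        if total is not None:
            self.total = total

# Invoking the newer reset() against an object initialised by the old
# class reproduces the crash from the traceback above:
try:
    NewBar.reset(OldBar(total=10), total=5)
except AttributeError as exc:
    print(exc)  # 'OldBar' object has no attribute '_time'
```

This is consistent with the version-mismatch theory: the environment above imports pytorch-lightning from `~/.local` but tqdm from `/opt/conda`, so the two installs need not agree on tqdm's internals.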

aniketmaurya · May 26 '22 00:05

Codecov Report

Merging #50 (1adafc5) into main (21ddc52) will not change coverage. The diff coverage is n/a.

Additional details and impacted files
@@        Coverage Diff         @@
##           main   #50   +/-   ##
==================================
  Coverage    85%   85%           
==================================
  Files         2     2           
  Lines       230   230           
==================================
  Hits        196   196           
  Misses       34    34           

codecov[bot] · May 26 '22 05:05

> Seems like TQDM version compatibility issue. Cc: @Borda

any updates on this?

mberr · Jul 31 '22 19:07

@mberr I am very sorry for the delay, but I'll take a look at it this week as we are rolling out some more updates 🦦

Borda · Jan 01 '23 12:01

> @mberr I am very sorry for the delay, but I'll take a look at it this week as we are rolling out some more updates 🦦

@Borda no problem, great that this is now regaining momentum!

I accepted your proposed changes in 597f7c2...4cd6fae and merged with the current main branch.

mberr · Jan 01 '23 16:01

This error message looks strange to me: https://github.com/Lightning-AI/ecosystem-ci/actions/runs/3878585128/jobs/6627870005#step:9:11

The respective version is available on PyPI: https://pypi.org/project/torch-max-mem/0.0.4/
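One way to double-check that claim from inside the CI environment is to query PyPI's JSON API directly (a sketch, assuming network access to pypi.org; the release listed is 0.0.4 per the link above):

```python
import json
import urllib.request

# Ask PyPI's JSON API which releases of torch-max-mem it knows about.
url = "https://pypi.org/pypi/torch-max-mem/json"
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

# True when the 0.0.4 release is visible on the index.
print("0.0.4" in data["releases"])
```

If this prints `True` but `pip install` still fails, the problem is more likely a resolver conflict or a non-default index in the CI job than a missing release.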

This one seems to come from the main branch: https://github.com/Lightning-AI/ecosystem-ci/actions/runs/3878585128/jobs/6627869826#step:12:1

mberr · Jan 10 '23 18:01