pytorch-lightning
dirpath isn't updated when logger changes dir after first run
Bug description
I'm using the great library https://github.com/SkafteNicki/pl_crossvalidate to run cross-validation in my project. The library overrides some of the trainer's internal behavior, including the logs directory.
The checkpoint path is resolved once during the first fold. After that the resolution is short-circuited, so the path never resolves to the new fold's directory.
I suggest moving some of the initialization into the setup method.
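The behavior described above can be sketched with a toy model. This is not Lightning's or pl_crossvalidate's actual code: `ToyLogger`, `CachingCheckpoint`, `PerRunCheckpoint`, and `run_folds` are hypothetical names used only to illustrate how caching a resolved directory on the callback pins it to the first fold, and how re-resolving it on every setup call would follow the logger instead.

```python
from typing import List, Optional


class ToyLogger:
    """Hypothetical stand-in for a logger whose log_dir changes per fold."""

    def __init__(self, log_dir: str) -> None:
        self.log_dir = log_dir


class CachingCheckpoint:
    """Toy callback that resolves dirpath once and then reuses it."""

    def __init__(self, dirpath: Optional[str] = None) -> None:
        self.dirpath = dirpath

    def resolve_dir(self, logger: ToyLogger) -> str:
        if self.dirpath is not None:
            # Short circuit: once dirpath is set (by the user or by an
            # earlier run), the logger is never consulted again.
            return self.dirpath
        return f"{logger.log_dir}/checkpoints"

    def setup(self, logger: ToyLogger) -> None:
        # The resolved value is written back, so the next call short-circuits.
        self.dirpath = self.resolve_dir(logger)


class PerRunCheckpoint(CachingCheckpoint):
    """Toy variant that re-resolves dirpath on every setup call."""

    def __init__(self, dirpath: Optional[str] = None) -> None:
        super().__init__(dirpath)
        # Remember what the user passed (possibly None).
        self._user_dirpath = dirpath

    def setup(self, logger: ToyLogger) -> None:
        # Restore the user's value so an unset dirpath is derived again
        # from the logger's current directory.
        self.dirpath = self._user_dirpath
        super().setup(logger)


def run_folds(ckpt: CachingCheckpoint, num_folds: int) -> List[str]:
    """Simulate one setup call per fold and record the resolved dirpath."""
    logger = ToyLogger("logs/fold_0")
    seen: List[str] = []
    for fold in range(num_folds):
        logger.log_dir = f"logs/fold_{fold}"
        ckpt.setup(logger)
        seen.append(ckpt.dirpath)
    return seen


cached = run_folds(CachingCheckpoint(), 3)
fresh = run_folds(PerRunCheckpoint(), 3)
explicit = run_folds(PerRunCheckpoint("my/ckpts"), 3)
print(cached)
print(fresh)
print(explicit)
```

In this sketch the caching variant reports the fold 0 directory for all three folds, while the per-run variant follows the logger. A directory passed explicitly by the user is still respected in both cases.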
What version are you seeing the problem on?
master
How to reproduce the bug
trainer = KFoldTrainer(
    num_folds=training_args['folds'],
    max_epochs=training_args['epochs'],
    accelerator="gpu",
    callbacks=[
        ModelCheckpoint(
            monitor=training_args['monitor_metric'],
            save_top_k=1,
            mode=training_args['metric_mode'],
            verbose=True
        )
    ]
)
Error messages and logs
No response
Environment
Current environment
- CUDA:
  - GPU:
    - NVIDIA GeForce RTX 3050 Laptop GPU
  - available: True
  - version: 11.8
- Lightning:
  - lightning: 2.3.1
  - lightning-cloud: 0.5.37
  - lightning-utilities: 0.11.3.post0
  - pytorch-lightning: 2.0.4
  - torch: 2.0.0
  - torch-cluster: 1.6.1
  - torch-geometric: 2.3.0
  - torch-scatter: 2.1.1
  - torchaudio: 2.0.0
  - torchmetrics: 1.4.0.post0
  - torchvision: 0.15.2a0
- Packages: - absl-py: 1.4.0 - addict: 2.4.0 - aiofiles: 22.1.0 - aiohttp: 3.8.5 - aiosignal: 1.3.1 - aiosqlite: 0.18.0 - albumentations: 1.3.1 - alphashape: 1.3.1 - anyio: 3.5.0 - appdirs: 1.4.4 - argon2-cffi: 21.3.0 - argon2-cffi-bindings: 21.2.0 - arrow: 1.2.3 - asttokens: 2.2.1 - async-timeout: 4.0.2 - attrs: 23.1.0 - azure-core: 1.29.4 - azure-identity: 1.14.0 - azure-storage-blob: 12.18.2 - babel: 2.11.0 - backcall: 0.2.0 - beautifulsoup4: 4.12.2 - binaryornot: 0.4.4 - bleach: 4.1.0 - blessed: 1.19.1 - blinker: 1.6.2 - bottleneck: 1.3.5 - branca: 0.6.0 - brotlipy: 0.7.0 - build: 0.10.0 - cachecontrol: 0.13.1 - cached-property: 1.5.2 - cachetools: 5.3.1 - certifi: 2024.6.2 - cffi: 1.16.0 - chardet: 5.2.0 - charset-normalizer: 3.3.0 - chex: 0.1.83 - cleo: 2.0.1 - click: 8.1.3 - click-log: 0.4.0 - click-plugins: 1.1.1 - cligj: 0.7.2 - colorama: 0.4.6 - coloredlogs: 15.0.1 - comm: 0.1.2 - configargparse: 1.5.3 - contourpy: 1.0.7 - cookiecutter: 2.4.0 - crashtest: 0.4.1 - croniter: 1.3.15 - cryptography: 41.0.4 - cycler: 0.11.0 - cython: 0.29.37 - daal: 2024.5.0 - daal4py: 2024.5.0 - dash: 2.9.1 - dash-core-components: 2.0.0 - dash-html-components: 2.0.0 - dash-table: 5.0.0 - dataclasses: 0.8 - dateutils: 0.6.12 - debugpy: 1.6.6 - decorator: 5.1.1 - deepdiff: 6.3.1 - defusedxml: 0.7.1 - dill: 0.3.6 - distlib: 0.3.7 - dm-tree: 0.1.8 - docopt: 0.6.2 - docutils: 0.20.1 - dulwich: 0.21.6 - easydict: 1.10 - elastic-transport: 8.4.1 - elasticsearch: 8.10.0 - entrypoints: 0.4 - exceptiongroup: 1.0.4 - executing: 1.2.0 - fastapi: 0.100.0 - fastjsonschema: 2.16.3 - filelock: 3.12.4 - fiona: 1.8.22 - flask: 2.2.3 - flatbuffers: 23.5.26 - flax: 0.6.1 - folium: 0.14.0 - fonttools: 4.39.2 - freetype-py: 2.4.0 - frozenlist: 1.4.0 - fsspec: 2023.6.0 - future: 1.0.0 - gdal: 3.5.3 - geomloss: 0.2.6 - geopandas: 0.12.2 - gmpy2: 2.1.2 - google-auth: 2.22.0 - google-auth-oauthlib: 1.0.0 - gpustat: 1.0.0 - grpcio: 1.54.2 - h11: 0.14.0 - h5py: 3.8.0 - hdbscan: 0.8.37 - html5lib: 1.1 - 
humanfriendly: 10.0 - idna: 3.4 - imageio: 2.31.1 - importlib-metadata: 6.8.0 - importlib-resources: 5.12.0 - iniconfig: 1.1.1 - inquirer: 3.1.3 - insightface: 0.7.3 - installer: 0.7.0 - ipykernel: 6.22.0 - ipython: 8.11.0 - ipython-genutils: 0.2.0 - ipywidgets: 8.0.4 - isodate: 0.6.1 - itsdangerous: 2.1.2 - jaraco.classes: 3.3.0 - jax: 0.4.13 - jaxlib: 0.4.12 - jedi: 0.18.2 - jeepney: 0.8.0 - jinja2: 3.1.2 - joblib: 1.2.0 - json5: 0.9.6 - jsonpatch: 1.32 - jsonpointer: 2.1 - jsons: 1.6.3 - jsonschema: 4.17.3 - jupyter-client: 8.1.0 - jupyter-core: 5.3.0 - jupyter-events: 0.6.3 - jupyter-server: 2.5.0 - jupyter-server-fileid: 0.9.0 - jupyter-server-terminals: 0.4.4 - jupyter-server-ydoc: 0.8.0 - jupyter-ydoc: 0.2.4 - jupyterlab: 3.6.3 - jupyterlab-pygments: 0.1.2 - jupyterlab-server: 2.22.0 - jupyterlab-widgets: 3.0.5 - keyring: 24.2.0 - kivy: 2.2.1 - kiwisolver: 1.4.4 - kneed: 0.8.2 - lazy-loader: 0.3 - lightning: 2.3.1 - lightning-cloud: 0.5.37 - lightning-utilities: 0.11.3.post0 - llvmlite: 0.40.1 - lockfile: 0.12.2 - lxml: 4.9.1 - mamba-gator: 5.2.0 - mapclassify: 2.5.0 - markdown: 3.4.4 - markdown-it-py: 2.2.0 - markupsafe: 2.1.2 - mat73: 0.60 - matplotlib: 3.8.4 - matplotlib-inline: 0.1.6 - mdurl: 0.1.0 - mistune: 0.8.4 - mkl-fft: 1.3.6 - mkl-random: 1.2.2 - mkl-service: 2.4.0 - ml-dtypes: 0.4.0 - more-itertools: 10.1.0 - mpi4py: 3.1.4 - mpmath: 1.3.0 - msal: 1.24.1 - msal-extensions: 1.0.0 - msgpack: 1.0.7 - multidict: 6.0.4 - munch: 2.5.0 - munkres: 1.1.4 - nbclassic: 0.5.5 - nbclient: 0.5.13 - nbconvert: 6.5.4 - nbformat: 5.7.0 - nest-asyncio: 1.5.6 - networkx: 3.1 - notebook: 6.5.4 - notebook-shim: 0.2.2 - numba: 0.57.1 - numexpr: 2.8.4 - numpy: 1.24.3 - nvidia-ml-py: 11.495.46 - oauthlib: 3.2.2 - onnx: 1.14.1 - onnxruntime-gpu: 1.16.0 - open3d: 0.17.0 - opencv-python-headless: 4.7.0.72 - opt-einsum: 3.3.0 - optax: 0.2.2 - ordered-set: 4.1.0 - orjson: 3.9.2 - packaging: 23.2 - pandas: 1.5.3 - pandocfilters: 1.5.0 - parso: 0.8.3 - patsy: 0.5.3 - pexpect: 
4.8.0 - pickleshare: 0.7.5 - pillow: 9.4.0 - pip: 23.2.1 - pipreqs: 0.4.11 - pkginfo: 1.9.6 - pl-crossvalidate: 0.1.0 - platformdirs: 3.11.0 - plotly: 5.13.1 - pluggy: 1.0.0 - poetry: 1.6.1 - poetry-core: 1.7.0 - poetry-plugin-export: 1.5.0 - pooch: 1.4.0 - portalocker: 2.8.2 - pot: 0.9.0 - pretty-errors: 1.2.25 - prettytable: 3.9.0 - prometheus-client: 0.14.1 - prompt-toolkit: 3.0.38 - protobuf: 4.21.12 - psutil: 5.9.4 - ptyprocess: 0.7.0 - pure-eval: 0.2.2 - py: 1.11.0 - pyasn1: 0.4.8 - pyasn1-modules: 0.2.8 - pybind11: 2.11.1 - pycparser: 2.21 - pydantic: 1.10.10 - pydiffmap: 0.2.0.1 - pyglet: 1.5.27 - pygments: 2.14.0 - pygsp: 0.5.1 - pyjwt: 2.7.0 - pynndescent: 0.5.10 - pyopengl: 3.1.6 - pyopenssl: 23.2.0 - pyparsing: 3.0.9 - pyproj: 3.5.0 - pyproject-hooks: 1.0.0 - pyquaternion: 0.9.9 - pyrender: 0.1.45 - pyrsistent: 0.19.3 - pysocks: 1.7.1 - pytest: 7.4.0 - python-dateutil: 2.8.2 - python-editor: 1.0.4 - python-json-logger: 2.0.7 - python-multipart: 0.0.6 - python-slugify: 8.0.1 - pytorch-lightning: 2.0.4 - pytz: 2022.7.1 - pyu2f: 0.1.5 - pyyaml: 6.0 - pyzmq: 25.0.2 - qudida: 0.0.4 - rapidfuzz: 2.15.2 - readchar: 4.0.5.dev0 - requests: 2.31.0 - requests-oauthlib: 1.3.1 - requests-toolbelt: 1.0.0 - rfc3339-validator: 0.1.4 - rfc3986-validator: 0.1.1 - rich: 13.3.5 - rsa: 4.9 - rtree: 1.0.1 - scienceplots: 2.1.1 - scikit-image: 0.22.0 - scikit-learn: 1.3.0 - scikit-learn-intelex: 20230131.200013 - scipy: 1.10.1 - seaborn: 0.13.2 - secretstorage: 3.3.3 - send2trash: 1.8.0 - setuptools: 68.0.0 - shapely: 2.0.1 - shellingham: 1.5.3 - six: 1.16.0 - sniffio: 1.2.0 - soupsieve: 2.4 - stack-data: 0.6.2 - starlette: 0.27.0 - starsessions: 1.3.0 - statsmodels: 0.14.0 - sympy: 1.11.1 - tabulate: 0.9.0 - tbb: 2021.13.0 - tenacity: 8.2.2 - tensorboard: 2.13.0 - tensorboard-data-server: 0.7.0 - terminado: 0.17.1 - text-unidecode: 1.3 - threadpoolctl: 3.1.0 - tifffile: 2023.9.26 - tinycss2: 1.2.1 - tomli: 2.0.1 - tomlkit: 0.12.1 - toolz: 0.12.1 - torch: 2.0.0 - 
torch-cluster: 1.6.1 - torch-geometric: 2.3.0 - torch-scatter: 2.1.1 - torchaudio: 2.0.0 - torchmetrics: 1.4.0.post0 - torchvision: 0.15.2a0 - tornado: 6.2 - tqdm: 4.66.4 - traitlets: 5.9.0 - transforms3d: 0.4.1 - trimesh: 3.21.5 - triton: 2.0.0 - trove-classifiers: 2023.10.17 - typing-extensions: 4.5.0 - typish: 1.9.3 - umap-learn: 0.5.3 - urllib3: 2.0.7 - uvicorn: 0.22.0 - virtualenv: 20.24.5 - visdom: 0.2.4 - wcwidth: 0.2.6 - webencodings: 0.5.1 - websocket-client: 0.58.0 - websockets: 11.0.3 - werkzeug: 2.2.3 - wheel: 0.38.4 - widgetsnbextension: 4.0.5 - xyzservices: 2023.2.0 - y-py: 0.5.9 - yarg: 0.1.9 - yarl: 1.9.2 - ypy-websocket: 0.8.2 - zipp: 3.17.0
- System:
  - OS: Linux
  - architecture:
    - 64bit
    - ELF
  - processor: x86_64
  - python: 3.9.16
  - release: 5.15.153.1-microsoft-standard-WSL2
  - version: #1 SMP Fri Mar 29 23:14:13 UTC 2024
More info
Expected behavior
When a new fold is created, the checkpoint path should change to the new fold's directory.
Current behavior
The checkpoint path is resolved once during the first fold. After that the resolution is short-circuited, so the path never resolves to the new fold's directory.