Support `str(datamodule)`

Open carmocca opened this issue 4 years ago • 11 comments

🚀 Feature

Add support for:

print(str(MyDataModule()))

Motivation

It currently prints:

<__main__.MyDataModule object at 0x10284c970>

Pitch

It could print the DataLoader structure:

MyDataModule(
    train_dataloader: {"a": DataLoaderClass(batch_size=8, num_batches=16, num_workers=2), "b": DataLoaderClass(batch_size=2, num_batches=16, num_workers=2)}
    val_dataloader: [DataLoaderClass(batch_size=3, num_batches=14, num_workers=0), DataLoaderClass(batch_size=8, num_batches=4, num_workers=0)]
    test_dataloader: DataLoaderClass(batch_size=4, num_batches=7, num_workers=2)
)

Or the number of batches per dataloader, similar to what was done in https://github.com/PyTorchLightning/pytorch-lightning/issues/5965
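
A minimal sketch of what such a `__str__` could look like. It does not import Lightning: `FakeLoader` and this `MyDataModule` are hypothetical stand-ins, and the hook names are the only thing taken from the pitch above. Note that it calls the dataloader hooks, which in a real DataModule may require the data to be set up first.

```python
class FakeLoader:
    """Hypothetical stand-in for a DataLoader (not the real torch class)."""

    def __init__(self, batch_size, num_batches, num_workers=0):
        self.batch_size = batch_size
        self.num_batches = num_batches
        self.num_workers = num_workers

    def __repr__(self):
        return (
            f"{type(self).__name__}(batch_size={self.batch_size}, "
            f"num_batches={self.num_batches}, num_workers={self.num_workers})"
        )


class MyDataModule:
    """Stand-in for a DataModule; only the hooks used below are mimicked."""

    def train_dataloader(self):
        return FakeLoader(batch_size=8, num_batches=16, num_workers=2)

    def test_dataloader(self):
        return FakeLoader(batch_size=4, num_batches=7, num_workers=2)

    def __str__(self):
        lines = [f"{type(self).__name__}("]
        for hook in ("train_dataloader", "val_dataloader", "test_dataloader"):
            method = getattr(self, hook, None)
            if method is None:  # this hook is not defined on the module
                continue
            lines.append(f"    {hook}: {method()!r}")
        lines.append(")")
        return "\n".join(lines)


print(str(MyDataModule()))
```

Hooks that are not defined (here `val_dataloader`) are simply left out of the printed structure.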

Alternatives

Open to other ideas


carmocca avatar Oct 15 '21 12:10 carmocca

cc @kingyiusuen

carmocca avatar Oct 15 '21 12:10 carmocca

I could take care of this 👍

Abelarm avatar Oct 15 '21 15:10 Abelarm

cc @kingyiusuen

I am happy to let @Abelarm take it :)

kingyiusuen avatar Oct 15 '21 15:10 kingyiusuen

Hi guys, I am currently torn between two outputs:

[Screenshot 2021-10-16 at 20 01 05]

and

[Screenshot 2021-10-16 at 20 20 09] *

* which is not consistent across the prints.

The problem is the str() of the dict :(

Do you have any ideas, or is one of the two solutions good enough?

Abelarm avatar Oct 16 '21 18:10 Abelarm

If you really want the dict keys wrapped in "" I can do it, but it won't be the nicest solution.
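
The screenshots are not available here, so the exact outputs are unknown. One likely source of the inconsistency is that `str()` of a dict formats its keys and values with `repr()`, not `str()`. A small sketch, where `Loader` is a hypothetical stand-in class:

```python
class Loader:
    """Hypothetical stand-in; defines __str__ but not __repr__."""

    def __init__(self, batch_size):
        self.batch_size = batch_size

    def __str__(self):
        return f"Loader(batch_size={self.batch_size})"


loaders = {"a": Loader(8)}

# str() of a dict uses repr() for keys and values, so Loader.__str__ is ignored
# and the value shows up as the default "<... object at 0x...>" form.
default_text = str(loaders)

# Building the text by hand keeps double-quoted keys and uses str() for the values.
custom_text = "{" + ", ".join(f'"{k}": {v}' for k, v in loaders.items()) + "}"

print(default_text)
print(custom_text)  # {"a": Loader(batch_size=8)}
```

Defining `__repr__` on the value class would be another way to make both paths print the same thing.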

Abelarm avatar Oct 16 '21 18:10 Abelarm

Hey @Abelarm! You can open a draft PR so we can check your current implementation and discuss it.

carmocca avatar Oct 16 '21 18:10 carmocca

In the spirit of https://docs.python.org/3.4/reference/datamodel.html#object.__repr__:

If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment).

I recommend:

  1. keeping the quotes around dict keys but not dict values
  2. using an = after the name of initialization parameters instead of a :

Following these recommendations, @Abelarm 's test expression would become:

MyDataModule(
    train_dataloader={"a": DataLoaderClass(batch_size=8, num_batches=16, num_workers=2), "b": DataLoaderClass(batch_size=2, num_batches=16, num_workers=2)}
    val_dataloader=[DataLoaderClass(batch_size=3, num_batches=14, num_workers=0), DataLoaderClass(batch_size=8, num_batches=4, num_workers=0)]
    test_dataloader=DataLoaderClass(batch_size=4, num_batches=7, num_workers=2)
)
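
A rough sketch of a formatter that follows the two recommendations. `format_value`, `format_module` and this `DataLoaderClass` are hypothetical names for illustration, and it assumes dict keys are strings. It also adds trailing commas, which the expression above does not have, so that the result parses as a Python call.

```python
import json


class DataLoaderClass:
    """Hypothetical stand-in with a constructor-style repr."""

    def __init__(self, batch_size, num_workers=0):
        self.batch_size = batch_size
        self.num_workers = num_workers

    def __repr__(self):
        return f"DataLoaderClass(batch_size={self.batch_size}, num_workers={self.num_workers})"


def format_value(value):
    """Format nested dicts and lists; quote string keys, leave values unquoted."""
    if isinstance(value, dict):
        # json.dumps gives double-quoted keys (assumes the keys are strings)
        items = ", ".join(f"{json.dumps(k)}: {format_value(v)}" for k, v in value.items())
        return "{" + items + "}"
    if isinstance(value, list):
        return "[" + ", ".join(format_value(v) for v in value) + "]"
    return repr(value)


def format_module(name, **loaders):
    """Render `name(key=value, ...)` with one keyword argument per line."""
    body = "".join(f"    {key}={format_value(val)},\n" for key, val in loaders.items())
    return f"{name}(\n{body})"


text = format_module(
    "MyDataModule",
    train_dataloader={"a": DataLoaderClass(8, 2), "b": DataLoaderClass(2, 2)},
    val_dataloader=[DataLoaderClass(3), DataLoaderClass(8)],
)
print(text)
```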

dmarx avatar Oct 28 '21 07:10 dmarx

Hey @carmocca,

I believe adding support for str() brings the same inconvenience as supporting len().

It might be worth considering a `describe` method on LightningDataModule instead.

Best, T.C

tchaton avatar Nov 01 '21 14:11 tchaton

The main reason for reverting len was its impact on existing truthiness checks. That should not be a problem for str.
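
To illustrate the difference with plain Python objects (`WithLen` and `WithStr` are hypothetical classes, not Lightning code):

```python
class WithLen:
    """Hypothetical module that defines __len__ and currently reports 0."""

    def __len__(self):
        return 0


class WithStr:
    """Hypothetical module that only defines __str__."""

    def __str__(self):
        return "WithStr()"


# Without __bool__, truth testing falls back to __len__, so a length of 0 is falsy.
len_is_truthy = bool(WithLen())  # False
# __str__ plays no part in truth testing, so the object stays truthy.
str_is_truthy = bool(WithStr())  # True

datamodule = WithLen()
if not datamodule:
    print("an `if datamodule:` guard would skip this object")
```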

@ananthsub do you think the rest of the points you raised in https://github.com/PyTorchLightning/pytorch-lightning/issues/5965#issuecomment-948862064 are reason enough to drop this feature? We would still have the problem of initialization.

carmocca avatar Nov 02 '21 13:11 carmocca

In my PR I have already dealt with : versus =, but I am struggling to add "" around the dict keys :(

Abelarm avatar Nov 18 '21 20:11 Abelarm

It seems like this feature is still not implemented. Would it be possible to work on this issue?

MrWhatZitToYaa avatar Sep 19 '24 18:09 MrWhatZitToYaa