aitextgen Common Imports Fix and Readme Update to fix RuntimeError in trainer.fit()

Running the example code with the current package produces the following errors:

cannot import name 'DeepSpeedPlugin' from 'pytorch_lightning.plugins' - aitextgen.py line 14
cannot import name 'ProgressBarBase' from 'pytorch_lightning.callbacks.progress' - train.py line 13
cannot import name '_TPU_AVAILABLE' from 'pytorch_lightning.utilities' - train.py line 14 - fixed in #202
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. - aitextgen.py line 752
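
For import errors like the ones listed, one option is to try several import locations in order and use the first that exists. The following is only a sketch: `import_first` is a hypothetical helper, not part of aitextgen or pytorch_lightning, and the candidate paths are simply the names mentioned in this thread (the result is `None` if pytorch_lightning is not installed or neither name is found).

```python
import importlib


def import_first(candidates):
    """Return the first attribute that can be imported, or None.

    `candidates` is a list of (module_name, attribute_name) pairs,
    tried in order. This helper is hypothetical, for illustration only.
    """
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            # Module or name missing in this version: try the next one.
            continue
    return None


# Prefer the newer name, fall back to the older one from the error above.
ProgressBarClass = import_first([
    ("pytorch_lightning.callbacks", "ProgressBar"),
    ("pytorch_lightning.callbacks.progress", "ProgressBarBase"),
])
```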

The RuntimeError suggests wrapping the user code in a main function, as hinted here: https://discuss.pytorch.org/t/runtimeerror-an-attempt-has-been-made-to-start-a-new-process-before-the-current-process-has-finished-its-bootstrapping-phase/145462
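
A minimal sketch of that pattern, using only the standard library: `square` is a stand-in for real work, and with aitextgen the `ai.train(...)` call would presumably go inside `main()` instead. Whether this resolves the error in this particular setup is not confirmed here.

```python
import multiprocessing as mp


def square(x):
    # Stand-in for real work done in a worker process.
    return x * x


def main():
    # With the "spawn" start method, each worker is a fresh interpreter
    # that re-imports this module; unguarded top-level code that starts
    # processes would then trigger the bootstrapping error.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, [1, 2, 3])


if __name__ == "__main__":
    # Only the parent process runs this; re-imported workers skip it.
    print(main())
```

The key point is that anything which starts new processes is reachable only through the `if __name__ == "__main__":` guard.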

But I cannot confirm whether this fixes the issue, as the current code does not progress at all (this might also be because ProgressBar is not the correct replacement for ProgressBarBase).

Would love to have your input on whether these changes actually work!

Mar 24 '23 10:03 scorixear

After around 1 hour of training, the program finished correctly, although the progress bar seems to be broken.

Mar 24 '23 11:03 scorixear

Getting this error while executing the example

Mar 25 '23 16:03 vjarora1978

Yes, I get the same error. I will investigate what's up.

Mar 25 '23 16:03 scorixear

It seems that ProgressBarBase contained the "loss" tensor in version 1.8.6, but it was removed from ProgressBar in version 2.0.0 (the latest release of PyTorch Lightning).

I replaced the metrics with the loss value from the outputs. This doesn't affect the training code at all; it's only about the progress bar showing the current and average loss.
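
Roughly, the idea of reading the loss from the step outputs can be sketched as below. This is not the actual code from the commit: `extract_loss` and `LossTracker` are hypothetical names, the exponential smoothing is just one possible way to keep an average, and it assumes the outputs are either a dict with a "loss" entry, a tensor-like object with `.item()`, or a plain number.

```python
def extract_loss(outputs):
    """Return the loss from a training-step output as a float, or None."""
    if isinstance(outputs, dict):
        outputs = outputs.get("loss")
    if outputs is None:
        return None
    if hasattr(outputs, "item"):
        # Tensor-like objects expose .item() to get a Python number.
        return float(outputs.item())
    return float(outputs)


class LossTracker:
    """Keeps the current loss and an exponentially smoothed average."""

    def __init__(self, smoothing=0.99):
        self.smoothing = smoothing  # assumed value, for illustration
        self.current = None
        self.average = None

    def update(self, outputs):
        loss = extract_loss(outputs)
        if loss is None:
            return None
        self.current = loss
        if self.average is None:
            self.average = loss
        else:
            self.average = (self.smoothing * self.average
                            + (1 - self.smoothing) * loss)
        return loss
```

A progress bar callback could call `tracker.update(outputs)` after each batch and display `tracker.current` and `tracker.average`.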

Mar 25 '23 16:03 scorixear

This is a really helpful pull request, thanks a lot! However, I still get an error about the kwarg "gpus" being unknown in pytorch_lightning's argparse.py. "gpus" seems to be passed to the Trainer object in train.py. Could you help?

TypeError                                 Traceback (most recent call last)
<ipython-input-11-341925ca7a1c> in <cell line: 1>()
----> 1 ai.train(file_name,
      2          line_by_line=False,
      3          from_cache=False,
      4          num_steps=3000,
      5          generate_every=300,

1 frames
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/argparse.py in insert_env_defaults(self, *args, **kwargs)
     67 
     68         # all args were already moved to kwargs
---> 69         return fn(self, **kwargs)
     70 
     71     return cast(_T, insert_env_defaults)

TypeError: Trainer.__init__() got an unexpected keyword argument 'gpus'

May 02 '23 14:05 rs-rud

@fictionFanKazuki

Hm, not sure how to reproduce this. I changed the "gpus" argument to "num_nodes" in my latest commit. Maybe you aren't using the latest one?

Otherwise, there is probably a newer version of pytorch_lightning with more breaking changes. But I would need to know which version you have installed and/or the full stack trace, as I cannot decipher where the utilities function was called from.
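
A quick way to report the installed versions, as a sketch: `installed_version` is a hypothetical helper, and the package names are assumed to be the usual PyPI distribution names.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them to your environment.
for name in ("pytorch-lightning", "torch", "aitextgen"):
    print(name, installed_version(name) or "not installed")
```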

On my machine, with my version of pytorch_lightning (2.0.0), it works. I will push a restricted requirements.txt shortly.
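
For context on the TypeError: in PyTorch Lightning 2.0 the `gpus` argument of `Trainer` was removed in favor of `accelerator` and `devices`. The sketch below shows one way to translate the old argument. It is an alternative illustration, not what this PR does, and `translate_gpus_kwarg` is a hypothetical helper that only rewrites a dict of keyword arguments.

```python
def translate_gpus_kwarg(trainer_kwargs):
    """Return a copy of the kwargs with legacy `gpus` replaced.

    gpus=None or 0  -> accelerator="cpu"
    anything else   -> accelerator="gpu", devices=<old gpus value>
    Values the caller already set for accelerator/devices are kept.
    """
    kwargs = dict(trainer_kwargs)
    if "gpus" not in kwargs:
        return kwargs
    gpus = kwargs.pop("gpus")
    if gpus is None or gpus == 0:
        kwargs.setdefault("accelerator", "cpu")
    else:
        kwargs.setdefault("accelerator", "gpu")
        kwargs.setdefault("devices", gpus)
    return kwargs
```

The result could then be passed on as `Trainer(**translate_gpus_kwarg(old_kwargs))`.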

May 07 '23 21:05 scorixear

Thanks for this! I merged these fixes into my custom fork of AITextGen, and it allowed me to upgrade to PL v2.0.4 successfully!

Jul 03 '23 02:07 Vectorrent