A notebook for question and answer generation using one of the most powerful open-source NLU models, FLAN-T5-11B.

Open Rallio67 opened this issue 2 years ago • 5 comments

This is code that can be run in a notebook or on its own to generate a dictionary for use in creating synthetic dialogue that can be verified for factual accuracy. To use this notebook, you need your trusted source material in the format of a list of strings (they will be truncated to under 1100 characters). Requires transformers and accelerate. Make sure to use T5 with bfloat16 or full precision.
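A minimal sketch of the input preparation and model loading described above, not the notebook's actual code. The helper names (`truncate_passage`, `prepare_sources`, `load_flan_t5`) and the sentence-boundary trimming are assumptions made for illustration; the model id `google/flan-t5-xxl` is assumed to be the checkpoint meant by "FLAN-T5-11B".

```python
MAX_CHARS = 1100  # limit quoted in the post: passages end up under this length


def truncate_passage(text: str, max_chars: int = MAX_CHARS) -> str:
    """Return text shortened to fewer than max_chars characters.

    Hypothetical helper: prefers to cut at the last sentence boundary
    so the model is not handed a half-finished sentence.
    """
    text = " ".join(text.split())  # collapse runs of whitespace
    if len(text) < max_chars:
        return text
    clipped = text[: max_chars - 1]
    last_stop = clipped.rfind(". ")
    if last_stop > 0:
        return clipped[: last_stop + 1]  # keep the final period
    return clipped


def prepare_sources(passages):
    """Turn raw passages into the list-of-strings format, skipping empty ones."""
    return [truncate_passage(p) for p in passages if p and p.strip()]


def load_flan_t5(model_name: str = "google/flan-t5-xxl"):
    """Load tokenizer and model in bfloat16 (assumed setup, needs transformers + accelerate)."""
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,  # bfloat16 as recommended in the post
        device_map="auto",           # lets accelerate place the weights
    )
    return tokenizer, model
```

Calling `prepare_sources(my_passages)` yields a list that can be fed to the generation step; `load_flan_t5()` is only invoked when the model is actually needed, since it downloads the full checkpoint.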

It would be nice if someone can convert this approach to work on colab. T5-11B should be able to run on TPU with colab.

Rallio67 avatar Dec 31 '22 21:12 Rallio67

could you a) run pre-commit to pass linting, and b) have a look at what the folder structure of notebooks/ looks like and structure yours the same way? doesn't have to be .ipynb obviously, but the accompanying short markdown file that tells people what a piece of code does is quite helpful

yk avatar Dec 31 '22 23:12 yk

It would be nice if someone can convert this approach to work on colab. T5-11B should be able to run on TPU with colab.

Colab Link. Swapped down to the -xl model by default for free-tier users who want to work with it. Can be changed back to -xxl if you have Pro available.

TwoDukes avatar Jan 01 '23 08:01 TwoDukes

I think it should be fixed now. I ran pre-commit, added the .md file, and changed to the correct folder structure. Let me know if this works now. TwoDukes, you may want to use the changes I put in. There were some whitespace problems and a few unused variables left over from other things I was doing before. I had actually forgotten that incorporating the logits score was custom code I wrote. I think I implemented it correctly, but it may be helpful for someone else to check it.
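For anyone checking the logits scoring: the custom code itself is not shown in this thread, so the following is only a sketch of one common way to turn per-step logits into a single confidence score, namely the mean log-probability of the generated tokens. The function names and the choice of a mean (rather than a sum or a minimum) are assumptions, not the PR's implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one step's vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_score(step_logits, token_ids):
    """Mean log-probability of the chosen tokens.

    step_logits: one list of vocabulary logits per generated step.
    token_ids:   the token actually generated at each step.
    Values closer to 0 mean the model was more confident.
    """
    if not token_ids or len(step_logits) != len(token_ids):
        raise ValueError("need one logits vector per generated token")
    total = sum(log_softmax(logits)[tok] for logits, tok in zip(step_logits, token_ids))
    return total / len(token_ids)
```

With transformers, per-step scores can be requested by passing `output_scores=True` and `return_dict_in_generate=True` to `model.generate`; comparing a hand-computed score like this against the notebook's value on a short example is one way to sanity-check the implementation.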

Rallio67 avatar Jan 01 '23 23:01 Rallio67

  • you might need to add & commit again after running pre-commit if it was not all green. there are 2 things it does: auto-fixes (in which case the file has new changes & needs to be added again), or it yells at you, in which case you need to fix it (and also add the file again). pre-commit is currently still failing, so if you ran it, you might need to add & commit it again.
  • For the future: if you run pre-commit install once in the repo, it will do this automatically on every commit
  • You might also think of making your code just a bit more interactive. It's not super important because code in the notebooks/ folder is meant as scrappy demonstrations of stuff, but an easy thing you could do is instead of making your scripts be top-level scripts, make them into typer scripts. typer makes it super easy to make a script into a cli, then all the constants that are just "somewhere in the code" can be made automatically into command line parameters, and a help text is generated, etc. etc. i.e. not urgent and not necessary, but it might even speed up your own development process :)

yk avatar Jan 02 '23 10:01 yk

@Rallio67 pre-commit still reports problems, can you fix and give it another try?

andreaskoepf avatar Jan 05 '23 12:01 andreaskoepf

What are the GPU VRAM requirements for the XXL model?

gameveloster avatar Jan 12 '23 04:01 gameveloster

You need a 24 gigabyte card and you need to use bfloat16. The RTX 3090 works, as do other Ampere-generation cards with at least 24 gigabytes of memory, like the A10, A100, A5000, etc.
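A rough back-of-the-envelope check of that figure, counting model weights only. The parameter count of 11e9 is an assumption taken from the "11B" in the model name; activations and generation buffers add overhead on top, which is why the fit on a 24 gigabyte card is tight.

```python
# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

# Assumption: "11B" taken at face value; the exact count differs slightly.
NOMINAL_PARAMS = 11e9


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("bfloat16", "float32"):
    print(f"{dtype}: ~{weight_memory_gb(NOMINAL_PARAMS, dtype):.0f} GB for weights")
# prints:
# bfloat16: ~22 GB for weights
# float32: ~44 GB for weights
```

By this estimate, full precision would not fit on a single 24 gigabyte card, while bfloat16 leaves only a small margin for everything besides the weights.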

Rallio67 avatar Jan 13 '23 00:01 Rallio67