
Evaluate Detoxify to filter out unwanted prompts

Open andreaskoepf opened this issue 2 years ago • 3 comments

Evaluate whether unitaryai/detoxify could be used to automatically filter prompts (e.g. by computing scores for all posts submitted to the db), or whether it could be used in a security layer that filters the input and output of a live assistant bot in production.

Please write a short report about your findings (or create an ipynb notebook), including the model sizes, GPU memory requirements, inference performance, and a subjective opinion about the filtering quality (if possible, provide some examples). Check whether their license would allow us to use their model. Check how we could host the model (e.g. on Hugging Face?).
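As a starting point for the evaluation, the filtering idea can be sketched in Python. This is a minimal sketch under stated assumptions, not a tested integration: `is_allowed`, `filter_prompts`, and `load_detoxify_scorer` are hypothetical helper names, the 0.5 threshold is an arbitrary placeholder that the evaluation would need to tune, and the Detoxify call assumes the `Detoxify(model_name).predict(text)` interface from the project's README, which returns a score per category.

```python
from typing import Callable, Dict, List

# A scorer maps a text to per-category scores in [0, 1].
Scorer = Callable[[str], Dict[str, float]]


def is_allowed(text: str, scorer: Scorer, threshold: float = 0.5) -> bool:
    """Return True if every category score is below the threshold.

    The threshold is a placeholder assumption, not a recommended value.
    """
    scores = scorer(text)
    return all(score < threshold for score in scores.values())


def filter_prompts(
    prompts: List[str], scorer: Scorer, threshold: float = 0.5
) -> List[str]:
    """Keep only the prompts that pass the toxicity check."""
    return [p for p in prompts if is_allowed(p, scorer, threshold)]


def load_detoxify_scorer(model_name: str = "original") -> Scorer:
    """Wrap a Detoxify model as a Scorer (hypothetical helper).

    Imported lazily so the rest of the sketch runs without the package
    installed. Assumes predict() on a single string returns a dict that
    maps category names to numeric scores.
    """
    from detoxify import Detoxify

    model = Detoxify(model_name)
    return lambda text: {k: float(v) for k, v in model.predict(text).items()}
```

The same `is_allowed` check could be applied both to user input and to the assistant's output in a production security layer, and `filter_prompts` could be run as a batch job over posts already in the db.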

andreaskoepf avatar Dec 19 '22 16:12 andreaskoepf

I would like to take on this issue. My plan is to provide an ipynb notebook that compares the different models and includes:

  1. Inference speed and memory usage
  2. Training speed and memory usage
  3. Tests of Detoxify on some inputs with different levels and kinds of toxic language
  4. The name of the license and its main points
  5. Information about hosting options
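The inference-speed part of the comparison planned above could be measured with a small harness like the one below. It is only a sketch under assumptions: `benchmark` is a hypothetical helper, `predict` stands for any callable that accepts a batch of strings (for example a Detoxify model's `predict` method), and the batch size, warmup, and repeat counts are arbitrary defaults. GPU memory is not measured here; with PyTorch it could be read separately, for instance via `torch.cuda.max_memory_allocated()`.

```python
import time
from statistics import mean
from typing import Callable, Dict, Sequence


def benchmark(
    predict: Callable[[Sequence[str]], object],
    texts: Sequence[str],
    batch_size: int = 8,
    warmup: int = 1,
    repeats: int = 3,
) -> Dict[str, float]:
    """Time full passes over `texts` and report average throughput."""
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    # Warmup passes are not timed (model loading, caches, CUDA init).
    for _ in range(warmup):
        for batch in batches:
            predict(batch)

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for batch in batches:
            predict(batch)
        timings.append(time.perf_counter() - start)

    avg = mean(timings)
    return {
        "seconds_per_pass": avg,
        "texts_per_second": len(texts) / avg if avg > 0 else float("inf"),
    }
```

Running the same harness for each model variant, with the same texts and batch size, keeps the comparison between models consistent.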

SzymonOzog avatar Dec 27 '22 03:12 SzymonOzog

Ideally, this would run on NVIDIA Triton for fast inference, right?

jqueguiner avatar Jan 01 '23 05:01 jqueguiner

I posted the notebook and a README in https://github.com/LAION-AI/Open-Assistant/pull/176 (PR #176). If anyone has feedback or ideas on how to expand this work, feel free to contact me.

SzymonOzog avatar Jan 01 '23 18:01 SzymonOzog