Tobias Pitters comments

Results: 199 comments of Tobias Pitters

Add instruction to reverse augmentation

Ok, with the merge of #2870 some things changed here. Sorry for this but things move quickly here. We need to implement this for the [`get_formatted` method of `DatasetEntry` class](https://github.com/LAION-AI/Open-Assistant/blob/main/model/model_training/custom_datasets/formatting.py#L84)....

Implement the function to report messages generated in chats

I agree, we should be able to flag content.

Suggestion: Proposal for a curated Dataset for problem solving and coding.

That sounds interesting, especially since our models are not really good at coding. Does anyone have the time to add this?

[fix] fix hf_summary

I would appreciate a test for this

Assistant create pedophile story describing child abuse

Regardless of the terms of service this should not happen! We should probably add more safety datasets, especially anything that helps the model to respond well in chats about abuse.

Arxiv: Research papers

This might be a duplicate of https://github.com/LAION-AI/Open-Assistant/issues/1927. Also note that tools like pdfplumber or textract can be used for this task.

Special tokens in datasets

@olliestanley We found that our model believes it was created by OpenAI because of the statements you mentioned. The linked PR removes this (from the 14 datasets mentioned in the...

Integrate with Ray for serving/training?

Thanks for reaching out, @richardliaw. Currently we are using DeepSpeed for training our models. Could you elaborate a bit on the differences to DeepSpeed and the advantages of Ray or...

Use Cerebras-GPT for fine tuning.

@djaym7 I just checked our Weights & Biases project and did not find any runs with Cerebras-GPT. Is testing this model still something we are considering, @andreaskoepf?

Poor Chinese support, garbled text

Please raise your concerns in English and give a bit more background on them. Otherwise I'll just close them.