GLM-130B
[Discussion] Can we align GLM-130B with humans like ChatGPT?
Certainly. Alignment for GLM-130B could be important, and we are doing a preliminary survey.
You could use the current glm-10b on Hugging Face with trl/trlx to build a model with RLHF.
What are trl/trlx? I am very interested in this use case. Why must the 10B-parameter model be used for RLHF?
I am actively working on this task and would be very interested in coordinating further development.
@smeyerhot Trl (Transformer Reinforcement Learning) is a library built by Hugging Face for training language models with PPO. Trlx is an extension of Trl built by CarperAI. Both cover the same use case: training models with reinforcement learning from human feedback. You can also build the same functionality with actor-critic PPO in PyTorch, although that would require more extensive domain knowledge. You do not have to use glm-10b, but it is publicly available on Hugging Face's model hub, unlike 130B, which requires you to apply for access. You can use any encoder-decoder or decoder-only model. This issue is about aligning GLM with human feedback, which is why I suggested the 10B-parameter one.
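To make that concrete, here is a minimal PPO sketch with trl. It is only a sketch: gpt2 stands in for any decoder-only model (loading glm-10b needs its own tokenizer/model setup and is not shown), the constant reward is a placeholder for a real reward model trained on human preferences, and exact argument names can differ across trl versions.

```python
import torch
from transformers import AutoTokenizer
from trl import PPOTrainer, PPOConfig, AutoModelForCausalLMWithValueHead

model_name = "gpt2"  # stand-in for any decoder-only model you want to align
model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)  # frozen reference for the KL penalty
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

config = PPOConfig(batch_size=1, mini_batch_size=1)
ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer)

# One PPO step: generate a response to a prompt, score it, update the policy.
query = tokenizer("Explain RLHF in one sentence:", return_tensors="pt").input_ids[0]
response = ppo_trainer.generate(query, max_new_tokens=32,
                                pad_token_id=tokenizer.eos_token_id)
response = response.squeeze()[query.shape[0]:]  # keep only the newly generated tokens

# Placeholder reward; a real run would score the response with a reward model.
reward = torch.tensor(1.0)
stats = ppo_trainer.step([query], [response], [reward])
```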
ChatGPT can generate formatted text and images; this requires keeping the pretraining data in its original format.
Hi guys, I used BLOOM to implement PPO successfully.
But I found that the BLOOM model uses the AutoModelForCausalLM class,
whereas GLM uses the AutoModelForSeq2SeqLM class.
There is no LM counterpart for the AutoModelForSeq2SeqLM model, so do you know how to correct this?
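For what it's worth, one possible direction is trl's value-head wrapper for encoder-decoder models, AutoModelForSeq2SeqLMWithValueHead. This is only a sketch: t5-small is a stand-in checkpoint, and whether the same wrapper loads GLM through AutoModelForSeq2SeqLM has not been verified here.

```python
# Sketch only: trl also ships a value-head wrapper for encoder-decoder models.
# "t5-small" stands in for a seq2seq checkpoint; loading GLM this way is untested.
from trl import AutoModelForSeq2SeqLMWithValueHead

model = AutoModelForSeq2SeqLMWithValueHead.from_pretrained("t5-small")
ref_model = AutoModelForSeq2SeqLMWithValueHead.from_pretrained("t5-small")
# The rest of the PPOTrainer setup can stay the same as in the decoder-only
# sketch above; PPOTrainer handles encoder-decoder models when the wrapped
# model reports is_encoder_decoder=True.
```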