[Discussion] Can we align GLM-130B to humans, like ChatGPT?

Open AnShengqiang opened this issue 2 years ago • 5 comments

AnShengqiang avatar Dec 10 '22 15:12 AnShengqiang

Certainly. Alignment for GLM-130B could be important, and we are currently doing a preliminary survey.

Xiao9905 avatar Dec 15 '22 09:12 Xiao9905

You could use the current GLM-10B on Hugging Face with trl/trlx to build a model with RLHF.

conceptofmind avatar Jan 05 '23 01:01 conceptofmind

What is trl/trlx? I am very interested in this use case. Why must the 10B-parameter model be used for RLHF?

smeyerhot avatar Jan 08 '23 06:01 smeyerhot

I am actively working on this task and would be very interested in coordinating further development.

smeyerhot avatar Jan 08 '23 06:01 smeyerhot

@smeyerhot trl (Transformer Reinforcement Learning) is a library built by Hugging Face for training language models with PPO. trlx is an extension of trl built by CarperAI. Both cover the same use case: training models using reinforcement learning from human feedback (RLHF). You can also build the same functionality with actor-critic PPO in PyTorch, although that would require more extensive domain knowledge. You do not have to use GLM-10B, but it is publicly available on Hugging Face's model hub, unlike GLM-130B, which requires you to apply for access. You can use any encoder-decoder or decoder-only model. This issue is about aligning GLM with human feedback, which is why I suggested the 10B-parameter model.
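
For a rough idea of what PPO optimizes, here is a minimal sketch of the clipped surrogate loss in plain Python. It is not taken from trl or trlx; the function name `ppo_clipped_loss`, its arguments, and the default `clip_eps=0.2` are assumptions made for illustration. A real RLHF setup also needs a value function, a KL penalty against the reference model, and a reward model.

```python
import math


def ppo_clipped_loss(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean clipped PPO surrogate loss over a batch of sampled tokens.

    new_logprobs: log-probabilities of the sampled tokens under the current policy
    old_logprobs: log-probabilities of the same tokens under the policy that sampled them
    advantages:   advantage estimates for those tokens
    clip_eps:     clipping range (hypothetical default, chosen for illustration)
    """
    if not (len(new_logprobs) == len(old_logprobs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the current policy and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice: take the smaller of the two surrogate objectives.
        total += min(ratio * adv, clipped * adv)

    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(advantages)
```

The clipping limits how much a single update can gain by moving the policy away from the one that generated the samples, which is the main reason PPO tends to be more stable than a plain policy-gradient update.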

conceptofmind avatar Jan 08 '23 07:01 conceptofmind

ChatGPT can generate formatted text and images. This requires keeping the pretraining data in its original format.

Syno8 avatar Feb 23 '23 08:02 Syno8

Hi guys, I used BLOOM to implement PPO successfully. [image] But I found that the BLOOM model uses the AutoModelForCausalLM function. [image]

However, GLM uses the AutoModelForSeq2SeqLM function. [image] There is no LM in the AutoModelForSeq2SeqLM model, so do you know how to fix this?

beautifull4frank avatar Mar 10 '23 04:03 beautifull4frank