implicit_chain_of_thought
Hello, thanks for your nice repo. I noticed that MIXTURE_SIZE is set to 1 in your provided example command. `self.mixture_components = nn.Embedding(config.mixture_size, hidden_size)` I am curious why mixture_size is not...
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
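This error usually means that some input tensor lives on a different GPU than the module consuming it. The preview does not show the actual cause in this repo, so the following is only a minimal sketch of one common remedy: moving every tensor in a batch to a single target device before the forward pass. The helper name `move_to_device` is hypothetical and not part of the repository; it relies on duck typing (anything with a `.to(device)` method, as PyTorch tensors have).

```python
from typing import Any


def move_to_device(obj: Any, device: Any) -> Any:
    """Recursively move tensor-like objects to `device`.

    Hypothetical helper (not from the repo). Anything exposing a callable
    `.to(device)` method, such as a torch.Tensor, is moved; dicts, plain
    lists, and plain tuples are traversed; everything else is returned as-is.
    """
    if hasattr(obj, "to") and callable(obj.to):
        return obj.to(device)
    if isinstance(obj, dict):
        return {key: move_to_device(value, device) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_device(item, device) for item in obj)
    return obj  # ints, strings, None, ... are left untouched
```

With PyTorch, a typical call might be `batch = move_to_device(batch, next(model.parameters()).device)`, assuming the whole model sits on one device. If the model itself is split across GPUs, the mismatch has to be resolved where the layers meet instead.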
Researchers usually download Hugging Face transformers models to local storage to avoid errors when they don't have a stable network connection. When using the training or inference script, SSL error or...
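As a rough illustration of the local-storage workflow described above, here is a minimal sketch that prefers an already-downloaded model directory and switches the Hugging Face libraries to offline mode. The function name `resolve_model_source` and the check for `config.json` are assumptions made for illustration, not code from this repository; `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` are the environment variables documented by Hugging Face.

```python
import os
from pathlib import Path


def resolve_model_source(name_or_path: str, force_offline: bool = True) -> str:
    """Return a local model directory if one exists, else the input unchanged.

    Hypothetical helper: a directory counts as a local model only if it
    contains a config.json file (an assumption about the saved layout).
    """
    path = Path(name_or_path).expanduser()
    if path.is_dir() and (path / "config.json").is_file():
        if force_offline:
            # Ask the Hugging Face libraries not to touch the network.
            # These are generally read at import time, so set them before
            # importing transformers or huggingface_hub.
            os.environ["HF_HUB_OFFLINE"] = "1"
            os.environ["TRANSFORMERS_OFFLINE"] = "1"
        return str(path)
    return name_or_path  # treated as a hub model id by the caller


# Example usage (requires the transformers package):
#   source = resolve_model_source("~/models/gpt2")
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained(source, local_files_only=True)
```

Passing `local_files_only=True` to `from_pretrained` is a second safeguard: loading then fails fast with a clear error instead of attempting a download over an unreliable connection.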
This is the minimal code change to answer Prof. @da03's challenge. For adding the teacher states, I had a function called add_two_teacher_states on **line 57**. Although I believe compressing...
1. Resolve a warning caused by a missing logging method. 2. Fix a typo "positions_to_substitut" that still exists. Hi, Prof. Deng. I am creating this pull request to bring your...