KKY

Results 17 comments of KKY

It will be fixed soon. Now that I am back, I will: - Update the code architecture - Add more SOTA models from papers or Kaggle practice, such as TCN, SelfAttentiveModel, and...

No problem, but you will have to give me some time. I was caught up in a competition recently.

Hi, can you give me some more detail about "Have not been able to get reasonable results with the Attention layers"? I'll try what you said and look into the dataset validation...

Can you send the train & test dataset to me ? I will check it. Thomas Capelle 于2020年6月22日周一 下午4:26写道: > Actually, my model get's worse. > But don't really know...

move scale is used to standard-normalize the data within each time window. It will be deleted in the future. Now that I am back, I will: - Update the code architecture - Add more SOTA...
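Since the comment only describes the idea, here is a minimal sketch of what a moving-window standard normalization like "move scale" might do; the function name, the pandas-based implementation, and the fill-in for the first point are assumptions, not the repository's actual code.

```python
import pandas as pd

def move_scale(series: pd.Series, window: int) -> pd.Series:
    """Standard-normalize each point against its trailing time window.

    Hypothetical helper: (x - rolling mean) / rolling std over `window` steps.
    """
    rolling = series.rolling(window, min_periods=1)
    normalized = (series - rolling.mean()) / rolling.std()
    # The very first point has no defined rolling std, so fill it with 0.
    return normalized.fillna(0.0)

# Example: normalize a toy series with a 5-step window.
s = pd.Series([1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0])
print(move_scale(s, window=5))
```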

BNB 4-bit is a very useful feature. Many models don't have GPTQ or AWQ quantized versions, and it takes real effort to quantize a large model yourself with those post-training methods...
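For context, this is roughly what on-the-fly 4-bit loading with bitsandbytes looks like through Hugging Face Transformers; the model id is only a placeholder and the snippet is a generic sketch, not specific to any project mentioned here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder model id

# NF4 4-bit quantization applied at load time (no GPTQ/AWQ artifacts required).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```

The convenience here is exactly what the comment describes: any checkpoint can be quantized on the fly, instead of waiting for someone to publish a GPTQ or AWQ version.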

After the release of Llama3, I can only run the 8B version with vLLM; I have to switch to Ollama to run the 70B version.

Yes, build_chat_input constructs the special token "" and concatenates it with tokenizer("你是谁?", return_tensors="pt").input_ids. The way the chat prompt is constructed varies across models, so vLLM should either adapt to...
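To illustrate the point, here is a rough sketch of the kind of concatenation a build_chat_input-style helper performs; the model id and the role-token ids below (195 for the user turn, 196 for the assistant turn) are assumptions for illustration, not values confirmed by the comment.

```python
import torch
from transformers import AutoTokenizer

MODEL_ID = "baichuan-inc/Baichuan2-7B-Chat"  # assumed model id; any chat model with role tokens is similar
USER_TOKEN_ID = 195       # hypothetical id of the user-role special token
ASSISTANT_TOKEN_ID = 196  # hypothetical id of the assistant-role special token

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# Plain tokenization of the user message (what tokenizer(...).input_ids gives you).
message_ids = tokenizer("你是谁?", return_tensors="pt").input_ids

# The chat prompt wraps the message with role markers so generation starts at the assistant turn.
input_ids = torch.cat(
    [
        torch.tensor([[USER_TOKEN_ID]]),
        message_ids,
        torch.tensor([[ASSISTANT_TOKEN_ID]]),
    ],
    dim=-1,
)
```

Because this wrapping is model-specific, a serving engine either has to replicate it per model or let the client pass in already-constructed prompts or token ids.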