Chansung Park

Results 145 comments of Chansung Park

Yeah, kind of. I will be focusing more on exploring how different combinations of hyperparameters at generation time affect quality and speed!
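A rough sketch of what such a sweep could look like, in case it helps. Everything here is an assumption for illustration: `sweep_generation_params` and `fake_generate` are hypothetical names, the parameter names (`temperature`, `top_p`, `num_beams`) and their values are just common generation knobs, and the stand-in function should be replaced with a real model call. It only measures latency; judging quality would still need manual review or a separate metric.

```python
import itertools
import time


def sweep_generation_params(generate, prompt, grid):
    """Call `generate` once per parameter combination in `grid` and time it."""
    keys = list(grid)
    results = []
    # itertools.product yields every combination of the listed values.
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        start = time.perf_counter()
        output = generate(prompt, **params)
        elapsed = time.perf_counter() - start
        results.append({"params": params, "output": output, "seconds": elapsed})
    return results


# Hypothetical stand-in for a real model call; swap in your own function.
def fake_generate(prompt, temperature, top_p, num_beams):
    return f"{prompt} [t={temperature}, p={top_p}, beams={num_beams}]"


# Assumed example values, not recommendations.
grid = {
    "temperature": [0.1, 0.7, 1.0],
    "top_p": [0.75, 0.95],
    "num_beams": [1, 4],
}

results = sweep_generation_params(fake_generate, "Tell me about alpacas.", grid)
for r in results:
    print(r["params"], f"{r['seconds']:.6f}s")
```

Keeping the generation function as a plain callable means the same loop can be pointed at the 7B, 13B, or 30B model without changes.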

Thanks @gururise. I am retraining with 13B and 30B at the same time. Will share if I find something useful.

"continue" works with 13B. That was something that didn't work with the 7B model in my case. Currently fine-tuning the 13B model with Korean instruction datasets to see how well it works with...

I have checked that the 30B model fine-tuned with the clean dataset hosted in this repo seems to have a much better capability to answer in different languages. But, I have seen...

Yeah, the `gpt-3.5-turbo` one.

With the 30B model, I have experienced the following conversations: 1. `continue` when the output is omitted. 2. code refactoring 3. reformatting text into markdown format (just simple list-up to bullet...

My bad, I am using too many VMs and got confused.

Not sure about the web UI project. I just built this for a real serving need. Batch request processing and some handy buttons like summarize and continue to overcome low resource...