Xiao-Yong Jin
The vicuna v1.1 model used a different setup. See https://github.com/lm-sys/FastChat/blob/f85f489f2d5e48c37cceb2f00c3edc075c5d3711/fastchat/conversation.py#L115-L124 and https://github.com/lm-sys/FastChat/blob/f85f489f2d5e48c37cceb2f00c3edc075c5d3711/fastchat/conversation.py#L37-L44 IIUC, the prompt as a Bourne shell string is `"$system USER: $instruction ASSISTANT:"`. Their doc says this: https://github.com/lm-sys/FastChat/blob/f85f489f2d5e48c37cceb2f00c3edc075c5d3711/docs/weights_version.md#example-prompt-weight-v11 I...
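A minimal sketch of that v1.1 layout, assuming the default system message and the `" "` separator from the linked FastChat source; the helper name here is mine, not FastChat's:

```python
# Hypothetical helper illustrating the vicuna v1.1 prompt layout.
# System message and roles ("USER"/"ASSISTANT") follow the linked
# FastChat conversation.py; only a single user turn is shown.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def build_vicuna_v11_prompt(instruction: str, system: str = SYSTEM) -> str:
    # Equivalent to the shell string: "$system USER: $instruction ASSISTANT:"
    return f"{system} USER: {instruction} ASSISTANT:"

print(build_vicuna_v11_prompt("Hello!"))
```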
Downloading `LICENSE`, `USE_POLICY.md`, `tokenizer.model`, and `tokenizer_checklist.chk` is fine, but downloading any model-specific files gives 403 Forbidden.
What is the status of the MPI_Test blocking issue with inter-node comms?
I thought this line controls the size, no? https://github.com/lattice/quda/blob/aa2ea419ce0f6f78f842f85f40cb2a607944c957/include/targets/sycl/target_device.h#L196
In their code, the chat format is here: https://github.com/meta-llama/llama3/blob/299bfd8212fec65698c2f8c7b5970cbbb74c2a4f/llama/tokenizer.py#L202
The instruct models need `tokenizer.ggml.eos_token_id` to be 128009, i.e. `<|eot_id|>`.
It seems the model generates `<|eot_id|>` with the official chat template. Otherwise it may generate `<|end_of_text|>`.
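A plain-string paraphrase of the chat format at the linked `tokenizer.py` line; the real `ChatFormat.encode_dialog_prompt` works on token ids, and the function names below are just illustrative:

```python
# Sketch of the Llama 3 instruct chat format, paraphrased as strings.
def encode_message(role: str, content: str) -> str:
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content.strip()}<|eot_id|>"

def encode_dialog_prompt(dialog: list[dict]) -> str:
    prompt = "<|begin_of_text|>"
    for msg in dialog:
        prompt += encode_message(msg["role"], msg["content"])
    # Prime the model to answer; a well-behaved instruct model then ends
    # its turn with <|eot_id|> (id 128009), not <|end_of_text|> (id 128001).
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

print(encode_dialog_prompt([{"role": "user", "content": "Hi"}]))
```

This is why the eos token matters: a runtime that only stops on `<|end_of_text|>` will blow straight past the model's `<|eot_id|>` turn terminator.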
This breaks backward compatibility. If this is acceptable, we need to double check our unit tests and fix the non-FUELCompat tests that uses Gaussian random numbers.
There is a no-speech token that whisper.cpp currently ignores: https://github.com/ggerganov/whisper.cpp/blob/447d49530c9af41fe24f2ae510f452903dba330d/whisper.cpp#L4592 Actually implementing a no-speech threshold similar to openai/whisper's might help.
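For reference, a minimal sketch of the gating openai/whisper applies in `transcribe.py`: the no-speech token's probability is read at the first decode step, and a segment is only skipped as silence when that probability is high *and* the decoded text's average log-probability is low. Threshold values are the openai/whisper defaults; the function name is mine:

```python
# Sketch of openai/whisper-style no-speech gating.
NO_SPEECH_THRESHOLD = 0.6   # openai/whisper default
LOGPROB_THRESHOLD = -1.0    # openai/whisper default

def is_silent_segment(no_speech_prob: float, avg_logprob: float) -> bool:
    if no_speech_prob > NO_SPEECH_THRESHOLD:
        # Only trust the no-speech signal when the decoder is also
        # unconfident about the text it produced for this segment.
        if avg_logprob < LOGPROB_THRESHOLD:
            return True
    return False

print(is_silent_segment(no_speech_prob=0.8, avg_logprob=-1.5))  # True: skip segment
```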
Most of the time we don't really want much output, especially when called from a third-party app. Depending on the usage, I would be happy if we could have...