
Remove hf_auth_token use

Open Abhishek-Varma opened this issue 1 year ago • 4 comments

- This commit removes `--hf_auth_token` uses from vicuna.py.
- It adds llama2 models based on daryl149's HF repo.
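A minimal sketch of what the CLI change might look like. This is not the actual vicuna.py code; the flag and default names are assumptions based on the description above (the `--hf_auth_token` flag is dropped, and the public daryl149 mirror replaces the gated meta-llama repo as the tokenizer source):

```python
import argparse

def build_parser():
    """Hypothetical post-change argument parser for vicuna.py (sketch)."""
    parser = argparse.ArgumentParser(description="llama2 runner (sketch)")
    parser.add_argument(
        "--hf_model_path",
        default="daryl149/llama-2-7b-hf",
        help="Public HF repo used for the tokenizer config files.",
    )
    # Previously something like the following existed and is now removed:
    # parser.add_argument("--hf_auth_token", help="HF token for gated repos")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args([])
    print(args.hf_model_path)
```

Since the default repo is public, no authentication token is needed at any point in the flow.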

Signed-off-by: Abhishek Varma [email protected]

Abhishek-Varma avatar Sep 06 '23 15:09 Abhishek-Varma

Currently marking this as a draft since the 13B and 70B paths need testing. CC: @powderluv

Abhishek-Varma avatar Sep 06 '23 15:09 Abhishek-Varma

If we only download the MLIR, we wouldn't hit the token issue, right?

powderluv avatar Sep 06 '23 15:09 powderluv

If we only download the MLIR, we wouldn't hit the token issue, right?

I did try doing that, but during the run I saw that we would still hit the issue, because we use the tokenizer to decode each generated token, and that tokenizer is instantiated from the HF repo we use.

Abhishek-Varma avatar Sep 06 '23 15:09 Abhishek-Varma
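The point above can be sketched with a stub: even if the compiled MLIR is fetched separately, every generated token id still has to be decoded, and the decoder is built from an HF repo at runtime. The class below stands in for `transformers.AutoTokenizer.from_pretrained(repo)`; all names and the toy vocab are illustrative, not the real SHARK code:

```python
class StubTokenizer:
    """Illustrative stand-in for AutoTokenizer.from_pretrained(repo)."""

    def __init__(self, repo: str):
        # In the real flow this step fetches the tokenizer config files
        # from `repo`; a gated repo (e.g. meta-llama/Llama-2-7b-chat-hf)
        # would require an auth token here, a public mirror would not.
        self.repo = repo
        self.vocab = {0: "<s>", 1: "Hello", 2: "world"}

    def decode(self, token_id: int) -> str:
        return self.vocab.get(token_id, "<unk>")

# Public repo, so no token is needed to instantiate the decoder.
tok = StubTokenizer("daryl149/llama-2-7b-hf")
# Each generated token id is decoded individually, as described above.
print(" ".join(tok.decode(t) for t in [1, 2]))
```

This is why downloading only the MLIR does not avoid the token requirement: the tokenizer instantiation, not the model download, is what touches the HF repo during generation.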

If we only download the MLIR, we wouldn't hit the token issue, right?

I did try doing that, but during the run I saw that we would still hit the issue, because we use the tokenizer to decode each generated token, and that tokenizer is instantiated from the HF repo we use.

Even this approach would work, since we're blocking IR generation anyway. It would then essentially download the tokenizer's config files from daryl149/llama-2-7b-hf, while we already have the MLIR generated from meta-llama/Llama-2-7b-chat-hf.

I verified it on CPU for llama2 7B.

With this PR we don't need to maintain config files for the tokenizer, but we're changing the base HF repo, which would impact the workflow once IR generation is given a green signal.

With the other PR we only incur the overhead of maintaining the config files, keeping the rest of the infra the same.

Abhishek-Varma avatar Sep 08 '23 14:09 Abhishek-Varma