Ditto P S

Results 55 comments of Ditto P S

It's happening because of the GrammarlessEngine. The chat template is restricted in GrammarlessEngine. Is there any specific accuracy issue if I extend the class to remove that restriction?

Thanks for the detailed response. I have tried this hack, but now I'm getting the error below.
```
    id: token for token, id in tokenizer.get_added_vocab().items()
                               ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'GrammarlessTokenizer' object has...
```

Thank you for the update, I will go with that for now.

Same here, I got the issue while using "microsoft/Phi-3-medium-4k-instruct"

I'm facing the same issue with v3.0.0. The controller is not triggering the adapter load for the new pods.

The optimization calculation ran for some time, and then it started skipping, logging "insufficient data". After that, the replica count went back to 0. Please see the logs below....

Yes, you are right, I have 2 deployments in the cluster, and I'm testing only llama for the scaling. This is the current log, where both models are not skipping...

$inf is for the deployment bud-qwen2-47f5298d, which doesn't have a profile. But if you look at llama-3-1-8b-instruct, it is skipping optimization. Is there a doc on how the optimizer works?...

Have used the following logic for Adapter loading/unloading:
1. The pods are selected with Label-Based Matching - adapter.model.aibrix.ai/enabled: "true"
2. The selected pods are added to the Status.Instances list
3....

schedulePod was used to pick one pod, which was then assigned to instance.Status.Instances. Instead of choosing one pod, the new approach uses ALL pods that match the selector and adds all...
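
The difference between picking one pod and using every pod that matches the selector can be sketched roughly as follows. This is only an illustration, not the controller's actual code: pods are modelled as plain dictionaries, the pod names and the helper functions `matches_selector`, `select_one_pod`, and `select_all_pods` are hypothetical, and "pick one" is assumed to mean "take the first match". Only the label key comes from the comment above.

```python
# Illustrative sketch only: plain dicts stand in for Kubernetes pod objects.
ADAPTER_LABEL = "adapter.model.aibrix.ai/enabled"  # label key from the comment


def matches_selector(pod, selector):
    """Return True if the pod carries every label/value pair in the selector."""
    labels = pod.get("labels", {})
    return all(labels.get(key) == value for key, value in selector.items())


def select_one_pod(pods, selector):
    """Earlier behaviour (assumed): only the first matching pod is an instance."""
    for pod in pods:
        if matches_selector(pod, selector):
            return [pod["name"]]
    return []


def select_all_pods(pods, selector):
    """Newer behaviour: every matching pod is added to the instance list."""
    return [pod["name"] for pod in pods if matches_selector(pod, selector)]


# Hypothetical pods; the names are made up for the example.
pods = [
    {"name": "llama-0", "labels": {ADAPTER_LABEL: "true"}},
    {"name": "llama-1", "labels": {ADAPTER_LABEL: "true"}},
    {"name": "qwen-0", "labels": {}},
]
selector = {ADAPTER_LABEL: "true"}

# Stand-in for the Status.Instances list under each approach.
old_instances = select_one_pod(pods, selector)
new_instances = select_all_pods(pods, selector)
```

With the sample data, the single-pod variant yields one instance while the all-pods variant yields both labelled pods, so a newly started pod with the label would be included the next time the list is rebuilt.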