elfisworking
The Python SDK already supports precision control when searching. That feature should be added to the Java SDK.
Fixed the print function not working under Python 3.
### Is there an existing issue / discussion for this? - [X] I have searched the existing issues / discussions ### Is this question answered in the FAQ? | Is there an...
I tried to use QAT to quantize the Qwen2 1.5B model. The error is raised by the call `training.load_from_full_model_state_dict(model, model_state_dict, self._device, self._is_rank_zero, strict=True)` in recipes/qat_distributed. I then found that the error is caused by...
Today I tried to use Int4WeightOnlyQATQuantizer to quantize llama3-8b. When I use the model's generate function, I get the error below: ``` Running InferenceRecipe with resolved config: chat_format: null checkpointer: _component_: torchtune.training.FullModelTorchTuneCheckpointer...
I'm using torchtune for model quantization with QAT. I am currently following https://pytorch.org/torchtune/main/tutorials/qat_finetune.html, but the prepared_model I printed differs from the one shown in the link....
I obtained a quantized model using the torchtune package. The test log shows: `INFO:torchtune.utils._logging:Time for inference: 66.56 sec total, 4.51 tokens/sec` 4.51 tokens/sec is even lower than that of the...