ApolloRay
At present, judging from my test results, the Chinese-language performance does not match what the paper claims; the gap is large.
I changed the conv_mode from llava_v1 to chatml_direct. It works, but I still can't reproduce the results of the official demo.
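For reference, the conversation template can be selected on the command line; this is a sketch assuming the upstream `llava.serve.cli` entry point and its `--conv-mode` flag (the model path and image file below are placeholders, and flags may differ across LLaVA versions):

```shell
# Run LLaVA-1.6-34B with the chatml_direct conversation template
# instead of the default llava_v1 template.
python -m llava.serve.cli \
    --model-path liuhaotian/llava-v1.6-34b \
    --image-file ./example.jpg \
    --conv-mode chatml_direct
```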
I reproduced the same model but didn't hit the same problem. Maybe it's the transformers version? I'm on transformers==4.37.2.
And I'm not sure why there is a "1" at the end of the model path.

Tested in the official demo (LLaVA-1.6-34B)
Have you found that the int8 ONNX model is much larger than the fp16 one?
> > Have you found that the size of the onnx model for int8 is much bigger than fp16?
>
> yes, The onnx...
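One possible reason an "int8" export ends up larger than the fp16 one is that the exporter kept the weights in fp32 and only inserted quantize/dequantize node pairs, rather than storing int8 initializers. A quick file-size comparison is a useful first check; this is a stdlib-only sketch where the dummy files stand in for the real `.onnx` exports (the file names are placeholders):

```python
import os
import tempfile

def size_ratio(path_a, path_b):
    """size(path_a) / size(path_b) — a quick sanity check on two exports."""
    return os.path.getsize(path_a) / os.path.getsize(path_b)

# Demo with dummy files standing in for the real exported models
# (model_int8.onnx / model_fp16.onnx names are placeholders).
with tempfile.TemporaryDirectory() as d:
    int8_path = os.path.join(d, "model_int8.onnx")
    fp16_path = os.path.join(d, "model_fp16.onnx")
    with open(int8_path, "wb") as f:
        f.write(b"\0" * 4000)   # pretend int8 export: 4000 bytes
    with open(fp16_path, "wb") as f:
        f.write(b"\0" * 2000)   # pretend fp16 export: 2000 bytes
    print(size_ratio(int8_path, fp16_path))  # 2.0 → "int8" file twice as big
```

If the ratio is well above 1, it is worth inspecting the model's initializers to confirm whether the weights were actually stored as int8.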
> After updating to PyTorch 2.0 it works, but the speed improvement is very small.
>
> ```
> [I] Running StableDiffusionXL pipeline
> |-----------------|--------------|
> | Module          | Latency      |
> |-----------------|--------------|
> | CLIP            | 2.59 ms      |
> ...
> ```