Wang-Yang, Li

Results: 14 comments by Wang-Yang, Li

The API does not offer a way to measure the number of tokens generated per second, so I measured words per second as an alternative. NPU * 12.38 Chinese...
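A rough sketch of how such a words-per-second measurement could be taken is shown here. It is not the script used for the numbers above. `generate` is a hypothetical stand-in for whatever call produces the model output, and the counting rule (each CJK character counts as one unit, other text is counted by whitespace-separated words) is an assumption chosen so that Chinese output is not counted as a single word.

```python
import time
from typing import Callable


def count_units(text: str) -> int:
    """Count CJK ideographs one by one and other text by whitespace-separated words.

    This counting rule is an assumption for illustration, not the API's token count.
    """
    count = 0
    in_word = False
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":  # CJK Unified Ideographs block
            count += 1
            in_word = False
        elif ch.isspace():
            in_word = False
        elif not in_word:
            # First character of a non-CJK, non-space run starts a new word.
            count += 1
            in_word = True
    return count


def units_per_second(generate: Callable[[str], str], prompt: str) -> float:
    """Time one call to `generate` (hypothetical) and return counted units per second."""
    start = time.perf_counter()
    output = generate(prompt)
    elapsed = time.perf_counter() - start
    return count_units(output) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    def fake_generate(prompt: str) -> str:
        # Placeholder for a real model call; sleeps to simulate latency.
        time.sleep(0.05)
        return "你好 world"

    print(f"{units_per_second(fake_generate, 'hi'):.2f} units/s")
```

Because this times the whole call, the figure includes prompt processing as well as generation, so it is only a coarse proxy for tokens per second.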

Note: for manylinux_2_28_x86_64, use `add_definitions(-D_GLIBCXX_USE_CXX11_ABI=0)` and install a PyTorch build without the C++11 ABI; that combination will work.
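To check which ABI setting an installed PyTorch build expects before choosing the definition, something like the following sketch may help. `torch.compiled_with_cxx11_abi()` is PyTorch's own query; the helper names `abi_definition` and `detect_torch_abi` are hypothetical and exist only for this illustration.

```python
from typing import Optional


def abi_definition(cxx11_abi: bool) -> str:
    """Return the compiler definition matching the given ABI setting."""
    return f"-D_GLIBCXX_USE_CXX11_ABI={int(cxx11_abi)}"


def detect_torch_abi() -> Optional[bool]:
    """Return True/False for the installed PyTorch's C++11 ABI, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    return bool(torch.compiled_with_cxx11_abi())


if __name__ == "__main__":
    abi = detect_torch_abi()
    if abi is None:
        print("PyTorch is not installed; cannot detect the ABI.")
    else:
        # A build without the C++11 ABI should print ...=0, matching the note above.
        print(f"add_definitions({abi_definition(abi)})")
```

If the printed value is `1`, the installed PyTorch was built with the C++11 ABI and would not match the `=0` definition from the note.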