aotsuki
```
WebGLM/model/retriever/filtering/contriver.py:36: UserWarning: MPS: no support for int64 reduction ops, casting it to int32 (Triggered internally at /Users/runner/miniforge3/conda-bld/pytorch-recipe_1680607563975/work/aten/src/ATen/native/mps/operations/ReduceOps.mm:144.)
  dim=1) / mask.sum(dim=1)[..., None]
loc("varianceEps"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/ff32e6fb-db00-11ed-a068-428477786501/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor' and 'tensor' are...
```
Fix: local LLM always uses CUSTOM as the API token
**Routine checks**

[//]: # 'Put an x in the box to tick it'

- [x] I have confirmed there is currently no similar feature
- [x] I have confirmed I have upgraded to the latest version
- [x] I have read the project README in full and confirmed the current version cannot meet this need
- [x] I understand and am willing to follow up on this feature, helping with testing and providing feedback
- [x] I understand and accept the above, and I understand the maintainers' time is limited; **feature requests that do not follow the rules may be ignored or closed directly**

**Feature description**

I am currently testing very long context scenarios locally. Since this is a technical proof of concept and resources are constrained, I have to wait more than 15 minutes for the LLM computation to return. I would like to be able to change the global timeout via an environment variable.

**Use case**

Very long context inference...
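The request above boils down to making a hard-coded timeout configurable. A minimal sketch of how an environment-variable override could look (the variable name `LLM_TIMEOUT_SECONDS` and the 600-second default are assumptions for illustration, not the project's actual names):

```python
import os

# Hypothetical default; the real project would choose its own value.
DEFAULT_TIMEOUT_SECONDS = 600.0

def get_llm_timeout() -> float:
    """Read the global LLM timeout from the environment, falling back to a default."""
    raw = os.environ.get("LLM_TIMEOUT_SECONDS")  # hypothetical variable name
    if raw is None:
        return DEFAULT_TIMEOUT_SECONDS
    try:
        return float(raw)
    except ValueError:
        # Ignore malformed values rather than crash at startup.
        return DEFAULT_TIMEOUT_SECONDS
```

The timeout would then be read once at request time instead of being baked in, so long-running local inference can simply export a larger value.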
On macOS Ventura, running `cmake ..` and `make` fails with:

```
clang: error: unsupported option '-fopenmp'
make[2]: *** [CMakeFiles/tts_test.dir/test/main.cpp.o] Error 1
make[1]: *** [CMakeFiles/tts_test.dir/all] Error 2
make: *** [all] Error 2
```

After trying `brew install gcc`, in the shell...
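The error occurs because Apple's bundled clang does not ship the OpenMP runtime, so `-fopenmp` is rejected. One common workaround is to install `libomp` from Homebrew and point CMake at it explicitly (the flag names below are CMake's standard `FindOpenMP` variables; the paths assume a Homebrew install):

```shell
# Install the OpenMP runtime that Apple clang lacks.
brew install libomp

# Configure the build to use Homebrew's libomp with Apple clang.
cmake .. \
  -DOpenMP_CXX_FLAGS="-Xpreprocessor -fopenmp -I$(brew --prefix libomp)/include" \
  -DOpenMP_CXX_LIB_NAMES="omp" \
  -DOpenMP_omp_LIBRARY="$(brew --prefix libomp)/lib/libomp.dylib"
make
```

Alternatively, building with Homebrew's GCC (e.g. `-DCMAKE_CXX_COMPILER=g++-13`) also provides OpenMP support out of the box.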
**LocalAI version:** 2.16.0

**Environment, CPU architecture, OS, and Version:** Mac Studio M2 Ultra

**Describe the bug**

When using the transformers backend for glm4, `trust_remote_code: true` is not correctly used by backend/python/transformers/backend.py.

**To Reproduce**

**Expected...
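The report says the backend drops the `trust_remote_code` option when loading the model. The expected behaviour would look roughly like the sketch below (this is illustrative, not LocalAI's actual loader code): GLM-4 ships custom modeling code on the Hub, so the flag must reach both `from_pretrained` calls.

```python
def load_model(model_id: str, trust_remote_code: bool = False):
    """Load a tokenizer/model pair, forwarding trust_remote_code.

    GLM-4 publishes custom modeling code, so loading fails unless
    the flag is passed through to both from_pretrained calls.
    """
    # Imported lazily so the sketch can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        model_id, trust_remote_code=trust_remote_code
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=trust_remote_code
    )
    return model, tokenizer
```

If the backend instead calls `from_pretrained` without the flag, transformers refuses to execute the repository's custom code and the load fails, which matches the reported behaviour.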