
WIP feat: Init commit for Rust backend

Aisuko opened this pull request 2 years ago • 9 comments

Description

This PR relates to #939

Notes for Reviewers

Signed commits

  • [x] Yes, I signed my commits.

Aisuko avatar Oct 17 '23 02:10 Aisuko

cc @lu-zero

mudler avatar Oct 17 '23 16:10 mudler

Progress of the Rust backend:

  • [x] The basic framework of the Rust gRPC backend (there may still be some issues; they will be fixed in later commits)
  • [ ] Implement with burn (working on it now; their website was broken, so I already opened an issue on their repo and will investigate further)
  • [ ] candle (I see burn supports candle as a backend in alpha, so let's implement the backend with burn first)

Really appreciate your help, @lu-zero, but I still need your help with the "burn" backend. Thank you.
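
For context, the reason a burn-first approach keeps the candle option open is that burn model code is written against its `Backend` trait and stays backend-agnostic. A minimal sketch under that assumption (burn's exact module paths and signatures vary by release, so treat the details as illustrative):

```rust
use burn::tensor::{backend::Backend, Tensor};

// Backend-agnostic op: the same code compiles for burn's ndarray, tch
// (libtorch), or wgpu backends, and for candle once burn's candle
// backend is out of alpha. Only the concrete `B` chosen at the call
// site changes.
fn gram_matrix<B: Backend>(x: Tensor<B, 2>) -> Tensor<B, 2> {
    x.clone().matmul(x.transpose())
}
```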

Aisuko avatar Oct 22 '23 00:10 Aisuko

An idea for choosing the default burn backend for the Rust backend: #1219

Aisuko avatar Oct 26 '23 02:10 Aisuko

I got stuck on issues like the one below (only in debug mode); it may be related to the Rust bindings for the PyTorch C++ API.

dyld[15803]: Library not loaded: @rpath/libtorch_cpu.dylib
  Referenced from: <B583CD33-2743-323A-B503-5781B34C078F> /Users/tifa/Downloads/workspace/LocalAI/backend/rust/target/debug/deps/server-bc3eca19368e3b4a
  Reason: no LC_RPATH's found

This makes the program hard to debug. I am going to refactor some code and add IDE settings files to make sure anyone can debug the program easily.
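
"no LC_RPATH's found" means the debug binary references @rpath/libtorch_cpu.dylib but carries no rpath entries for dyld to search. One hedged workaround (not necessarily what this PR will do) is to embed an rpath from build.rs; the LIBTORCH variable follows tch-rs's convention and is an assumption here:

```rust
// build.rs — sketch: embed an rpath so dyld can resolve
// @rpath/libtorch_cpu.dylib in debug builds on macOS.
fn main() {
    // LIBTORCH is assumed to point at an unpacked libtorch distribution.
    if let Ok(libtorch) = std::env::var("LIBTORCH") {
        println!("cargo:rustc-link-arg=-Wl,-rpath,{libtorch}/lib");
    }
}
```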

Aisuko avatar Oct 31 '23 08:10 Aisuko

It seems to look for libtorch and fails to find it. If you use the ndarray backend, does it work?

lu-zero avatar Oct 31 '23 08:10 lu-zero

> It seems to look for libtorch and fails to find it. If you use the ndarray backend, does it work?

Will try it and give feedback.

Update

The ndarray backend can be used for debugging in the IDE, while the torch backend has some issues on Mac M1. I tried setting LIBTORCH_USE_PYTORCH=1 as an env variable inside a conda env that has PyTorch installed; however, it still hits other issues in the M1 environment. So I'm going to use ndarray to help me debug the conversion code.
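
A hedged alternative to juggling LIBTORCH_USE_PYTORCH in the environment is to make the backend a compile-time choice behind Cargo features, so debug builds never touch libtorch at all. The type names below follow newer burn releases (older ones used NdArrayBackend/TchBackend), so adjust to whatever version is pinned:

```rust
// Pick the burn backend per build: `cargo run --features ndarray` for
// IDE debugging with no host libtorch dependency, `--features tch`
// once the M1 libtorch issues are sorted out.
#[cfg(feature = "ndarray")]
pub type SelectedBackend = burn_ndarray::NdArray<f32>;

#[cfg(all(feature = "tch", not(feature = "ndarray")))]
pub type SelectedBackend = burn_tch::LibTorch<f32>;
```

Model code then takes `Tensor<SelectedBackend, D>`, and the whole stack flips with one feature flag.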

Aisuko avatar Oct 31 '23 23:10 Aisuko

On the M1 probably the wgpu backend is the nicest to use, but ndarray is the one that does not depend on the host system.

lu-zero avatar Nov 01 '23 08:11 lu-zero

> On the M1 probably the wgpu backend is the nicest to use, but ndarray is the one that does not depend on the host system.

Thanks a lot. I have made some changes here. I have migrated the Llama2 code to a fork repo, and I am working on a simpler model. Here are some reasons:

  • A simpler model is more efficient to debug than Llama2: fewer parameters and less memory used. (Loading just half of the Llama2 parameters into tensors already takes at least 13 min in my local env.)
  • We can move faster on this PR. It is good for us to refactor the code and project structure and abstract some common traits.
  • Easier code review.
  • Easier to add test cases (CI).

Here I hit an issue with reshaping the Tensor, so we can try to implement a simpler model instead of getting stuck on Llama2 (see the sketch below). [Screenshot 2023-11-01 at 5:32:21 pm]
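
As a minimal repro for the reshape problem: the invariant that usually bites when porting Llama2-style view()/reshape chains is that the target shape must keep the element count unchanged. A sketch assuming burn's Tensor API (dims/reshape names may differ by version):

```rust
use burn::tensor::{backend::Backend, Tensor};

// Flatten [batch, seq, dim] -> [batch, seq * dim]. This only succeeds
// because batch * (seq * dim) equals the original element count; any
// mismatch is the classic reshape panic.
fn flatten_last_dims<B: Backend>(x: Tensor<B, 3>) -> Tensor<B, 2> {
    let [b, s, d] = x.dims();
    x.reshape([b, s * d])
}
```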

Aisuko avatar Nov 01 '23 11:11 Aisuko

Deploy Preview for localai failed.

Latest commit: c9901126900b210fcb73c3d6505d1544405c236f
Latest deploy log: https://app.netlify.com/sites/localai/deploys/655ea7b3d02aec0008ca4cdf

netlify[bot] avatar Nov 23 '23 01:11 netlify[bot]