Cleanup repo for inference
It is not clear to me whether we still need the "LLaMA worker" Dockerfile.
I am also not sure whether we are still using, or plan to use, the text-generation-inference worker variant; my understanding is that we now use the "basic HF server" variant of the worker in production.
Maybe we could clarify these things and do some cleanup of the inference section of the repo to reflect the current setup.
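If the unused variants are confirmed, the cleanup itself could be fairly mechanical, along these lines (the file paths below are hypothetical and would need to be checked against the actual `inference/` directory):

```bash
# Hypothetical cleanup, assuming the LLaMA and text-generation-inference
# worker Dockerfiles live at these paths; verify before removing.
git rm inference/worker/Dockerfile.llama
git rm inference/worker/Dockerfile.text-generation-inference
git commit -m "Remove unused inference worker Dockerfile variants"
```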
This goes hand in hand with #1473, which I think needs to be prioritised now that the inference pipeline is fairly stable and no longer changing regularly. We have many users asking for details on how to run their own instances.
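As a starting point for that documentation, a minimal sketch of standing up a local instance might look like the following (the `inference` compose profile name and flags here are assumptions and should be confirmed against the repo's docker-compose.yaml):

```bash
# Sketch of running a local inference instance; the "inference" profile
# name is an assumption, check docker-compose.yaml for the actual setup.
git clone https://github.com/LAION-AI/Open-Assistant.git
cd Open-Assistant
docker compose --profile inference up --build --attach-dependencies
```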