iwr-redmond

7 comments by iwr-redmond

As a workaround, use Hugging Face: `imagine --model-weights-path https://huggingface.co/XpucT/Deliberate/resolve/main/Deliberate_v6.safetensors --model-architecture sd15 "a flower"`. This downloads the model and converts it to diffusers format, after which you can generate normally. You can permanently...
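For reference, a minimal sketch of what that command does under the hood, using `huggingface_hub` and `diffusers` directly (the output directory and filenames here are illustrative, not what imaginAIry actually uses internally):

```
from huggingface_hub import hf_hub_download
from diffusers import StableDiffusionPipeline

# Download the single-file SD1.5 checkpoint from Hugging Face
ckpt = hf_hub_download(
    repo_id="XpucT/Deliberate",
    filename="Deliberate_v6.safetensors",
)

# Load the checkpoint and convert it to the diffusers folder layout
pipe = StableDiffusionPipeline.from_single_file(ckpt)
pipe.save_pretrained("./Deliberate_v6-diffusers")

# Generate normally from the converted weights
image = pipe("a flower").images[0]
image.save("flower.png")
```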

Hello. As open models are already supported in the cloud via DeepInfra, all that is needed is a bit of new code in edsl/inference_services/llama-cpp.py. This could be: 1. A hack:...

Hello. I have taken note of some community-developed roleplay models that you folks may wish to test against once you have a working inference script:

- turboderp/llama3-turbcat-instruct-8b
- ClosedCharacter/Peach-9B-8k-Roleplay
- ...

I don't think that's correct. In llama-cpp-python, one uses a GGUF-quantized model and can easily run inference locally on a CPU or GPU. Hugging Face documentation typically refers...
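To illustrate, a minimal local GGUF call with llama-cpp-python might look like this (the model path and prompt are placeholders):

```
from llama_cpp import Llama

# Load a GGUF-quantized model from disk; n_gpu_layers=-1 offloads all
# layers to the GPU when one is available, 0 keeps inference on the CPU.
llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",
    n_gpu_layers=-1,
)

# Standard chat-completion call against the local model
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```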

From Appendix A (p. 17) of the original paper (emphasis added): > We **downloaded the publicly available language model weights** from their respective official HuggingFace repositories. We run the models in...

Ah, but that is not correct. To go through the options:

1. Hack: the llama-cpp-python package can spin up its own [OpenAI-compatible server](https://llama-cpp-python.readthedocs.io/en/latest/#openai-compatible-web-server), so just copy your DeepInfra template for...
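As a rough sketch of that hack: once the server extras are installed (`pip install 'llama-cpp-python[server]'`) and the server is started with `python -m llama_cpp.server --model ./models/model.gguf`, any OpenAI-style client template only needs its base URL swapped; the host, port, and model path below are placeholders:

```
from openai import OpenAI

# The local llama-cpp-python server speaks the OpenAI API, so an existing
# OpenAI-style template only needs a different base_url; the key is unused.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="local-model",  # resolved by the local server, not by OpenAI
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```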

This starter code should be good enough for government work:

```
import os

EDSL_LOCAL_DEFAULTS = "# configuration file default"

try:
    # llama-cpp-python exposes its OpenAI-compatible server factory here
    from llama_cpp.server.app import create_app

    env_custom_models = os.environ.get("EDSL_LOCAL_CONFIG", EDSL_LOCAL_DEFAULTS)
    try:
        create_app(config_file=env_custom_models, ...
```
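For completeness, `create_app` returns a FastAPI app, so the remaining wiring would be something like the sketch below; the host and port are arbitrary, and whether `config_file` is an accepted keyword should be verified against the installed llama-cpp-python version:

```
import os
import uvicorn
from llama_cpp.server.app import create_app

# Hypothetical continuation of the starter code above: build the app from
# the EDSL_LOCAL_CONFIG file and serve it on an arbitrary local port.
app = create_app(config_file=os.environ.get("EDSL_LOCAL_CONFIG"))
uvicorn.run(app, host="127.0.0.1", port=8000)
```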