Víctor Martínez
well, looks like it no longer works :( tried IMAP with an app password and it does not sync after adding the account
needed to run with `google-chrome-stable --enable-unsafe-webgpu`; toggling it in chrome://flags alone did not work
@codingjlu just a warning that WebGPU is experimental and unsafe, so use it at your own risk
@ltdrdata great! keeping an eye on it, will take a shot at implementing the fine-tuning loop once these issues are resolved
@Wingie toolpaca is fine-tuned around toolformer prompts, which are completely different in both syntax and operation from what langchain provides. see https://github.com/lucidrains/toolformer-pytorch/blob/main/toolformer_pytorch/prompts.py vs https://github.com/hwchase17/langchain/blob/master/langchain/agents/conversational/prompt.py the proper way would be to...
@oobabooga langchain is essentially a prompt generation and execution framework. it lets you do things like re-writing and re-evaluating the conversation history to perform external data ingestion or auto-summarize the history...
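to make the rewrite-the-history idea concrete, here's a toy sketch in plain Python. this is NOT langchain's actual API; the `summarize` stand-in is a fake for what would be an LLM call:

```python
# Toy sketch of auto-summarizing conversation history, in the spirit of
# langchain-style summary memory. Not langchain's API -- just the idea.

def summarize(turns):
    """Stand-in for an LLM call that would condense old turns.

    Here we fake it by keeping only the first few characters of each turn.
    """
    return "summary: " + " | ".join(t[:20] for t in turns)

def compact_history(history, max_turns=4):
    """Rewrite the history: collapse everything except the most recent
    turns into a single summary entry, keeping the prompt short."""
    if len(history) <= max_turns:
        return history
    old, recent = history[:-max_turns], history[-max_turns:]
    return [summarize(old)] + recent

history = [f"user: message {i}" for i in range(10)]
compacted = compact_history(history)
print(len(compacted))  # 5 entries: one summary + the 4 most recent turns
```

in a real setup the summarizer would itself be a model call, and the compacted history is what gets fed back into the next prompt.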
also includes the newest SOTA 2-bit quantization support :D
> https://github.com/ollama/ollama/blob/ecc133d843c8567b27ff3bdc9ff811ecad99281a/docs/faq.md?plain=1#L189
>
> use keep_alive param
> * any negative number which will keep the model loaded in memory (e.g. -1 or "-1m")

Does the opposite actually.
> Hey @knoopx You can actually do this by calling `curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": 0}'` (not with `-1` which will always leave the model loaded). That will immediately...
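for reference, the same unload call from Python, assuming the default local ollama endpoint; the actual network request only fires when run as a script, since it needs a running server:

```python
import json
import urllib.request

# Default local ollama endpoint (assumption: standard install, port 11434).
OLLAMA_URL = "http://localhost:11434/api/generate"

def unload_payload(model: str) -> dict:
    """Build the request body that asks ollama to unload `model` now.

    keep_alive: 0 evicts the model immediately; a negative value
    (e.g. -1) keeps it resident indefinitely instead.
    """
    return {"model": model, "keep_alive": 0}

def unload(model: str) -> None:
    """POST the unload request. Requires a running ollama server."""
    data = json.dumps(unload_payload(model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)

if __name__ == "__main__":
    unload("llama2")
```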
fyi: you can load the vicuna model through huggingface transformers by installing it from their git repo. then just load the tokenizer and model via `LlamaTokenizer.from_pretrained(...)` and `LlamaForCausalLM.from_pretrained(...)` and pass them...