Mark Schmidt
The Colab demo is meant to run on a free Google Colab GPU, not on a local runtime (and definitely not on CPU). If you want to run ChatGLM on a...
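A minimal sketch of the kind of GPU check the demo assumes (the `THUDM/chatglm-6b` repo id and the `chat` call follow the model's published usage; adjust to your checkpoint):

```python
# Minimal sketch: verify a CUDA GPU is available before loading ChatGLM.
# Assumes the THUDM/chatglm-6b checkpoint from Hugging Face.
import torch
from transformers import AutoModel, AutoTokenizer

assert torch.cuda.is_available(), "Use a GPU runtime (Runtime > Change runtime type > GPU)"

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
response, history = model.chat(tokenizer, "Hello", history=[])
print(response)
```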
I get awful results with Flan-UL2. Its responses tend to be extremely short and it hallucinates more than most models when it doesn't know something. I have had no issues...
Here's an example of a question to Flan-UL2 where it is both wrong and characteristically short, even when asked to explain. (Gears 1 and 6 spin in opposite directions, as...
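For reference, the direction parity in a simple meshed gear train is trivial to check; a quick sketch (assuming gears are numbered from 1 and each gear meshes with the next):

```python
# Minimal sketch: adjacent meshed gears alternate direction, so a gear's
# spin direction depends only on the parity of its position in the train.
def spin_direction(gear: int, first_gear_clockwise: bool = True) -> str:
    clockwise = first_gear_clockwise if gear % 2 == 1 else not first_gear_clockwise
    return "clockwise" if clockwise else "counterclockwise"

print(spin_direction(1))  # clockwise
print(spin_direction(6))  # counterclockwise -> gears 1 and 6 spin in opposite directions
```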
I've been contributing to most of the listed projects daily for a while and would love to help maintain a list like this. Let me know.
Vicuna appears to be trained to use:
```
### Assistant: Text
### Human: Text
```
Using "### Human:" as a reverse prompt partially works. But instruct mode support could be...
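A sketch of how a frontend could wrap that template (the stop-string handling here is my assumption about how instruct mode would implement it, not any project's actual code):

```python
# Minimal sketch: build a Vicuna-style prompt and truncate the completion at
# the "### Human:" reverse prompt so the model doesn't talk to itself.
ASSISTANT = "### Assistant:"
HUMAN = "### Human:"

def build_prompt(history: list[tuple[str, str]], user_message: str) -> str:
    turns = [f"{HUMAN} {u}\n{ASSISTANT} {a}" for u, a in history]
    turns.append(f"{HUMAN} {user_message}\n{ASSISTANT}")
    return "\n".join(turns)

def trim_reply(completion: str) -> str:
    # Stop at the reverse prompt if the model starts generating the next human turn.
    return completion.split(HUMAN, 1)[0].strip()

print(build_prompt([], "Do gears 1 and 6 spin the same way?"))
```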
For comparison, Alpaca-7B took 3 hours on 3xA100, and LoRA/PEFT reduces compute requirements by two orders of magnitude for similar results. So likely only a couple of hours, and also likely...
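The parameter reduction is easy to see directly; a sketch using the Hugging Face peft library (the model path, rank, and target modules are illustrative, not Alpaca's exact recipe):

```python
# Minimal sketch: wrap a base model with LoRA adapters and compare trainable
# parameter counts. Path and hyperparameters are illustrative placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("path/to/llama-7b")
config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                    lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(base, config)
model.print_trainable_parameters()  # trainable params are a small fraction of the 7B total
```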
@vgoklani Generally you must merge the 16-bit PEFT adapter into the 16-bit base model and then quantize the resulting merged model down to 4-bit if you want 4-bit inference. The quality of...
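A sketch of the merge step with peft (the paths are placeholders; the 4-bit step afterwards would be whatever quantizer you use, e.g. GPTQ or llama.cpp's quantize tool):

```python
# Minimal sketch: merge a 16-bit LoRA adapter into its 16-bit base model,
# then save the merged weights for a separate 4-bit quantization pass.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("path/to/base-model", torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = merged.merge_and_unload()             # folds the LoRA deltas into the base weights
merged.save_pretrained("path/to/merged-fp16")  # quantize this output to 4-bit afterwards
```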
> > I've tried 7B full fine-tune Alpaca and a 7B LoRA and I find the LoRA to be greatly lacking

But was the LoRA created in 16-bit or...
@DataBassGit I see that PR got closed. What's the status of your fork?
GPT4all supports x64 and every architecture llama.cpp supports, which is practically every architecture (even non-POSIX targets, and WebAssembly). Their motto is "Can it run ~Doom~ LLaMA" for a reason. Ooga supports GPT4all...