Results: 24 comments by Aaron Miller

Ah nuts - alright, I'm fetching the old version to test. I do think it'd be good to get a code model sooner rather than later, but I also think...

This got a bit hairy after the multiple-implementation split - doing it without creating multiple embedded copies of the tokenizer configs was the tricky part - but it should be a bit more doable as of the `prompt()`...

> This can be closed since the tokenizer changes upstream?

No, there's still no upstream fix for this - it requires file format changes, so it's not likely happening upstream...

If a major file format change is going to happen again, the tokenizer configs for the models using Hugging Face `tokenizers` BPE/GPT-2-like tokenizers ought to be improved (i.e. all but the...
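
(For concreteness, a rough sketch of the round-trip that embedding one of those configs would need, using the Hugging Face `tokenizers` Python package - the file name and the embed-the-JSON-blob scheme are illustrative, not the actual file format:)

```python
# Rough sketch: round-tripping a Hugging Face `tokenizers` config so the
# JSON blob could travel inside the model file instead of as a separate
# tokenizer.json. The embedding scheme here is illustrative only.
from tokenizers import Tokenizer

# Load a GPT-2-style BPE tokenizer from its JSON config (assumed path).
tok = Tokenizer.from_file("tokenizer.json")

# Serialize the full config to a JSON string - this is the blob a file
# format change would need to carry alongside the weights.
blob = tok.to_str().encode("utf-8")

# At load time, rebuild the tokenizer from the embedded blob.
restored = Tokenizer.from_str(blob.decode("utf-8"))
assert restored.encode("hello world").ids == tok.encode("hello world").ids
```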

I'm also running into this with decently sized images - given a folder with a hefty enough number of 5K-8K JPEGs, I can reliably *crash* the renderer process if I...

My comment about `MADV_SEQUENTIAL` was assuming you were trying to implement a zero-copy approach and have the inference-time code use the model directly from the mapping, rather than still copying everything...
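
(Rough sketch of the zero-copy shape I mean, using Python's `mmap` - the path, the header offset, and the sequential-access assumption are all illustrative:)

```python
# Sketch of a zero-copy load: map the model file read-only and let
# inference-time code read tensors straight out of the mapping, with
# MADV_SEQUENTIAL hinting that a forward pass walks the weights roughly
# in file order. POSIX-only; path and header size are placeholders.
import mmap

with open("model.bin", "rb") as f:  # placeholder path
    mm = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)

# Hint the kernel that access is mostly sequential, so it can read ahead
# aggressively and drop already-consumed pages sooner.
mm.madvise(mmap.MADV_SEQUENTIAL)

# Zero-copy means tensors are views into the mapping, never copies:
header_size = 4096  # hypothetical header size, purely for illustration
weights_view = memoryview(mm)[header_size:]
```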

> Cool - will take a look soon. Is this using actual MQA or is it still doing the trick with the copies?

It does copies with `ggml_repeat` presently - also wound...
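
(For anyone following along, a rough NumPy sketch of the difference - shapes and names are mine, not the actual ggml code; broadcasting stands in for true MQA here:)

```python
# Rough NumPy sketch of the copy-based trick vs. actual MQA. Shapes and
# names are illustrative, not the real ggml implementation.
import numpy as np

n_head, seq, head_dim = 8, 16, 64
q = np.random.randn(n_head, seq, head_dim)   # per-head queries
k = np.random.randn(1, seq, head_dim)        # single shared K head (MQA)

# The ggml_repeat-style trick: physically tile the shared head so the
# ordinary multi-head attention path can run unchanged.
k_rep = np.repeat(k, n_head, axis=0)             # (n_head, seq, head_dim)
scores_copy = q @ k_rep.transpose(0, 2, 1)       # (n_head, seq, seq)

# Actual MQA: every query head attends against the one shared head with
# no materialized copies (NumPy broadcasting stands in for that here).
scores_mqa = q @ k.transpose(0, 2, 1)

assert np.allclose(scores_copy, scores_mqa)
```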

> Are you working on a 40B branch already?

I'm not presently - being that it's *big*, it's a bit more inconvenient to hack on, as I'd need to...

For me it's exactly 1718 - but I just realized I can get the same behavior with a q4_0 model if I bump it to 2570 (maybe less, didn't narrow...