Teknium
Personally, I feel the datasets we work on here should be limited to self-instruct datasets, i.e. those generated by LLMs, since this effort is about improving a synthetically generated dataset....
> Thanks for the repo! I was wondering how I could cite your resource in BibTeX?

I'm not actually sure.
> How and where do I install this? Do I just pop it all in the 'text-generation-webui-main' folder, or prompts, or what?

This is simply a dataset to train models with.
Hmm, my CPU shouldn't be slow (13700K), but it may not be using everything it needs to; it seems not to be using all cores. Do I set --affinity...
For reference, my inference code:

```python
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator
import os, glob, time

# Directory containing model, tokenizer, generator
model_directory...
```
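The snippet above is cut off; for completeness, here is a minimal sketch of how the rest of an ExLlama inference script typically looks, modeled on ExLlama's bundled example scripts. The model directory and sampling settings below are placeholders, not my exact values:

```python
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator
import os, glob

# Placeholder path -- point this at your own GPTQ model folder
model_directory = "C:/Models/StableBeluga-7B-GPTQ/"

tokenizer_path = os.path.join(model_directory, "tokenizer.model")
model_config_path = os.path.join(model_directory, "config.json")
model_path = glob.glob(os.path.join(model_directory, "*.safetensors"))[0]

config = ExLlamaConfig(model_config_path)  # read model params from config.json
config.model_path = model_path             # point config at the .safetensors weights

model = ExLlama(config)                    # load the model onto the GPU(s)
tokenizer = ExLlamaTokenizer(tokenizer_path)
cache = ExLlamaCache(model)                # KV cache for generation
generator = ExLlamaGenerator(model, tokenizer, cache)

# Illustrative sampling settings
generator.settings.temperature = 0.7
generator.settings.top_p = 0.9

print(generator.generate_simple("Once upon a time,", max_new_tokens=64))
```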
Launching that script with --affinity 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15 yields these graphs during inference:
Will --affinity work regardless of whether the script directly implements something to handle it? I am now getting speeds in line with expectations for multi-GPU 70B inference, about 13.5 t/s average - and...
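On that question: a flag like --affinity does nothing by itself; the script has to parse it and apply it to its own process. A minimal sketch of how such a flag could be wired up with psutil (the argument name just mirrors the one above; this is an illustration, not ExLlama's actual implementation -- psutil's cpu_affinity() works on Windows, where os.sched_setaffinity is unavailable):

```python
import argparse
import psutil

parser = argparse.ArgumentParser()
parser.add_argument("--affinity", type=str, default=None,
                    help="Comma-separated list of CPU cores to pin this process to")
args = parser.parse_args()

if args.affinity:
    cores = [int(c) for c in args.affinity.split(",")]
    # Pin the current process (and all its threads) to the given cores.
    psutil.Process().cpu_affinity(cores)
    print(f"Affinity set to cores: {cores}")
```

Launched as `script.py --affinity 0,1,2,3`, the process would then be scheduled only on those four cores.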
My driver is 31.0.15.3667 (NVIDIA 536.67). Will try with the benchmark script.
Will update when I downgrade the drivers and run the benchmark script.
Updating on the benchmark script (haven't rolled back the driver yet):

```
.\test_benchmark_inference.py -d C:\Teknium\Models\StableBeluga-7B-GPTQ\ -p
 -- Tokenizer: C:\Teknium\Models\StableBeluga-7B-GPTQ\tokenizer.model
 -- Model config: C:\Teknium\Models\StableBeluga-7B-GPTQ\config.json
 -- Model: C:\Teknium\Models\StableBeluga-7B-GPTQ\gptq_model-4bit-128g.safetensors
 -- Sequence length: 2048
 -- Tuning:...
```