h2ogpt
Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
When interacting with h2ogpt I regularly run into a situation where the base model is loaded just to realize a validation fails, e.g. local_files_only but missing a tokenizer, wrong folder...
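One way to surface such failures earlier is a cheap pre-flight check that runs before any weights are loaded. The sketch below is only an illustration of that idea, not h2ogpt's actual validation: the function name `preflight_errors` and the list of file names it looks for are assumptions, and real checkpoints may ship different files.

```python
from pathlib import Path
from typing import List

# Assumed file names for illustration; real checkpoints may use others.
CONFIG_FILE = "config.json"
TOKENIZER_FILES = ("tokenizer.json", "tokenizer.model", "vocab.json")


def preflight_errors(model_dir: str) -> List[str]:
    """Return the problems found; an empty list means the cheap checks passed."""
    path = Path(model_dir)
    # A wrong folder is the cheapest failure to detect, so check it first.
    if not path.is_dir():
        return [f"model folder not found: {path}"]

    errors = []
    if not (path / CONFIG_FILE).is_file():
        errors.append(f"missing {CONFIG_FILE} in {path}")
    # With local files only, a missing tokenizer cannot be downloaded later.
    if not any((path / name).is_file() for name in TOKENIZER_FILES):
        errors.append(f"no tokenizer file found in {path}")
    return errors
```

Calling `preflight_errors(...)` before the expensive model load and aborting when the list is non-empty would report a wrong folder or a missing tokenizer in milliseconds rather than after the base model is already in memory.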
https://python.langchain.com/en/latest/modules/indexes/vectorstores/examples/elasticsearch.html

Notes about "Intriguing Properties of Quantization at Scale": https://arxiv.org/pdf/2305.19268.pdf

So basically should do:
* bf16 instead of fp16 training, less sensitive to quantization later
* weight decay order 0.05, not...
### 2x A6000Ada 48GB tiiuae/falcon-40b + h2ogpt-fortune2000-personalized PRETRAINING (4-bit)

`CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 finetune.py --data_path=h2oai/h2ogpt-fortune2000-personalized --drop_truncations=True --train_4bit=True --base_model=tiiuae/falcon-40b --micro_batch_size=1 --batch_size=128 --num_epochs=3 --run_id=9 --lora_target_modules='["query_key_value", "dense_h_to_4h", "dense_4h_to_h", "dense"]' &> log.9.txt`

` 6%|▌ |...
https://github.com/h2oai/h2ogpt/blob/main/docs/TRITON.md Do the same for Falcon 7B, then Falcon 40B.
https://github.com/h2oai/h2o-wizardlm
https://discord.com/channels/1097462770674438174/1100717863221870643/1113669041240944722 
Question: can the document query model answer questions about a table's content?
For example, if my **pdf file** contains a table with the training and validation accuracy of a different DL architecture in each row, can I ask "what's the validation accuracy of...