OgbujiPT
Client-side toolkit for using large language models, including self-hosted ones
It's been annoying me for a while that the metadata fields for our vector DB are each a bit different. The ones in `pgvector_data_doc` use: ```sql tags TEXT[] --...
* #78
* #79
There are some situations where having lots of chunk overlap isn't super useful; being able to set it as 0 (or, perhaps, just not set it, and have it assume...
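As a rough illustration of the request above, here is a minimal sketch of a character-based chunker whose overlap is optional. The function name `chunk_text` and its parameters are hypothetical, not OgbujiPT's actual API, and the default of `0` for an unset overlap is an assumption.

```python
def chunk_text(text, chunk_size, chunk_overlap=0):
    '''
    Yield successive chunks of at most chunk_size characters.

    Hypothetical helper, not part of OgbujiPT. chunk_overlap defaults to 0
    (an assumption), so callers who don't want overlap can simply omit it.
    '''
    if chunk_size <= 0:
        raise ValueError('chunk_size must be positive')
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError('chunk_overlap must be >= 0 and < chunk_size')
    # Each new chunk starts this many characters after the previous one
    step = chunk_size - chunk_overlap
    for start in range(0, len(text), step):
        yield text[start:start + chunk_size]
        # Stop once a chunk has reached the end of the text
        if start + chunk_size >= len(text):
            break


no_overlap = list(chunk_text('abcdefghij', 4))
with_overlap = list(chunk_text('abcdefghij', 4, chunk_overlap=2))
```

With no overlap the chunks simply tile the text; with an overlap of 2, each chunk repeats the last two characters of the one before it.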
When doing low-level finetuning (without the aid of HF's [SFTTrainer](https://huggingface.co/docs/trl/sft_trainer) library, for example), you may need to be able to tokenize a string with the model's prompting format but without...
Some systems (such as [mlx](https://github.com/ml-explore/mlx)) don't yet work with HF's safetensors (see [LLama example doesn't work with HF format models? #65](https://github.com/ml-explore/mlx-examples/issues/65)) and require access to the PyTorch files...
I originally got this error in my code while iterating over the results gathered asynchronously from a call to `oapi.wrap_for_multiproc(prompt_to_chat('.. prompt ..'), **model_params)`. I was able to successfully reproduce it...
```python
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/Users/osi/.local/venv/atc/lib/python3.11/site-packages/urllib3/connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/Users/osi/.local/venv/atc/lib/python3.11/site-packages/urllib3/connectionpool.py", line 466, in _make_request
    six.raise_from(e, None)
  File "", line 3,...
```
Right now, prompts get somewhat malformed, missing several newlines. In a commit that's coming soon:tm:, I'll be hardcoding the templates in `pylib/prompting/model_style.py`, but obviously a solution that can programmatically add...
Expand the test suite, reaching for 100% coverage. A notable gap in coverage is `async_helper.py`.
The new chunkers, post #30, are now generators, but that does us no good with the Qdrant helper `update()`, which requires a fixed sequence. Also consider the PG helper `insert_many()`,...
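One possible way to bridge a generator-based chunker and a helper that needs a fixed sequence is to materialize the generator in bounded batches. The sketch below is only an illustration of that idea: `batched` and `fake_chunker` are hypothetical names, and nothing here reflects the actual signatures of `update()` or `insert_many()`.

```python
from itertools import islice


def batched(items, batch_size):
    '''
    Yield lists of up to batch_size items drawn from any iterable,
    including a generator. Hypothetical helper, not part of OgbujiPT.
    '''
    if batch_size <= 0:
        raise ValueError('batch_size must be positive')
    it = iter(items)
    while True:
        # islice pulls at most batch_size items without consuming the rest
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def fake_chunker(text, size):
    '''Stand-in for a generator-based chunker (illustrative only).'''
    for i in range(0, len(text), size):
        yield text[i:i + size]


# Each batch is a real list, so it supports len() and indexing
batches = list(batched(fake_chunker('abcdefghij', 2), 2))
```

Each batch could then be passed to a helper that expects a sequence, keeping memory bounded by the batch size instead of collecting every chunk up front.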