lilac
Curate better data for LLMs
## Description: Currently, Lilac lacks the capability to upload files directly from a local drive when it's hosted within a Docker container on a server. This feature is essential...
After starting the clustering I get this error:
```
[local/evol1][1 shards] map "extract_text" to "('prompt__cluster',)": 100%|████████████████████████████████████████████████████████████████████████████████| 319/319 [00:00
```
@HalfdanJ Hi, could you please give some advice on this issue? When managing RLHF data, we hope to be able to create and edit data directly in Lilac. For...
I don't know why it wants to use Jina. I started with gte-small as my preferred embedding, then changed to sbert in the UI (after the initial error), but got this...
I'm attempting to cluster ~600k short texts (reviews). The process goes fine until it logs that it's assigning noise points to clusters. It spends close to an hour embedding...
Greetings, I am trying to speed up my project by transitioning over from my long list (~200,000) of JSON files to the .parquet file that is created from this project...
I was working separately on a patch for nomic-embed-text and noticed the model card says it "Requires Pre-fixes" [https://huggingface.co/nomic-ai/nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5). Didn't see this specifically implemented in the [Nomic PR](https://github.com/lilacai/lilac/pull/1182). Is this...
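For context on the prefix question, here is a minimal sketch of what prepending a task prefix could look like before texts are handed to the embedding model. The prefix strings are the ones listed on the nomic-embed-text-v1.5 model card; the helper name `add_task_prefix` and the dictionary `NOMIC_TASK_PREFIXES` are hypothetical and are not part of Lilac or the Nomic PR.

```python
# Task prefixes as listed on the nomic-embed-text-v1.5 model card.
# NOMIC_TASK_PREFIXES and add_task_prefix are hypothetical names used
# for illustration only; they are not Lilac or Nomic APIs.
NOMIC_TASK_PREFIXES = {
    "search_document": "search_document: ",
    "search_query": "search_query: ",
    "clustering": "clustering: ",
    "classification": "classification: ",
}


def add_task_prefix(texts, task="search_document"):
    """Return a copy of `texts` with the task prefix prepended to each item.

    Texts that already start with the prefix are left unchanged, so the
    helper can be applied twice without doubling the prefix.
    """
    if task not in NOMIC_TASK_PREFIXES:
        raise ValueError(f"Unknown task: {task!r}")
    prefix = NOMIC_TASK_PREFIXES[task]
    return [t if t.startswith(prefix) else prefix + t for t in texts]


if __name__ == "__main__":
    docs = add_task_prefix(["Great product, would buy again."], task="clustering")
    print(docs)
```

Whether the prefix belongs inside the embedding signal or should be chosen by the caller per task (clustering vs. search) is a design choice this sketch leaves open.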