lilac
Curate better data for LLMs
This is a request from a user!
https://www.loom.com/share/eb6042214d1e45a1b6ca869586762722 This is probably because we mistake the null value for "not loaded yet" rather than "this is now null where it was previously not-null".
It seems like people can't find this!
Feedback from Sean: I’m importing a dir full of parquet files, and when I pass the path of the dir containing all those files, I get a “no files matching” error....
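One possible way to handle the directory case, sketched under assumptions: the helper name `expand_parquet_source` is hypothetical and is not part of Lilac's API. It only shows the idea of turning a bare directory path into a glob pattern before files are matched.

```python
from pathlib import Path


def expand_parquet_source(path: str) -> str:
    """Hypothetical helper: map a directory path to a parquet glob pattern.

    If `path` is an existing directory, return a pattern that matches the
    parquet files directly inside it. Any other input (a single file, or a
    pattern the user already wrote) is returned unchanged.
    """
    p = Path(path)
    if p.is_dir():
        return str(p / "*.parquet")
    return path
```

Whether nested directories should also be matched is a separate design choice that this sketch leaves open.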
In the short term, we should have a single global db connection that can make unique view names for each dataset, and read multiple database files using `ATTACH`: https://duckdb.org/docs/sql/statements/attach.html This...
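A rough sketch of what that could look like, assuming one database file per dataset and a table named `t` inside each file (both are assumptions, as are all function names below). The code only builds the SQL text and does not execute it; running the statements on a single shared DuckDB connection is left to the caller.

```python
import hashlib
import re


def unique_name(namespace: str, dataset: str, prefix: str) -> str:
    """Build a SQL-safe identifier that is unique per (namespace, dataset).

    Sanitizing alone can collide ("a-b" and "a_b" both become "a_b"), so a
    short hash of the original pair is appended.
    """
    raw = f"{namespace}/{dataset}"
    safe = re.sub(r"[^0-9A-Za-z_]", "_", raw)
    digest = hashlib.sha1(raw.encode("utf-8")).hexdigest()[:8]
    return f"{prefix}_{safe}_{digest}"


def attach_statements(namespace: str, dataset: str, db_path: str,
                      table: str = "t") -> list:
    """Return the SQL needed to expose one dataset file on a shared connection."""
    alias = unique_name(namespace, dataset, "db")
    view = unique_name(namespace, dataset, "view")
    escaped_path = db_path.replace("'", "''")  # escape quotes in the literal
    return [
        f"ATTACH '{escaped_path}' AS {alias};",
        # `table` is a hypothetical table name inside the attached file.
        f"CREATE OR REPLACE VIEW {view} AS SELECT * FROM {alias}.{table};",
    ]
```

Deriving both the alias and the view name from the dataset identity keeps repeated calls stable, so re-registering a dataset replaces its view instead of creating a second one.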
Some errors, like https://www.pantz.org/software/sqlite/unabletoopendbsqliteerror, don't manifest until a user tries to label for the first time.
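One way to surface such errors earlier is to probe the database at startup. This is a minimal sketch using Python's standard `sqlite3` module; the function name `check_label_db` and the assumption that labels live in a single SQLite file are hypothetical.

```python
import sqlite3


def check_label_db(path: str) -> None:
    """Fail fast if the SQLite file at `path` cannot be opened for writing.

    Raises sqlite3.OperationalError (for example "unable to open database
    file") instead of deferring the failure to the first label write.
    """
    conn = sqlite3.connect(path)
    try:
        # BEGIN IMMEDIATE takes a write lock, so read-only or locked files
        # fail here; the rollback leaves the database untouched.
        conn.execute("BEGIN IMMEDIATE")
        conn.execute("ROLLBACK")
    finally:
        conn.close()
```

Calling this once when a dataset is loaded would report a bad path or permission problem before the user reaches the labeling UI.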
For example: https://lilacai-lilac.hf.space/datasets/#lilac/lmsys-chat-1m&viewPivot=true&pivot=%7B%22outerPath%22%3A%5B%22conversation__clusters%22%2C%22category_title%22%5D%2C%22innerPath%22%3A%5B%22conversation__clusters%22%2C%22cluster_title%22%5D%7D
Now that we've propagated all the metadata for dataset-format specific inputs, we should show the string in the UI.