Nikhil Thorat

Results 53 issues of Nikhil Thorat

This issue will start the conversation around how we will version source (or map output) data so that it feels seamless to make edits and scrub backwards in time.
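To give the conversation something concrete to react to, here is a minimal sketch of one possible shape: an append-only history of snapshots where each edit creates a new version and any earlier version stays readable. The names `VersionedRows`, `edit`, and `at` are hypothetical and not part of any existing API, and copying the full snapshot on every edit is an assumption made for brevity; a real design would more likely store deltas.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class VersionedRows:
    """Hypothetical append-only history of row snapshots (illustration only)."""

    # Version 0 is the empty starting state.
    history: list[dict[int, Any]] = field(default_factory=lambda: [{}])

    def edit(self, row_id: int, value: Any) -> int:
        """Apply an edit on top of the latest snapshot and return the new version."""
        snapshot = dict(self.history[-1])  # shallow copy; earlier versions are untouched
        snapshot[row_id] = value
        self.history.append(snapshot)
        return len(self.history) - 1

    def at(self, version: int) -> dict[int, Any]:
        """Scrub to any earlier version without modifying the history."""
        return self.history[version]


rows = VersionedRows()
v1 = rows.edit(0, "hello")
v2 = rows.edit(0, "hello world")
```

Reading `rows.at(v1)` after the second edit still returns the first edit's state, which is the "scrub backwards in time" behavior described above.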

We've gotten this request a few times; let's build something.

This is hard to read right now

From @dechantoine: "+ it could be cool to have somewhere to edit the prompt for positive samples generation"

Right now media fields get flattened. Metadata fields are hierarchical; we should do the same for media.
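As a rough illustration of the difference, the sketch below rebuilds a hierarchy from flattened field paths. It assumes flattened fields are represented as dotted paths such as `image.caption`; that representation, the field names, and the `unflatten` helper are all hypothetical and only meant to show the target shape.

```python
from typing import Any


def unflatten(flat: dict[str, Any], sep: str = ".") -> dict[str, Any]:
    """Rebuild a nested dict from flattened keys such as 'image.caption'."""
    nested: dict[str, Any] = {}
    for path, value in flat.items():
        *parents, leaf = path.split(sep)
        node = nested
        for key in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


# Hypothetical flattened media fields.
flat_media = {
    "image.bytes": b"...",
    "image.caption": "a cat",
    "audio.transcript": "hello",
}
nested_media = unflatten(flat_media)
```

Here `nested_media` groups `bytes` and `caption` under a single `image` entry, the same way hierarchical metadata fields are grouped.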

We should have a way to export directly back to a HuggingFace dataset. In that flow, you could see Lilac as a more complex `HuggingFaceDataset.map()`.

See: https://developer.nvidia.com/blog/gpu-accelerated-hierarchical-dbscan-with-rapids-cuml-lets-get-back-to-the-future/ This requires some fancy pip installation steps that I couldn't get working in 5 minutes. We should consider a deeper dive here, as the GPU-accelerated version is *much* faster.