vectorflow
VectorFlow is a high-volume vector embedding pipeline that ingests raw data, transforms it into vectors, and writes it to a vector DB of your choice.
The Hugging Face, vector DB upload, and OpenAI embeddings workers all need a retry mechanism. The queue system could be leveraged for this, either a general retry queue at each...
VectorFlow is a mono-repo that contains different representations of the same underlying concept. The classes that appear in both `src/models` and `client/models` have the same fields that need to...
Methods such as `update_batch_and_job_status` are duplicated across the code so that modules do not depend on each other. We need to 1) find and document all the duplicated code...
Our Docker builds are too slow because every build reinstalls dependencies that rarely change. Refactor our Docker images so that all the dependencies have their own build base image...
Add functionality to stream in a whole directory from S3. This will likely require a separate worker to be spun up to stream in the files so it does not...
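As a rough illustration of the queue-based retry idea raised for the workers above, here is a minimal sketch using only Python's standard library. It is not VectorFlow's implementation: `RetryableTask`, `process_with_retry_queue`, and the `MAX_ATTEMPTS` limit are hypothetical names, and an in-process `queue.Queue` stands in for whatever message queue the workers actually use.

```python
import queue
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple

MAX_ATTEMPTS = 3  # assumed limit; the source does not specify one


@dataclass
class RetryableTask:
    """Hypothetical wrapper that tracks how often a payload was tried."""
    payload: Any
    attempts: int = 0


def process_with_retry_queue(
    payloads: Iterable[Any],
    handler: Callable[[Any], Any],
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[List[Any], List[Any]]:
    """Run handler on each payload, re-queueing failures until the limit.

    Returns (results of successful calls, payloads that exhausted retries).
    """
    work: "queue.Queue[RetryableTask]" = queue.Queue()
    for payload in payloads:
        work.put(RetryableTask(payload))

    succeeded: List[Any] = []
    failed: List[Any] = []
    while not work.empty():
        task = work.get()
        task.attempts += 1
        try:
            succeeded.append(handler(task.payload))
        except Exception:
            if task.attempts < max_attempts:
                work.put(task)  # send to the back of the queue for a retry
            else:
                failed.append(task.payload)  # give up; a dead-letter spot
    return succeeded, failed
```

A real worker would likely add a delay or backoff between attempts and publish exhausted tasks to a separate dead-letter queue; both are left out here to keep the sketch short.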