Robbe Sneyders
By @PhilippeMoussalli: Conclusions from https://github.com/ml6team/fondant/pull/489
We need to find a way to scale across GPUs. Possible options:
* Multiple GPUs can be loaded for inference using PyTorch [Data Parallelism](https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html) (this does...
When running a pipeline with a step that is missing a required argument, an error is only raised at runtime. We should be able to validate the arguments at compilation...
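To illustrate the idea of catching a missing required argument before the pipeline runs, here is a minimal sketch. It does not use Fondant's actual API: `ArgSpec`, `InvalidArgumentsError`, and `validate_arguments` are hypothetical names, and the argument specification is assumed to be available as a plain list at compile time.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ArgSpec:
    """Hypothetical description of one component argument."""
    name: str
    required: bool = True


class InvalidArgumentsError(ValueError):
    """Raised at compile time when a step's arguments do not match its spec."""


def validate_arguments(
    component: str, spec: List[ArgSpec], provided: Dict[str, Any]
) -> None:
    """Check the provided arguments against the spec before anything runs."""
    known = {arg.name for arg in spec}
    missing = sorted(a.name for a in spec if a.required and a.name not in provided)
    unknown = sorted(set(provided) - known)
    if missing or unknown:
        raise InvalidArgumentsError(
            f"{component}: missing required arguments {missing}, "
            f"unknown arguments {unknown}"
        )


# Hypothetical spec for an illustrative component.
spec = [ArgSpec("model_id"), ArgSpec("batch_size", required=False)]

# Passes silently: the required argument is present.
validate_arguments("caption_images", spec, {"model_id": "some-model"})
```

Calling such a check from the compile step would surface the error before any container is started, rather than partway through a run.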
The image gallery view of the data explorer fails when there are no images in the dataset. > TypeError: load_pandas_dataframe() missing 1 required keyword-only argument: 'selected_fields' > Traceback: > File...
When the user uses Lightweight Python components (https://github.com/ml6team/fondant/issues/558) we want to get any information we currently get from the component spec from the provided Python code. For the `consumes` section,...
Branching pipelines are not yet supported, but we're no longer properly validating the pipeline for branches. When applying multiple operations to the same dataset with the docker runner, the following...
Currently, users who try to pull images on an M1 (Apple silicon) machine (e.g. using the local runner) will run into the following issue: ``` no matching manifest for linux/arm64/v8 in the...
We currently only support linear DAGs, which limits the expressiveness of Fondant pipelines. To support non-linear DAGs, we would need at least the following functionality: - Compiling to non-linear DAGs...
To make it easier to create custom components which are too complex to be a lightweight Python component (#558), we could create a cookie-cutter, which generates a component template and...
By @philippe-ml6: Implementing GPU components can be a bit tricky since we need to make sure that both preprocessing and inference are batched. It would be nice if we could implement...
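As a rough illustration of what "both preprocessing and inference are batched" could look like, the sketch below groups inputs into fixed-size batches and passes each batch through two batch-level functions. It is framework-agnostic and not taken from Fondant: `batched`, `run_batched`, `preprocess_batch`, and `infer_batch` are hypothetical names, and a real GPU component would replace the lambdas with tensor operations.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")
V = TypeVar("V")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_batched(
    items: Iterable[T],
    preprocess_batch: Callable[[List[T]], List[U]],
    infer_batch: Callable[[List[U]], List[V]],
    batch_size: int,
) -> Iterator[V]:
    """Apply batch-level preprocessing, then batch-level inference."""
    for batch in batched(items, batch_size):
        inputs = preprocess_batch(batch)
        yield from infer_batch(inputs)


# Stand-in functions: double each value, then add one.
results = list(
    run_batched(
        range(5),
        preprocess_batch=lambda batch: [x * 2 for x in batch],
        infer_batch=lambda inputs: [x + 1 for x in inputs],
        batch_size=2,
    )
)
```

Keeping both stages batch-level means the same batch size controls how much data is prepared and how much is sent to the model at once.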
Fondant currently supports execution across cores on a single machine, which enables workflows using millions of images, but becomes a bottleneck when scaling to tens or hundreds of millions of...