cleanvision
Automatically find issues in image datasets and practice data-centric computer vision.
Possible places to update:
- CI pipeline
- pyproject package dependencies
- `cleanvision.__version__`
Pre-commit hooks could be optimized (e.g. using ruff instead of flake8 and black)
`imagelab.visualize()` picks random images from the dataset and visualizes them in a grid.
Existing tests: https://github.com/cleanlab/cleanvision/blob/main/tests/test_visualize.py
Things to test:
- Test that `imagelab.visualize(issue_types)` works as intended for all 3 types...
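The "pick random images and lay them out in a grid" behavior described above can be tested without rendering anything if the sampling and layout logic is separated from plotting. The following is a minimal sketch under that assumption; `sample_for_grid` and its parameters are hypothetical names for illustration and are not part of the CleanVision API.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple


def sample_for_grid(
    image_paths: Sequence[str],
    num_images: int = 4,
    cols: int = 2,
    seed: Optional[int] = None,
) -> Tuple[List[str], Tuple[int, int]]:
    """Pick up to `num_images` distinct random paths and compute a (rows, cols) grid shape.

    Hypothetical helper: this is not how CleanVision implements visualize().
    """
    rng = random.Random(seed)  # seeded generator keeps tests reproducible
    k = min(num_images, len(image_paths))
    chosen = rng.sample(list(image_paths), k)  # sampling without replacement
    ncols = max(1, min(cols, k))  # never more columns than images
    nrows = math.ceil(k / ncols) if k else 0
    return chosen, (nrows, ncols)
```

A test can then assert on the number of selected images and the grid shape, which is deterministic even though the selection itself is random.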
Goal: create a static page in our [docs](https://cleanvision.readthedocs.io/en/latest/) that shows a continuously growing list of images corresponding to different types of issues detected by CleanVision. The page should be split...
`imagelab.info` is an `Imagelab` class attribute designed to be a nested dictionary containing information relevant to the computation of issue types. Right now the type assigned to imagelab.info is `Dict[str,...
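To illustrate what a more explicit type for a nested dictionary like this could look like, here is a minimal sketch. The alias `InfoDict`, the helper `get_info`, and the example keys are all hypothetical; they do not reproduce the annotation currently used in the library.

```python
from typing import Any, Dict

# Hypothetical alias: top-level section name -> dict of named values.
InfoDict = Dict[str, Dict[str, Any]]


def get_info(info: InfoDict, section: str, key: str, default: Any = None) -> Any:
    """Return info[section][key], or `default` if the section or key is missing."""
    return info.get(section, {}).get(key, default)


# Example data with made-up keys, for illustration only.
example_info: InfoDict = {
    "statistics": {"brightness": [0.1, 0.9]},
    "dark": {"num_images": 1},
}
print(get_info(example_info, "statistics", "brightness"))
print(get_info(example_info, "blurry", "num_images", 0))
```

A named alias keeps the nesting visible at call sites, and a single accessor gives one place to tighten the types later.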
Try to speed up the runtime of this library on large datasets. This can be done via:
- [ ] speeding up individual checks
- [ ] reusing more computation...
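One way to read "reusing more computation" is to compute each per-image statistic once and share it between the checks that need it. The sketch below shows that idea in plain Python; `SharedStats`, `mean_brightness`, `is_dark`, `is_light`, and the thresholds are assumptions made for illustration, not CleanVision internals.

```python
from typing import Callable, Dict, List


class SharedStats:
    """Caches per-image statistics so several checks can reuse one computation."""

    def __init__(self, pixels: List[int]) -> None:
        self.pixels = pixels  # grayscale intensities in 0..255
        self._cache: Dict[str, float] = {}
        self.compute_calls = 0  # counts real computations, for demonstration

    def get(self, name: str, fn: Callable[[List[int]], float]) -> float:
        # Compute the statistic only on first request; later calls hit the cache.
        if name not in self._cache:
            self.compute_calls += 1
            self._cache[name] = fn(self.pixels)
        return self._cache[name]


def mean_brightness(pixels: List[int]) -> float:
    """Mean intensity scaled to [0, 1]; returns 0.0 for an empty image."""
    return sum(pixels) / (255 * len(pixels)) if pixels else 0.0


def is_dark(stats: SharedStats, threshold: float = 0.2) -> bool:
    # Hypothetical threshold, chosen only for this example.
    return stats.get("brightness", mean_brightness) < threshold


def is_light(stats: SharedStats, threshold: float = 0.8) -> bool:
    # Reuses the brightness value already computed by is_dark, if any.
    return stats.get("brightness", mean_brightness) > threshold
```

Running both checks on the same `SharedStats` instance computes brightness once, which is the kind of saving that adds up across a large dataset.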
The cards should summarize the issues in a dataset, like those in the CleanVision blog post: https://cleanlab.ai/blog/cleanvision/ Here's an example of a desired output from Caltech-256:

```
import cv2
import numpy as np

# Convert both RGB images to grayscale, then take the per-pixel absolute difference.
base_image_gray = cv2.cvtColor(np.array(base_image), cv2.COLOR_RGB2GRAY)
near_duplicate_img_gray = cv2.cvtColor(np.array(near_duplicate_img), cv2.COLOR_RGB2GRAY)
diff_map = cv2.absdiff(base_image_gray, near_duplicate_img_gray)
```

Also, need...