Sanjana
Update docs of cleanlab to include a table that summarizes all types of issues detected by Datalab
It could include the following columns: issue type | short desc | supported modalities | supported tasks | required args in `find_issues`. Right now, this information is not easily visible.
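As a rough illustration of the proposed table, the sketch below renders the suggested columns as a Markdown table. The column names come from the proposal above; the helper `render_markdown_table`, the `EXAMPLE_ROWS` list, and every `TODO` cell are hypothetical placeholders, not facts about Datalab.

```python
# Column names are taken from the proposal; everything else is a placeholder.
COLUMNS = [
    "issue type",
    "short desc",
    "supported modalities",
    "supported tasks",
    "required args in find_issues",
]


def render_markdown_table(rows):
    """Render a list of dicts (keyed by COLUMNS) as a Markdown table string."""
    header = "| " + " | ".join(COLUMNS) + " |"
    divider = "| " + " | ".join("---" for _ in COLUMNS) + " |"
    body = [
        "| " + " | ".join(str(row.get(col, "TODO")) for col in COLUMNS) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


# Hypothetical rows: only the issue type names are known here; the remaining
# cells would be filled in from the actual issue managers.
EXAMPLE_ROWS = [
    {"issue type": "label"},
    {"issue type": "null"},
]

if __name__ == "__main__":
    print(render_markdown_table(EXAMPLE_ROWS))
```

Keeping the rows as plain dictionaries would make it straightforward to generate the table from one place instead of maintaining it by hand in the docs.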
`lab.find_issues(features=features)` output:
```
/Users/sanjana/cleanlab_home/fork_cleanlab/cleanlab/datalab/internal/issue_finder.py:457: UserWarning: No labels were provided. The 'label' issue type will not be run.
  warnings.warn("No labels were provided. " "The 'label' issue type will not be run.")
...
```
Error:
```
Finding null issues ...
Error in null: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to...
```
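The `isnan` message above is the `TypeError` that NumPy raises when `np.isnan` receives non-numeric input, for example an object-dtype array that mixes strings and numbers. Whether that is the cause in this report is an assumption. The sketch below shows a type-tolerant null check in pure Python; `is_null` and `null_mask` are hypothetical helpers for illustration, not cleanlab's implementation.

```python
import math
from numbers import Real


def is_null(value):
    """Return True for None or a float NaN; never raise on non-numeric values."""
    if value is None:
        return True
    # Only real numbers can be NaN; strings and other objects are treated
    # as non-null instead of being passed to math.isnan (which would raise).
    if isinstance(value, Real):
        return math.isnan(value)
    return False


def null_mask(rows):
    """Build a boolean mask with the same shape as a list-of-lists feature table."""
    return [[is_null(value) for value in row] for row in rows]


# Mixed-type features: the kind of input a plain numeric isnan check rejects.
features = [
    [1.0, "red", None],
    [float("nan"), "blue", 3],
]
mask = null_mask(features)
```

The design choice here is to decide per value whether a NaN test is meaningful, rather than coercing the whole table to a numeric dtype first.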
Possible places to update:
- CI pipeline
- pyproject package dependencies
- `cleanvision.__version__`
`imagelab.visualize()` picks random images from the dataset and visualizes them in a grid. Existing tests: https://github.com/cleanlab/cleanvision/blob/main/tests/test_visualize.py Things to test: - Test that `imagelab.visualize(issue_types)` works as intended for all 3 types...
`imagelab.info` is an `Imagelab` class attribute designed to be a nested dictionary containing information relevant to the computation of issue types. Right now, the type assigned to `imagelab.info` is `Dict[str,...
The diff image should look like this:
```
import cv2
import numpy as np

# Convert both images to grayscale, then take the per-pixel absolute difference
base_image_gray = cv2.cvtColor(np.array(base_image), cv2.COLOR_RGB2GRAY)
near_duplicate_img_gray = cv2.cvtColor(np.array(near_duplicate_img), cv2.COLOR_RGB2GRAY)
diff_map = cv2.absdiff(base_image_gray, near_duplicate_img_gray)
```
Also, need...
Add unit tests to ensure we can:
- separately search for exact-duplicates without searching for near-duplicates
- separately search for near-duplicates without searching for exact-duplicates
- search for both in...