
Client interface for all things Cleanlab Studio

6 cleanlab-studio issues

Add `Studio.delete_dataset` method to API
Backend endpoint is live in `dev` and `staging`. Backend PR: https://github.com/cleanlab/cleanlab-studio-backend/pull/1523
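A minimal sketch of how the proposed method might be called from the client; the method name comes from the issue title, but its signature and the sample data below are assumptions, not the merged implementation:

```
import os

import pandas as pd
from cleanlab_studio import Studio

studio = Studio(os.environ["CLEANLAB_API_KEY"])
dataset_id = studio.upload_dataset(pd.DataFrame({"text": ["a", "b"], "label": [0, 1]}))

# Proposed method; assumed to delete the dataset by ID via the new backend endpoint:
studio.delete_dataset(dataset_id)
```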

Skeleton code for improved Auto-Fix strategies

```
import os

import pandas as pd

from cleanlab_studio import Studio

API_KEY = os.environ['CLEANLAB_API_KEY']
studio = Studio(API_KEY)

df = pd.DataFrame(...)
dataset_id = studio.upload_dataset(df)
project_id = studio.create_project(dataset_id=dataset_id, ...)
cleanset_id = studio.get_latest_cleanset_id(project_id)
...
```
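The truncated tail presumably continues into the strategies themselves; for orientation, a hedged sketch of how a strategy might consume the cleanset using existing client methods (how an improved Auto-Fix strategy would combine them is exactly the open design question here):

```
# Hedged sketch, continuing from the skeleton above.
# Per-row issue scores and suggested labels for the cleanset:
cleanlab_columns = studio.download_cleanlab_columns(cleanset_id)
# Apply the corrections made in the cleanset back onto the original data:
fixed_df = studio.apply_corrections(cleanset_id, df)
```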

as title, so we don't have to keep things updated with stable version

In the original DataFrame upload implementation for Snowflake and PySpark DataFrames, the DataFrames are loaded entirely into memory before being uploaded ([PySpark](https://github.com/cleanlab/cleanlab-studio/blob/e2736b0599e3c0ee3960d841236d5537e156497f/cleanlab_studio/internal/dataset_source/pyspark_dataset_source.py#L19), [Snowflake](https://github.com/cleanlab/cleanlab-studio/blob/e2736b0599e3c0ee3960d841236d5537e156497f/cleanlab_studio/internal/dataset_source/snowpark_dataset_source.py#L19C9-L19C45)). However, this can cause problems if...
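A minimal sketch of a streaming alternative for the PySpark case, assuming the uploader can consume an iterable of row batches (the commented-out `upload_rows` consumer is hypothetical; only `toLocalIterator()` and `Row.asDict()` are real PySpark APIs):

```
from itertools import islice

def iter_row_batches(spark_df, batch_size=10_000):
    """Stream a PySpark DataFrame in fixed-size batches without
    collecting the whole DataFrame into driver memory (unlike .toPandas())."""
    rows = spark_df.toLocalIterator()  # lazily pulls partitions to the driver
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            break
        yield [row.asDict() for row in batch]

# Hypothetical consumer; the real uploader interface may differ:
# for batch in iter_row_batches(spark_df):
#     upload_rows(batch)
```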

Adding support for importing image datasets stored in Snowflake stages and DBFS, following the [simple layout](https://help.cleanlab.ai/guide/concepts/datasets/#simple-zip) and [metadata layout](https://help.cleanlab.ai/guide/concepts/datasets/#metadata-zip).
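For reference, a minimal sketch of packaging a local image folder into the simple layout (one subdirectory per class, per the linked guide) before staging it; the paths and class names are illustrative:

```
import shutil

# Expected on-disk structure for the simple layout:
#   images/
#     cat/  img001.png, img002.png, ...
#     dog/  img101.png, ...
# Zipping the top-level folder yields images.zip in the simple layout.
shutil.make_archive("images", "zip", root_dir=".", base_dir="images")
# The resulting archive could then be PUT to a Snowflake stage or copied
# to DBFS; importing from those locations is what this issue proposes.
```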

Split each batch into sub-batches so that users don't have to deal with timeout problems when using the inference API. See this Notion doc: https://www.notion.so/cleanlab/Model-Inference-Python-API-d5209e022f744298835010dbbfb26a4c. A sub-batch size of 10,000 for text and 100,000 seems...
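A minimal sketch of the sub-batching idea, assuming a per-modality size cap; the commented-out `run_inference` call is a placeholder, not the actual inference API:

```
def sub_batches(items, max_size):
    """Yield consecutive slices of `items`, each no larger than `max_size`."""
    for start in range(0, len(items), max_size):
        yield items[start:start + max_size]

# Illustrative cap from the issue (10,000 rows per sub-batch for text):
# predictions = []
# for chunk in sub_batches(texts, 10_000):
#     predictions.extend(run_inference(chunk))  # placeholder call
```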