Minor Issues with Colab Notebook in 'Annotate text data using Active Learning with Cleanlab'
In Annotate text data using Active Learning with Cleanlab, there are a few minor issues that prevent the Colab notebook from running correctly:
- The `datasets` package version used in the notebook raises an error with Pandas >2.0.0. This issue is already addressed here.
- Downloading the dataset with the `load_dataset` function raises an error. The same error can also be seen here. The error message is: All the data files must have the same columns, but at some point there are 33 missing columns ({'a93', 'a16', 'a6', 'a19', 'a98', 'a120', 'a100', 'a61', 'a215', 'a52', 'a216', 'a157', 'a196', 'a65', 'a12', 'a193', 'a99', 'a39', 'a20', 'a60', 'a131', 'a22', 'a42', 'a68', 'a185', 'a151', 'a70', 'a102', 'a127', 'a158', 'a180', 'a197', 'a178'}).
- A couple of links in the final part of the notebook do not point to the correct download locations, which prevents the correct file from being downloaded. For example, this command saves its output as random_acc.npy but requests activelearn_acc.npy:
  !wget -nc -O 'random_acc.npy' 'https://huggingface.co/datasets/Cleanlab/stanford-politeness/blob/main/activelearn_acc.npy?raw=true'
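One possible way to avoid this kind of mismatch is to derive the download URL from the local file name, so the two cannot drift apart. The sketch below is only an illustration, not the notebook's code: `resolve_url` and `download` are hypothetical helpers, it uses the Hugging Face `resolve/<revision>/<file>` URL form rather than the `blob/...?raw=true` form in the wget command above, and it assumes that a file named random_acc.npy actually exists in the Cleanlab/stanford-politeness dataset repository.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

# Dataset repository named in the wget command above.
REPO_ID = "Cleanlab/stanford-politeness"


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL for a file in a Hugging Face dataset repo.

    Hypothetical helper: it only formats a string and does not check
    that the file exists in the repository.
    """
    return (
        f"https://huggingface.co/datasets/{repo_id}"
        f"/resolve/{revision}/{quote(filename)}"
    )


def download(filename: str, repo_id: str = REPO_ID) -> str:
    """Download `filename` and save it locally under the same name.

    Using one name for both the URL and the output path avoids saving
    one file's contents under another file's name.
    """
    local_path, _headers = urlretrieve(resolve_url(repo_id, filename), filename)
    return local_path
```

In the notebook, calling `download("random_acc.npy")` and `download("activelearn_acc.npy")` would then fetch each file under its own name, provided both files are present in the repository.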