
Request for Labeling Functions

Open yongzx opened this issue 4 years ago • 4 comments

Thanks for open-sourcing the code! Quick question: where can I find the additional labeling functions for Amazon, IMDB (136 LFs), and BiasBios (99 LFs)?

yongzx, Dec 03 '21 16:12

Hi :), you can switch to the 'research_code' branch. There you'll find all that info -- most relevant: "The smallest dataset, Bias in Bios is already included in the data subdirectory. All the rest can be downloaded from this Google drive link. Please put the downloaded data into the data/ directory."

salvaRC, Dec 03 '21 22:12

where the gdrive link is https://drive.google.com/drive/folders/1v7IzA3Ab5zDEsRpLBWmJnXo5841tSOlh?usp=sharing

salvaRC, Dec 03 '21 22:12

Hi @salvaRC, thanks for the prompt reply! Quick follow-up on the Drive folder: I could only find the existing label votes, not the code for the labeling functions themselves. Am I missing something?

yongzx, Dec 04 '21 03:12

Ahh, now I understand -- good point, @yongzx! Indeed, we hadn't uploaded any LF definitions.

In the research_code branch, I just uploaded the files lfs_to_use_.txt to the data subdir. They contain all the LF definitions we used (simple keyword detectors). The steps for deriving the label matrices that we uploaded are described in the updated README: https://github.com/autonlab/weasel/blob/research_code/data/README.md
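
For illustration only, here is a minimal sketch of what a keyword-detector LF and the resulting label matrix could look like. It is not the repository's code: the helper names (`make_keyword_lf`, `apply_lfs`), the example keywords, the label values, and the `-1` abstain convention are all assumptions made for this example, and the README linked above remains the reference for the actual procedure.

```python
ABSTAIN = -1  # assumed convention: -1 means the LF casts no vote


def make_keyword_lf(keyword, label):
    """Return an LF that votes `label` if `keyword` occurs in the text, else abstains."""
    keyword = keyword.lower()

    def lf(text):
        # Case-insensitive substring match
        return label if keyword in text.lower() else ABSTAIN

    return lf


def apply_lfs(lfs, texts):
    """Build a label matrix with one row per text and one column per LF."""
    return [[lf(text) for lf in lfs] for text in texts]


# Hypothetical keywords for a binary sentiment task (1 = positive, 0 = negative)
lfs = [make_keyword_lf("great", 1), make_keyword_lf("terrible", 0)]
texts = ["A great movie", "Terrible plot", "It was fine"]
label_matrix = apply_lfs(lfs, texts)
```

Each row of `label_matrix` holds the votes of all LFs on one example, so a text that matches no keyword ends up as a row of abstains.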

I hope that helps & let me know if anything is unclear :)

salvaRC, Dec 07 '21 04:12