data_tooling
Create dataset offenseval_dravidian
- uid: offenseval_dravidian
- type: primary
- description:
- name: Offenseval Dravidian
- description: Gold standard corpus for offensive language identification of code-mixed text in Dravidian languages (Tamil-English, Malayalam-English, and Kannada-English).
- homepage: https://huggingface.co/datasets/offenseval_dravidian
- validated: True
- languages:
- language_names:
- Indic
- Tamil
- Kannada
- Malayalam
- language_comments:
- language_locations:
- Southern Asia
- India
- validated: False
- language_names:
- custodian:
- name: Bharathi Raja Asoka Chakravarthi
- in_catalogue:
- type: A private individual
- location: Ireland
- contact_name: Bharathi Raja Asoka Chakravarthi
- contact_email: [email protected]
- contact_submitter: False
- additional: https://github.com/bharathichezhiyan
- validated: False
- availability:
- procurement:
- for_download: Yes - it has a direct download link or links
- download_url: https://zenodo.org/record/4750858#.YZZArr1BydZ, https://huggingface.co/datasets/offenseval_dravidian
- download_email:
- licensing:
- has_licenses: Yes
- license_text:
- license_properties:
- open license
- license_list:
- cc-by-4.0: Creative Commons Attribution 4.0 International
- pii:
- has_pii: Yes
- generic_pii_likely: somewhat likely
- generic_pii_list:
- dates (birth, death, etc.)
- numeric_pii_likely: none
- numeric_pii_list:
- sensitive_pii_likely: somewhat likely
- sensitive_pii_list:
- racial or ethnic origin
- religious or philosophical beliefs
- political opinions
- data concerning a person's sex life or sexual orientation
- no_pii_justification_class:
- no_pii_justification_text:
- validated: False
- procurement:
- source_category:
- category_type: website
- category_web: social media
- category_media:
- validated: False
- media:
- category:
- text
- text_format:
- .CSV
- audiovisual_format:
- image_format:
- database_format:
- .CSV
- text_is_transcribed: No
- instance_type: Sentence and its offensive index
- instance_count: 10K<n<100K
- instance_size: 10<n<100
- validated: False
- category:
- fname: offenseval_dravidian.json
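
Most sections of the entry above carry a `validated` flag, and only the description is marked `True`. As a minimal sketch of how such a record could be checked once it is stored as JSON, the snippet below lists the sections still awaiting validation. The nesting of the dictionary and the helper name `unvalidated_sections` are assumptions made for illustration; only the field names and values are taken from the entry, and the actual layout of `offenseval_dravidian.json` may differ.

```python
import json

# Hypothetical nesting: field names and values come from the entry above,
# but the real JSON layout of offenseval_dravidian.json is an assumption.
entry = {
    "uid": "offenseval_dravidian",
    "type": "primary",
    "description": {"name": "Offenseval Dravidian", "validated": True},
    "languages": {
        "language_names": ["Indic", "Tamil", "Kannada", "Malayalam"],
        "validated": False,
    },
    "custodian": {"location": "Ireland", "validated": False},
    "availability": {
        "licensing": {"license_list": ["cc-by-4.0"]},
        "validated": False,
    },
}


def unvalidated_sections(record):
    """Return the names of top-level sections whose 'validated' flag is False."""
    return [
        name
        for name, section in record.items()
        if isinstance(section, dict) and section.get("validated") is False
    ]


# Round-trip through JSON to mimic reading the stored file.
record = json.loads(json.dumps(entry))
pending = unvalidated_sections(record)
print(pending)  # ['languages', 'custodian', 'availability']
```

Scalar fields such as `uid` and `type` are skipped because they have no `validated` flag of their own.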