
Create dataset offenseval_dravidian

Open albertvillanova opened this issue 4 years ago • 0 comments

  • uid: offenseval_dravidian
  • type: primary
  • description:
    • name: Offenseval Dravidian
    • description: Gold standard corpus for offensive language identification of code-mixed text in Dravidian languages (Tamil-English, Malayalam-English, and Kannada-English).
    • homepage: https://huggingface.co/datasets/offenseval_dravidian
    • validated: True
  • languages:
    • language_names:
      • Indic
      • Tamil
      • Kannada
      • Malayalam
    • language_comments:
    • language_locations:
      • Southern Asia
      • India
    • validated: False
  • custodian:
    • name: Bharathi Raja Asoka Chakravarthi
    • in_catalogue:
    • type: A private individual
    • location: Ireland
    • contact_name: Bharathi Raja Asoka Chakravarthi
    • contact_email: [email protected]
    • contact_submitter: False
    • additional: https://github.com/bharathichezhiyan
    • validated: False
  • availability:
    • procurement:
      • for_download: Yes - it has a direct download link or links
      • download_url: https://zenodo.org/record/4750858#.YZZArr1BydZ , https://huggingface.co/datasets/offenseval_dravidian
      • download_email:
    • licensing:
      • has_licenses: Yes
      • license_text:
      • license_properties:
        • open license
      • license_list:
        • cc-by-4.0: Creative Commons Attribution 4.0 International
    • pii:
      • has_pii: Yes
      • generic_pii_likely: somewhat likely
      • generic_pii_list:
        • dates (birth, death, etc.)
      • numeric_pii_likely: none
      • numeric_pii_list:
      • sensitive_pii_likely: somewhat likely
      • sensitive_pii_list:
        • racial or ethnic origin
        • religious or philosophical beliefs
        • political opinions
        • data concerning a person's sex life or sexual orientation
      • no_pii_justification_class:
      • no_pii_justification_text:
    • validated: False
  • source_category:
    • category_type: website
    • category_web: social media
    • category_media:
    • validated: False
  • media:
    • category:
      • text
    • text_format:
      • .CSV
    • audiovisual_format:
    • image_format:
    • database_format:
      • .CSV
    • text_is_transcribed: No
    • instance_type: Sentence and its offensive index
    • instance_count: 10K<n<100K
    • instance_size: 10<n<100
    • validated: False
  • fname: offenseval_dravidian.json
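
Most sections of this entry still carry `validated: False`. As a minimal sketch of how the pending sections could be listed, the snippet below assumes that `offenseval_dravidian.json` stores each top-level section as an object with its own `validated` flag. The JSON layout, the abridged `ENTRY_JSON` excerpt, and the helper `unvalidated_sections` are hypothetical and only mirror the fields shown above.

```python
import json

# Hypothetical, abridged excerpt of the entry. The real layout of
# offenseval_dravidian.json is an assumption; values are copied from the list above.
ENTRY_JSON = """
{
  "uid": "offenseval_dravidian",
  "type": "primary",
  "description": {"name": "Offenseval Dravidian", "validated": true},
  "languages": {"language_names": ["Indic", "Tamil", "Kannada", "Malayalam"], "validated": false},
  "custodian": {"type": "A private individual", "validated": false},
  "availability": {"licensing": {"has_licenses": "Yes"}, "validated": false},
  "source_category": {"category_type": "website", "validated": false},
  "media": {"instance_count": "10K<n<100K", "validated": false}
}
"""


def unvalidated_sections(entry: dict) -> list[str]:
    """Return the names of top-level sections whose 'validated' flag is not True."""
    return [
        name
        for name, section in entry.items()
        # Scalar fields such as 'uid' and 'type' carry no flag and are skipped.
        if isinstance(section, dict) and section.get("validated") is not True
    ]


entry = json.loads(ENTRY_JSON)
pending = unvalidated_sections(entry)
print(pending)
```

A section that has no `validated` key at all is treated as pending here, which errs on the side of flagging it for review.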

albertvillanova • Jan 19 '22 08:01