
Expand Kurmanji data queries

Open andrewtavis opened this issue 1 year ago • 8 comments

Description

This issue would look into expanding the src/scribe_data/language_data_extraction/Kurmanji files with as much data as possible from what is currently on Wikidata. We can start from the code used to get data for other languages, and from there check the Kurmanji data on Wikidata to see which conjugations are available. We can then expand the query with optional selections of certain forms, as is done in the other SPARQL queries. The query can be tested in the Wikidata Query Service UI during development :)
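To illustrate the "optional selections of certain forms" idea, here is a minimal sketch that assembles such a query as a string, which can then be pasted into the Wikidata Query Service UI. The helper `build_lexeme_query`, the form name `presentFirstSingular`, and every QID passed in below are assumptions for illustration only and are not taken from the Scribe-Data code. Please verify the QIDs on Wikidata, and check which forms Kurmanji lexemes actually have, before relying on any of it.

```python
from __future__ import annotations


def build_lexeme_query(
    language_qid: str,
    category_qid: str,
    optional_forms: dict[str, list[str]],
) -> str:
    """Build a SPARQL query for lexemes of one language and lexical category.

    Hypothetical helper, not part of Scribe-Data. Each entry in
    ``optional_forms`` maps an output variable name to the grammatical
    feature QIDs that a form must carry to be selected.
    """
    optional_blocks = []
    for name, feature_qids in optional_forms.items():
        features = ", ".join(f"wd:{qid}" for qid in feature_qids)
        # OPTIONAL keeps lexemes in the results even when this form is missing.
        optional_blocks.append(
            "  OPTIONAL {\n"
            f"    ?lexeme ontolex:lexicalForm ?{name}Form .\n"
            f"    ?{name}Form ontolex:representation ?{name} ;\n"
            f"      wikibase:grammaticalFeature {features} .\n"
            "  }"
        )

    lines = [
        "SELECT",
        '  (REPLACE(STR(?lexeme), "http://www.wikidata.org/entity/", "") AS ?lexemeID)',
        "  ?lemma",
        *[f"  ?{name}" for name in optional_forms],
        "",
        "WHERE {",
        f"  ?lexeme dct:language wd:{language_qid} ;",
        f"    wikibase:lexicalCategory wd:{category_qid} ;",
        "    wikibase:lemma ?lemma .",
        *optional_blocks,
        "}",
    ]
    return "\n".join(lines)


# Assumed QIDs, to be checked on Wikidata before use:
# Q36163 (Kurmanji), Q24905 (verb), Q192613 (present tense),
# Q21714344 (first person), Q110786 (singular).
query = build_lexeme_query(
    language_qid="Q36163",
    category_qid="Q24905",
    optional_forms={"presentFirstSingular": ["Q192613", "Q21714344", "Q110786"]},
)
print(query)
```

Because every form sits in its own OPTIONAL block, a lexeme with no matching form still appears in the results with that column left empty, so the same query keeps working as more Kurmanji data is added.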

Data types to include:

  • [x] Nouns
  • [x] Verbs
  • [x] Adjectives
  • [x] Adverbs
  • [x] Prepositions
  • [ ] Emoji keywords

Contribution

Happy to support the development and review when the PR is up 😊

andrewtavis avatar Oct 03 '24 22:10 andrewtavis

@andrewtavis I will work on this!

Khushalsarode avatar Oct 06 '24 23:10 Khushalsarode

Thanks for picking it up, @Khushalsarode! Let us know if you need any support :)

andrewtavis avatar Oct 07 '24 01:10 andrewtavis

@andrewtavis would you like me to work on Marathi entity verbs/prepositions on Wikidata?

Khushalsarode avatar Oct 07 '24 22:10 Khushalsarode

That's kind of up to you, @Khushalsarode :) If you'd like to expand them, then that would be really cool! I'm not sure if we can count Wikidata edits for Outreachy or Hacktoberfest, but if you're interested in editing a bit, then by all means!

andrewtavis avatar Oct 07 '24 22:10 andrewtavis

Just added a list of data types that we want to include to this issue :) Have marked those that are already done or have PRs open, and we can work on the others 😊 If the data type can't work, then we can move to the others and open up specific issues later :)

andrewtavis avatar Oct 09 '24 08:10 andrewtavis

Let's check the other data types above to see if there is available data, @Khushalsarode :) Thanks for your efforts here!

andrewtavis avatar Oct 09 '24 21:10 andrewtavis

@andrewtavis already on it! I think there are quite few data entities for this language.

Khushalsarode avatar Oct 09 '24 22:10 Khushalsarode

Yes, sadly, but we can still map out the queries for what's there, as the data will be there eventually :)

andrewtavis avatar Oct 09 '24 23:10 andrewtavis

Closing this up @Khushalsarode as the plan in #379 is to centralize the emoji data needed. Sending a commit along with the __init__.py for it :)

andrewtavis avatar Oct 16 '24 12:10 andrewtavis