byt5-geotagging
`[Challenge]` Top regions
Challenge 1
The goal of this competition is to improve on Yachay.ai's infrastructure for training a deep learning model that predicts the coordinates (latitude, longitude) of individual texts.
The first suggested approach to training the model is to use the annotated dataset of texts posted from 123 populated regions around the planet.
There are no hard MSE or EER requirements; we're looking for innovative ideas for infrastructure development.
The dataset provided includes:
- an annotated corpus of 500k texts with their respective geocoordinates
- 123 regions covered
- 5000 tweets per location
The data set is here
Deliverable
- A model that takes a text as input and returns coordinates as output
- Evaluation metrics obtained on the development dataset, including Mean Absolute Error in kilometers.
We will evaluate the model on a test dataset that is not shared here.
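The "Mean Absolute Error in kilometers" metric could be computed along the following lines. This is only a minimal sketch, assuming the error for each text is the great-circle (haversine) distance between the predicted and true coordinates; the challenge text does not specify the distance formula. The function names `haversine_km` and `mean_absolute_error_km` and the Earth-radius constant are illustrative choices, not part of the Yachay.ai codebase.

```python
import math

# Mean Earth radius in kilometers (assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def mean_absolute_error_km(predicted, actual):
    """Average haversine distance over paired (lat, lon) predictions and targets."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    distances = [
        haversine_km(p[0], p[1], t[0], t[1])
        for p, t in zip(predicted, actual)
    ]
    return sum(distances) / len(distances)
```

A call such as `mean_absolute_error_km([(40.7, -74.0)], [(40.7, -74.0)])` returns `0.0` for a perfect prediction; on the development set, `predicted` would hold the model outputs and `actual` the annotated coordinates.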
Additional notes
Contact us at [email protected] for any questions or additional requests.
Is this challenge still open and currently looking for contributions? @alinapark @ingakaspar
@AnuravModak yes! All challenges are open :) If you have questions/ideas/suggestions, do not hesitate to start a discussion in our discord
Is there a user guide available that explains how to implement and use the model? Something like this (click here); I have attached the scikit-learn user guide for KNN as an example. @ingakaspar