UofT Drop-in sports schedules

Open qasim opened this issue 8 years ago • 2 comments

The drop-in sports schedules at UofT SG seem more structured now:

https://kpe.utoronto.ca/sports-and-rec

There are still some differences between sports, but all seem scrape-able. We should take advantage of this.

qasim avatar Jan 10 '17 17:01 qasim

Looks like they're loading raw HTML after page load:

jQuery(function($){
  $('#dropinschedule').load('https://class-api.kpe.utoronto.ca:8443/times.php?id_list=6,85,181,182,90,342,675,677&dataonly=true&showcoedcol=true&sport=basketball');
});

I think we can parse the URLs from here and then scrape the HTML from each URL.
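A minimal sketch of that first step, assuming every sport page embeds its schedule through a `.load('...')` call like the one above. The helper names (`extract_load_urls`, `describe`) are made up for illustration, and the fetch itself is left out so the example runs offline.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Matches the first string argument of a jQuery .load('...') call.
LOAD_CALL = re.compile(r"""\.load\(\s*(['"])(?P<url>.+?)\1""")


def extract_load_urls(page_source):
    """Return every URL passed to a .load(...) call in the page source."""
    return [m.group("url") for m in LOAD_CALL.finditer(page_source)]


def describe(url):
    """Pull the sport and the location IDs out of a times.php URL."""
    query = parse_qs(urlsplit(url).query)
    return {
        "sport": query["sport"][0],
        "id_list": [int(i) for i in query["id_list"][0].split(",")],
    }


# The snippet from the basketball page, as quoted above.
sample = """
jQuery(function($){
  $('#dropinschedule').load('https://class-api.kpe.utoronto.ca:8443/times.php?id_list=6,85,181,182,90,342,675,677&dataonly=true&showcoedcol=true&sport=basketball');
});
"""

urls = extract_load_urls(sample)
info = describe(urls[0])
```

Each URL in `urls` could then be requested with any HTTP client and the returned HTML fragment parsed on its own.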

The only other approach (as far as I can tell) would be to form a list of all possible id_list values and all possible sport values and then use those (id_list values map to buildings/locations, but not the same ones from the buildings dataset 🙃).
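For comparison, a rough sketch of that second approach, assuming the query parameters seen in the basketball URL (`id_list`, `dataonly`, `showcoedcol`, `sport`) are the same for every sport. `build_times_url` and `all_urls` are hypothetical helpers, and the real lists of IDs and sport names would still have to be collected by hand.

```python
from itertools import product
from urllib.parse import urlencode

BASE_URL = "https://class-api.kpe.utoronto.ca:8443/times.php"


def build_times_url(id_list, sport):
    """Build a times.php URL for one group of location IDs and one sport."""
    params = {
        "id_list": ",".join(str(i) for i in id_list),
        "dataonly": "true",
        "showcoedcol": "true",
        "sport": sport,
    }
    # safe="," keeps the commas in id_list unescaped, as in the observed URL.
    return BASE_URL + "?" + urlencode(params, safe=",")


def all_urls(id_lists, sports):
    """Return one URL per (id_list, sport) combination."""
    return [build_times_url(ids, sport) for ids, sport in product(id_lists, sports)]


# Reproduces the URL observed on the basketball page.
basketball_url = build_times_url([6, 85, 181, 182, 90, 342, 675, 677], "basketball")
```

The downside is the same as noted above: both value lists have to be maintained manually, whereas parsing the `.load(...)` calls picks up whatever the site currently uses.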

Also, looks like they're only providing data for a week at a time? I think this means that we can't merge this dataset with athletics. Schema can probably remain the same though (minus building_id).

kashav avatar Jan 10 '17 20:01 kashav

Wow, can't say I'm surprised by the inconsistent building IDs 🙃

We could also limit athletics to just the current week, maybe. Perhaps that's trying too hard to accommodate this, and we should have another endpoint instead.

qasim avatar Jan 10 '17 20:01 qasim