Add EECS course catalog adapter
site adapter for https://eecs.scripts.mit.edu/eduportal/who_is_teaching_what/F/2020/
- remove mode column to make compatible with previous semesters
site adapter for https://firehose.guide/
- in the future, would like to parse full.js / javascript variable 'classes' to scrape full list of classes/class info rather than what's visible on the page
- would like to add annotations feature to both adapters in the future
@gzlin7 the firehose one looks great!
The EECS catalog one gives me this error:
wildcard.js:169206 Uncaught TypeError: Cannot read property 'innerText' of undefined
    at wildcard.js:169206
    at Array.map (<anonymous>)
    at Object.scrapePage (wildcard.js:169194)
    at scrapePage (wildcard.js:151319)
    at loadTable (wildcard.js:151331)
    at wildcard.js:151350
    at onDomReady (wildcard.js:151287)
    at Object.initialize (wildcard.js:151349)
    at run$1 (wildcard.js:170442)
    at wildcard.js:170487
Can you see if you can reproduce that? I'm guessing the website has changed since you wrote the adapter.
If that's the case, beyond just a one-off fix, would be interesting to consider:
- what strategies could have avoided the error? (could the wildcard scraping lib provide helpful utilities? could we wrap more stuff in try/catch blocks so that a single bit of missing data doesn't blow up the whole scrape?)
- how could we report a better error here? I'm guessing we want an error message like "couldn't find __ attribute for row ___"?
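To make the two questions above concrete, here is a minimal sketch of one possible approach: read each cell defensively, so a missing cell produces a recorded issue with the row and attribute name instead of an uncaught TypeError. Everything in it is an assumption for illustration. `CellLike`, `ScrapeIssue`, `ScrapeResult`, and `scrapeRowsSafely` are hypothetical names, not part of the wildcard library, and the sample rows are made-up placeholders rather than real catalog data.

```typescript
// Hypothetical minimal shape of a table cell; a real DOM element with
// an innerText property would also satisfy it.
interface CellLike {
  innerText: string;
}

// One record per missing value, so the caller can report all problems at once.
interface ScrapeIssue {
  rowIndex: number;
  attribute: string;
  message: string;
}

interface ScrapeResult {
  rows: Record<string, string>[];
  issues: ScrapeIssue[];
}

// Read each named column from each row. A missing cell becomes an empty
// string plus a recorded issue, so one gap does not abort the whole scrape.
function scrapeRowsSafely(
  tableRows: (CellLike | undefined)[][],
  columnNames: string[]
): ScrapeResult {
  const rows: Record<string, string>[] = [];
  const issues: ScrapeIssue[] = [];

  tableRows.forEach((cells, rowIndex) => {
    const record: Record<string, string> = {};
    columnNames.forEach((attribute, columnIndex) => {
      const cell = cells[columnIndex];
      if (cell === undefined) {
        record[attribute] = "";
        issues.push({
          rowIndex,
          attribute,
          message: `couldn't find "${attribute}" attribute for row ${rowIndex}`,
        });
      } else {
        record[attribute] = cell.innerText.trim();
      }
    });
    rows.push(record);
  });

  return { rows, issues };
}

// Placeholder data: the second row lacks its third cell, as could happen
// if a site drops or reorders a column.
const demo = scrapeRowsSafely(
  [
    [{ innerText: "A1" }, { innerText: "B1" }, { innerText: "C1" }],
    [{ innerText: "A2" }, { innerText: "B2" }],
  ],
  ["number", "title", "instructor"]
);
console.log(demo.issues.map((issue) => issue.message));
```

Collecting issues rather than throwing keeps the rest of the table usable; an adapter could still choose to throw if the number of issues suggests the page layout has changed entirely.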