Add EECS course catalog adapter

Open • gzlin7 opened this issue 4 years ago • 1 comment

site adapter for https://eecs.scripts.mit.edu/eduportal/who_is_teaching_what/F/2020/

  • remove the mode column to make the adapter compatible with previous semesters

site adapter for https://firehose.guide/

  • in the future, would like to parse full.js / the JavaScript variable 'classes' to scrape the full list of classes and class info, rather than only what's visible on the page

  • would also like to add an annotations feature to both adapters in the future

gzlin7 • Aug 30 '20 17:08

@gzlin7 the firehose one looks great!

The EECS catalog one gives me this error:

wildcard.js:169206 Uncaught TypeError: Cannot read property 'innerText' of undefined
    at wildcard.js:169206
    at Array.map (<anonymous>)
    at Object.scrapePage (wildcard.js:169194)
    at scrapePage (wildcard.js:151319)
    at loadTable (wildcard.js:151331)
    at wildcard.js:151350
    at onDomReady (wildcard.js:151287)
    at Object.initialize (wildcard.js:151349)
    at run$1 (wildcard.js:170442)
    at wildcard.js:170487

Can you see if you can reproduce that? I'm guessing the website has changed since you wrote the adapter.

If that's the case, beyond just a one-off fix, it would be interesting to consider:

  • what strategies could have avoided the error? (could the wildcard scraping lib provide helpful utilities? could we wrap more stuff in try/catch blocks so that a single bit of missing data doesn't blow up the whole scrape?)
  • how could we report a better error here? I'm guessing we want an error message like "couldn't find __ attribute for row ___"?
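
A rough sketch of how both ideas might fit together, assuming each table row has already been turned into an array of cell elements. The names `readCellText` and `scrapeRows` are hypothetical and are not part of the Wildcard scraping lib; the point is only that a missing cell yields an empty value plus a descriptive warning, instead of a TypeError that aborts the whole scrape.

```javascript
// Hypothetical helper (not an existing Wildcard API).
// Reads the text of one cell. If the cell is missing, it records a warning
// naming the attribute and row, and returns an empty string instead of throwing.
function readCellText(cells, index, attribute, rowIndex, warnings) {
  const cell = cells[index];
  if (cell === undefined || cell === null) {
    warnings.push(`couldn't find "${attribute}" attribute for row ${rowIndex}`);
    return "";
  }
  // Guard against elements that exist but expose no text.
  return typeof cell.innerText === "string" ? cell.innerText.trim() : "";
}

// Hypothetical scrape loop: `rows` is an array of rows, each an array of
// cell-like objects; `columns` lists the attribute name for each cell position.
function scrapeRows(rows, columns) {
  const warnings = [];
  const records = rows.map((cells, rowIndex) => {
    const record = {};
    columns.forEach((attribute, index) => {
      record[attribute] = readCellText(cells, index, attribute, rowIndex, warnings);
    });
    return record;
  });
  // Return the warnings alongside the data so the caller decides how to report them.
  return { records, warnings };
}
```

In a real adapter, `rows` could plausibly be built with something like `Array.from(document.querySelectorAll("tr")).map(tr => Array.from(tr.querySelectorAll("td")))`, and the collected warnings logged once after the scrape. Collecting warnings rather than wrapping each access in try/catch keeps the partial results usable while still saying exactly which attribute and row were missing.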

geoffreylitt • Sep 10 '20 16:09