Scrape categories and topics for `news.py`

Open tianyizheng02 opened this issue 1 year ago • 3 comments

We should scrape [the categories and topics]. We have no idea what the maintenance of this repo will look like over time. It's certainly had its lulls over time, so let's make it withstand the lack of us.

Originally posted by @RitwikGupta in https://github.com/pittcsc/PittAPI/pull/203#discussion_r1730415533

In #203, I rewrote `news.py` to scrape Pitt news articles from the Pittwire website, but I hard-coded the list of news categories and topics. We should scrape these values instead so that we don't have to keep them updated ourselves. Ideally, `news.py` should scrape these values only once, the first time a user calls a function from the module, so that the values are available for all subsequent function calls.

tianyizheng02 • Aug 25 '24 22:08

Opened an issue for this task in case anyone else wanted to work on it

tianyizheng02 • Aug 25 '24 22:08

Should this really be on import, or should it be on use?

timparenti • Aug 26 '24 22:08

Both would technically work, but yeah, it'll probably be better if they were scraped on use, if for no other reason than to make the code easier to test.

tianyizheng02 • Aug 27 '24 01:08