harwest-tool
[Feature Request] Crawl submissions in gym & virtual contest
These submissions require login. Using requests.session, login should be possible. I've hacked around, and this login method works:
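import random

def __login(self):
    username = 'I_love_Hoang_Yen'
    password = '<redacted>'
    # bfaa and ftaa are extra fields submitted with the login form;
    # ftaa is generated here as 18 random lowercase alphanumeric characters.
    bfaa = 'f1b3f18c715565b589b7823cda7448ce'
    ftaa = ''.join(random.choices('abcdefghijklmnopqrstuvwxyz0123456789', k=18))
    LOGIN_URL = 'https://codeforces.com/enter'
    # Fetch the login page first to read the CSRF token embedded in its HTML.
    r = self.session.get(LOGIN_URL)
    csrf = r.text.split("csrf_token' value='")[1].split("'")[0]
    data = {
        "csrf_token": csrf,
        "action": "enter",
        "ftaa": ftaa,
        "bfaa": bfaa,
        "handleOrEmail": username,
        "password": password,
        "_tta": "176",
        "remember": "on",
    }
    # Post the credentials on the same session so the login cookies are kept.
    r = self.session.post(LOGIN_URL, data=data, headers={'X-Csrf-Token': csrf})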
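One fragile spot in the snippet above is the CSRF extraction: the chained split raises a bare IndexError if the login page ever lacks the expected marker. A slightly more defensive variant might look like the sketch below. The helper name extract_csrf_token is hypothetical, and the pattern assumes the same `csrf_token' value='...'` markup that the snippet splits on.

```python
import re

# Assumption: the login page embeds the token as  csrf_token' value='<token>'
# (the same marker the original snippet splits on).
_CSRF_RE = re.compile(r"csrf_token' value='([^']+)'")


def extract_csrf_token(html: str) -> str:
    """Return the CSRF token found in the page HTML (hypothetical helper)."""
    match = _CSRF_RE.search(html)
    if match is None:
        # Fail with a readable message instead of an IndexError from split().
        raise ValueError("csrf_token not found in login page")
    return match.group(1)
```

Inside the login method this would replace the split line with csrf = extract_csrf_token(r.text).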
After that, it's also necessary to modify the submission URL: for contest IDs > 100k, it should be /gym/{contest_id}/submission/{submission_id}.
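The URL rule could be sketched roughly like this. The function name build_submission_url and both constants are hypothetical, and the /contest/ path used for regular contests is an assumption; only the /gym/ pattern and the 100k threshold come from the comment above.

```python
BASE_URL = "https://codeforces.com"
# Contest IDs above this value are treated as gym contests (per the comment above).
GYM_ID_THRESHOLD = 100_000


def build_submission_url(contest_id: int, submission_id: int) -> str:
    """Build a submission URL, switching to /gym/ for large contest IDs."""
    # Assumption: regular contests use the /contest/ path segment.
    section = "gym" if contest_id > GYM_ID_THRESHOLD else "contest"
    return f"{BASE_URL}/{section}/{contest_id}/submission/{submission_id}"
```

For example, build_submission_url(100001, 5) yields a /gym/100001/submission/5 URL, while a contest ID of 1500 keeps the regular path.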
This is simply awesome @ngthanhtrung23! Thanks for providing the starting points, will try to integrate this in the crawling flow.