
Implement parsing of the new GSoC Server Side rendered website

Open thealphadollar opened this issue 2 years ago • 2 comments

The GSoC archives website has switched to server side rendering, so we can no longer parse its pages the way we did previously.

In addition, the class names have changed, although they are now simpler on a per-card basis. The main issue to solve is getting the server side rendered pages fully rendered before passing them to BeautifulSoup.

thealphadollar • Oct 01 '22 08:10

Hello @thealphadollar, I was exploring the project and found that this can be done with a headless browser, for example via Selenium, or with some other library/framework that has built-in support for rendering. It is not possible with bs4 alone, since bs4 only parses the HTML it is given. Shall I try with Selenium?
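For reference, here is a rough sketch of how the Selenium route might look: render the page in a headless browser, wait until the cards exist, then hand the HTML to BeautifulSoup. This is only an illustration, not the project's code. The function names, the URL, and the `div.card` selector are placeholders I made up, and the real class names on the archive page would need to be checked. It assumes `selenium` (4.x), `beautifulsoup4`, and a local Chrome install.

```python
import time


def make_headless_chrome():
    """Build a headless Chrome driver (assumes selenium 4.x and a local Chrome)."""
    # Imported lazily so the rest of the module loads without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def fetch_rendered_html(url, ready_selector, driver_factory=make_headless_chrome,
                        timeout=15.0, poll=0.5):
    """Load `url` in a browser and return the HTML once `ready_selector` matches.

    `ready_selector` is a CSS selector for an element that only exists after
    the page has finished rendering (hypothetical here, e.g. "div.card").
    """
    driver = driver_factory()
    try:
        driver.get(url)
        deadline = time.monotonic() + timeout
        # Poll until at least one matching element is present in the DOM.
        while not driver.find_elements("css selector", ready_selector):
            if time.monotonic() > deadline:
                raise TimeoutError(f"{ready_selector!r} did not appear within {timeout}s")
            time.sleep(poll)
        return driver.page_source
    finally:
        # Always close the browser, even if the wait timed out.
        driver.quit()


def parse_card_texts(html, card_selector):
    """Return the stripped text of every element matching `card_selector`."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    return [card.get_text(strip=True) for card in soup.select(card_selector)]
```

Usage would be something like `parse_card_texts(fetch_rendered_html(archive_url, "div.card"), "div.card")`, where `archive_url` is whichever archive page the scraper currently targets. Taking the driver as a `driver_factory` argument keeps the waiting logic testable without launching a real browser.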

nilesh05apr • Jun 08 '23 12:06

Sure, try it. Let me know if you get stuck anywhere.

thealphadollar • Jun 08 '23 16:06