
🔪 The Khan scrapers are broken 🔪

Open jbampton opened this issue 6 years ago • 8 comments

Looks like all the Khan scrapers are broken :(

jbampton avatar Dec 09 '19 17:12 jbampton

Hello @jbampton. Is this the script that we should fix: https://github.com/slurpcode/slurp/blob/master/scrapers/python/lxml/khan.py?

mirelagrigoras avatar Aug 29 '20 14:08 mirelagrigoras

Hi @mirelagrigoras !!

Yes, that is the Khan Python script that needs fixing.

But my Khan profile link is now -> https://www.khanacademy.org/profile/JohnBampton/

jbampton avatar Aug 29 '20 18:08 jbampton

I think the way the Khan pages are rendered has changed.

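It seems they might be using JavaScript to build the pages dynamically, in which case the data may no longer be present in the static HTML that the scrapers parse.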

jbampton avatar Aug 29 '20 18:08 jbampton

We need the Energy points earned data scraped.

[Screenshot: Screen Shot 2020-08-30 at 11 26 51 pm]

jbampton avatar Aug 30 '20 13:08 jbampton
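
A quick way to test the dynamic-rendering guess above is to check whether the "Energy points earned" label appears in the visible text of the raw HTML at all. The sketch below is only an illustration using the Python standard library: `VisibleTextParser` and `label_in_static_html` are hypothetical names, not part of the slurp repository, and the two sample HTML strings are made up for the demo rather than taken from Khan Academy.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def label_in_static_html(html, label):
    """Return True if `label` occurs in the visible text of `html`.

    A False result for a page that shows the label in a browser hints
    that the content is filled in by JavaScript after the page loads.
    """
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return label.lower() in " ".join(parser.chunks).lower()


# Made-up samples: one server-rendered page, one JavaScript shell.
server_rendered = "<html><body><div>Energy points earned</div><span>1,234</span></body></html>"
js_shell = (
    '<html><body><div id="app"></div>'
    '<script>var s = "Energy points earned";</script></body></html>'
)

print(label_in_static_html(server_rendered, "Energy points earned"))
print(label_in_static_html(js_shell, "Energy points earned"))
```

The HTML string can come from any HTTP client. If the check fails on the real profile page, an HTML-only parser such as lxml cannot see the value, and the scraper would need either a browser-driven fetch or whatever data request the page itself makes.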

Yes, the way pages are rendered has changed. Anyone who wants to fix the other scrapers can check out the PHP scraper; it works with the new layout and data fetching.

ajakov avatar Jun 26 '21 10:06 ajakov

Done in Python now

jbampton avatar Apr 21 '22 16:04 jbampton

Does it still need fixing? I really want to do some web scraping, so if it does, please add me.

wickedknock avatar Apr 27 '23 18:04 wickedknock

Yes, they do need fixing.

jbampton avatar Apr 30 '23 08:04 jbampton