Teach making requests in a consistent way

Open honzajavorek opened this issue 1 year ago • 4 comments

The courses now recommend using a variety of tools to make HTTP requests. Sometimes that's more confusing, sometimes less.

  • The got library seems to be superseded by ky; at least Sindre mentions it in the README.
  • Apify develops got-scraping, which seems to be married to the got library. Wouldn't it make sense to turn got-scraping into something more agnostic to a client library? Could it just prepare the request details, so that they can be attached to the request by any library?
  • Some guides use axios
  • Some guides mention request and request-promise, which are now both deprecated
  • Meanwhile, Node.js has adopted fetch into its standard library
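
To make the client-agnostic idea from the second bullet concrete, here is a minimal sketch in Python. It assumes a helper that only returns plain request details (a URL and headers) and leaves the sending to whatever client the reader prefers. The name `prepare_request_details` and the header values are hypothetical and are not part of got-scraping or any Apify library.

```python
import random
import urllib.request

# Hypothetical, hard-coded pool for illustration only; a real tool would
# generate realistic and mutually consistent browser headers.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ExampleBrowser/1.0",
]


def prepare_request_details(url, rng=None):
    """Return plain data describing a request, with no client objects involved."""
    rng = rng or random.Random()
    return {
        "url": url,
        "headers": {
            "User-Agent": rng.choice(USER_AGENTS),
            "Accept-Language": "en-US,en;q=0.9",
        },
    }


details = prepare_request_details("https://example.com", rng=random.Random(0))

# Attach the prepared details using the standard library (nothing is sent here).
request = urllib.request.Request(details["url"], headers=details["headers"])
```

Because the helper returns only a dictionary, the same `details` could be passed to requests, aiohttp, or httpx through their `headers` argument.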

Using got-scraping in the basic tutorial is probably unnecessary; any HTTP client can be used in the initial lessons. The value of got-scraping should emerge with more complicated use cases.

But wouldn't it make more sense to skip got-scraping and promote Crawlee right away at that point? Is got-scraping something Apify wants to spend marketing energy on, or is it an implementation detail?

As of now, got-scraping doesn't have any good Python alternatives that I know of. There are independent libraries one can use, such as fake-user-agent, which have integrations with scraping frameworks.

Regarding request libraries, the scene is similarly fragmented in Python, featuring requests, aiohttp, and httpx, each with its own fans and use cases.

I'd like to kick this off as a discussion on what should be the preferred way for the Academy to teach making requests in 2024, using Node.js and Python.

honzajavorek, Apr 26 '24 08:04

@B4nan What HTTP client do you use in Python Crawlee under the hood - requests or httpx? Or is it built on top of aiohttp?

honzajavorek, May 20 '24 13:05

cc @vdusek

B4nan, May 20 '24 13:05

we use httpx

vdusek, May 20 '24 13:05

Cool, thanks! I was thinking of teaching HTTP requests with httpx, because it's very similar to the widespread requests library, but it also supports async and, in general, it has a future 😅

honzajavorek, May 20 '24 15:05