Implement fingerprinting
Coordinate with @barjin before implementing anything.
We may end up developing a dedicated fingerprinting library (in Rust?). In that case, the Python tooling would just wrap it (same for JavaScript).
Just to make things clear - there are two different initiatives regarding "stealth" scraping:
- fingerprint-suite (GitHub): a bunch of libraries for generating real-life HTTP header sets (making sure the user-agent matches the os, etc.) and injecting those into browsers / HTTP clients.
This already works, and there is little to no work to be done on the JS side (aside from maintenance). Unfortunately, it is all written in JavaScript, so it has to be completely rewritten in Python if we want to do the same thing.
- An HTTP client in Rust - this should be an alternative to requests (in Python) and axios/fetch/... (in JavaScript). The standard HTTP clients in most languages exhibit very recognizable behavior (using certain TLS ciphers, sending specific headers, etc.) and we cannot change this. Therefore - Rust (you can play with the TLS stack more there).
^--- This, we don't have anywhere.
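To illustrate what "making sure the user-agent matches the os" means for the first initiative, here is a minimal Python sketch. It is not how fingerprint-suite works internally; the `PROFILES` table, the `generate_headers` helper, and the sample user-agent strings are all hypothetical and hand-written for illustration. The only point is that every header in a generated set should come from the same OS profile.

```python
from __future__ import annotations

import random

# Hypothetical, hand-written sample data. A real generator would draw from
# observed browser traffic instead of a hard-coded table.
PROFILES = {
    "windows": {
        "platform": '"Windows"',
        "user_agents": [
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        ],
    },
    "macos": {
        "platform": '"macOS"',
        "user_agents": [
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        ],
    },
    "linux": {
        "platform": '"Linux"',
        "user_agents": [
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        ],
    },
}


def generate_headers(os_name: str, rng: random.Random | None = None) -> dict[str, str]:
    """Build a header set in which all OS-dependent values share one profile."""
    rng = rng or random.Random()
    profile = PROFILES[os_name]
    return {
        # Both values below come from the same profile, so they cannot disagree.
        "User-Agent": rng.choice(profile["user_agents"]),
        "Sec-CH-UA-Platform": profile["platform"],
        "Accept-Language": "en-US,en;q=0.9",
    }


if __name__ == "__main__":
    print(generate_headers("linux"))
```

Note that this only covers the header level. It does nothing about the TLS-level behavior mentioned in the second bullet, which is why a separate HTTP client is being considered.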
Maturin could be useful.
Build and publish crates with pyo3, cffi and uniffi bindings as well as rust binaries as python packages
Hey
This may be useful to you.
There is a project - https://github.com/FlorianREGAZ/Python-Tls-Client which is a Python wrapper around the Golang library - https://github.com/bogdanfinn/tls-client
The disadvantage of Python-Tls-Client is that it doesn't support async usage, so it is not suitable for crawlee-python at this stage.
But you might consider implementing your own wrapper for https://github.com/bogdanfinn/tls-client.
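As a stopgap, a blocking client can be called from async code by pushing each call onto a worker thread. Below is a minimal sketch using only the standard library. `blocking_get` is a hypothetical stand-in for any synchronous client call, not the actual Python-Tls-Client API. This keeps the event loop responsive, but it is not real async I/O: each in-flight request still occupies a thread.

```python
import asyncio
import time


def blocking_get(url: str) -> str:
    """Hypothetical stand-in for a synchronous HTTP client call."""
    time.sleep(0.1)  # simulate network latency
    return f"response from {url}"


async def async_get(url: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_get, url)


async def main() -> list:
    urls = ["https://example.com/a", "https://example.com/b"]
    # gather() returns results in the same order as the awaitables passed in.
    return await asyncio.gather(*(async_get(u) for u in urls))


if __name__ == "__main__":
    print(asyncio.run(main()))
```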
Closing this one, as its content was divided into several smaller issues: #292, #401, #402.