
FEATURE: user_agent: random

Open spirillen opened this issue 1 year ago • 8 comments

Description

While working my way through the new PyFunceble.yaml, I stumbled on the fact that we are hard-coding the browser and OS. This just seems wrong: as we know, the big5 and their wannabes log the IP, browser, OS, etc. and share that data to catch bots. That gives us FP responses, as we receive false 4xx and 5xx HTTP status codes.

Possible Solution

One of two

  1. Randomize browser and OS per -n records (Optimal)
  2. Randomize browser and OS per test start
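
To make the two options concrete, here is a minimal sketch of what "a new user agent every n records" could look like. It is not PyFunceble code: `rotate_user_agents`, its `every` parameter, and the placeholder agent strings are all hypothetical names used for illustration only.

```python
import random
from typing import Iterable, Iterator, Optional, Sequence, Tuple


def rotate_user_agents(
    subjects: Iterable[str],
    user_agents: Sequence[str],
    every: int = 1,
    rng: Optional[random.Random] = None,
) -> Iterator[Tuple[str, str]]:
    """Yield (subject, user_agent) pairs, drawing a new agent every `every` subjects.

    Hypothetical helper: it only illustrates the rotation idea from this issue.
    """
    if every < 1:
        raise ValueError("every must be >= 1")
    rng = rng or random.Random()
    current = None
    for index, subject in enumerate(subjects):
        # Draw a fresh agent at the start of each block of `every` records.
        if index % every == 0:
            current = rng.choice(user_agents)
        yield subject, current


if __name__ == "__main__":
    # Placeholder agent strings, not real user agents.
    agents = ["agent-a", "agent-b", "agent-c"]
    for subject, agent in rotate_user_agents(["example.org", "example.com", "example.net"], agents, every=2):
        print(subject, agent)
```

With `every` set to a value larger than the number of records, the same agent is kept for the whole run, which roughly corresponds to option 2.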

Considered Alternative

Manual labor, bad, the system is MY slave, not the other way around.

Additional context

UPDATE: typos

spirillen avatar Jun 23 '24 10:06 spirillen

Yeah, but what does random mean to you?

Our user_agents are rotated every week, if not every day... And we choose a random one from one of the latest / most used ones:

https://github.com/funilrys/PyFunceble/blob/fd2ce92208336300df07b00ce3995956b81ad62e/PyFunceble/dataset/user_agent.py#L225-L226

funilrys avatar Jun 23 '24 20:06 funilrys

And we choose a random one from one of the latest / most used ones:

I see three questions here

  1. Selection among browser + OS
  2. How to control
  3. You(me) ask because

  1. Automatically rotate among all the available ones from the list (I forgot the URL, but you probably remember it), and add that to the list of options
  2. By a switch of Random vs chrome,Linux in the pyf.overwrite.yml
  3. Because, as of now, 1. I didn't know it was rotated by default, and 2. in pyf.overwrite.yml you can only set a fixed value

spirillen avatar Jun 23 '24 20:06 spirillen

Come to think of it... and YES, it is only a thought: can you spoof the MAC addresses used for querying?

IF == True

Would it then be an idea to periodically, with hourly randomness, generate a set of "clients", each running its own MAC + Browser + OS? ... IF == TRUE: could this help make it look like a random number of users on the same network vs. one client from the same network?

spirillen avatar Jun 23 '24 21:06 spirillen

Well, look here under the @modern datasets. What you give PyFunceble is the keys to a list of user-agents.

I'll have to think of a way to implement randomness. That won't be the default though.

funilrys avatar Sep 22 '24 10:09 funilrys

Let's clarify ...

  1. The reference file is rotated every few days: https://github.com/PyFunceble/user_agents/blob/master/user_agents.json
  2. Every PyFunceble instance in the world tries to fetch that file if the local version is older than 1 hour.
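
As a rough illustration of the "older than 1 hour" rule, the check could be sketched as below. This is an assumption about the general idea, not the engine's actual mechanism: `needs_refresh` is a hypothetical helper, and using the file's modification time as the age of the local copy is my own simplification.

```python
import os
import time
from typing import Optional

MAX_AGE_SECONDS = 60 * 60  # 1 hour, as described above


def needs_refresh(
    path: str,
    max_age: float = MAX_AGE_SECONDS,
    now: Optional[float] = None,
) -> bool:
    """Return True when the local copy is missing or older than `max_age` seconds.

    Hypothetical helper: the age is derived from the file's modification time.
    """
    try:
        modified = os.path.getmtime(path)
    except OSError:
        # No local copy (or unreadable): a download is needed.
        return True
    current = time.time() if now is None else now
    return (current - modified) > max_age
```

A caller would download user_agents.json again only when `needs_refresh(local_path)` returns True.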

In the end, users have two choices:

  1. Set their own user agent through the user_agent.custom key:

https://github.com/funilrys/PyFunceble/blob/296a613f2e4149f8e790cd92ef4b964694edcac6/PyFunceble/data/infrastructure/.PyFunceble_production.yaml#L630

  2. Set a preferred user_agent.platform and user_agent.browser:

https://github.com/funilrys/PyFunceble/blob/296a613f2e4149f8e790cd92ef4b964694edcac6/PyFunceble/data/infrastructure/.PyFunceble_production.yaml#L610-L621

When the end-user chooses the latter, the engine takes the given user_agent.platform as a subkey of the @modern key, and the user_agent.browser as a subkey of the dataset resulting from that first lookup. The result is a list of user-agents. From that list, the engine always selects a random one.

https://github.com/funilrys/PyFunceble/blob/296a613f2e4149f8e790cd92ef4b964694edcac6/PyFunceble/dataset/user_agent.py#L255
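
The lookup described above can be sketched roughly as follows. This is a simplified illustration, not the code behind the link: `DATASET` is a made-up, shortened stand-in for user_agents.json (its nesting follows the description above, and the agent strings are placeholders), `pick_user_agent` is a hypothetical function, and the linux/chrome defaults are assumed for the example.

```python
import random
from typing import Any, Dict, Optional

# Hypothetical stand-in for user_agents.json; real keys and strings may differ.
DATASET: Dict[str, Any] = {
    "@modern": {
        "linux": {
            "chrome": ["UA-linux-chrome-1", "UA-linux-chrome-2"],
            "firefox": ["UA-linux-firefox-1"],
        },
        "win10": {
            "chrome": ["UA-win10-chrome-1"],
        },
    }
}


def pick_user_agent(
    dataset: Dict[str, Any],
    platform: str = "linux",
    browser: str = "chrome",
    custom: Optional[str] = None,
    rng: Optional[random.Random] = None,
) -> str:
    """Return the custom agent if set, else a random one for platform + browser."""
    if custom:
        # Choice 1: an explicit user_agent.custom value wins.
        return custom
    # Choice 2: @modern -> platform -> browser yields a list of candidates.
    candidates = dataset["@modern"][platform][browser]
    # A random entry is selected from that list.
    return (rng or random).choice(candidates)


if __name__ == "__main__":
    print(pick_user_agent(DATASET))
    print(pick_user_agent(DATASET, custom="my-own-agent/1.0"))
```

So even with a fixed platform and browser, the returned string can differ between calls whenever the list holds more than one entry.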

That's how it works - right now.

So, you are suggesting that we should add an extra random layer / option?

funilrys avatar Dec 27 '24 17:12 funilrys

Let me see if I understood this, for dummies.

Even if I set a browser.agentX and platform.OXY, the engine still mixes any of the @modern browsers, versions, etc.?

Then what about the OS.platform and screenSize?

spirillen avatar Dec 28 '24 03:12 spirillen

We don't simulate screens. We are not Selenium or a browser (yet ??).

funilrys avatar Mar 16 '25 15:03 funilrys

The OS platform is always linux, unless specified differently.

cf: https://github.com/funilrys/PyFunceble/blob/a05f2af03a0d951e6aac998e7dcb7416abfa8294/PyFunceble/data/infrastructure/.PyFunceble_production.yaml#L621

funilrys avatar Mar 16 '25 15:03 funilrys