rust icon indicating copy to clipboard operation
rust copied to clipboard

std::unix::stack_overflow::drop_handler addressing todo through libc …

Open devnexen opened this issue 1 year ago • 15 comments

…update

devnexen avatar Jun 09 '24 20:06 devnexen

Wow - this is fantastic. Thanks for this PR - could we also monitor how many /search requests the searxng instance has received from users and how many of these were successful/failed and their HTTP error codes?

It would help me in detecting and mitigating bot attacks on my instance.

vojkovic avatar Sep 18 '24 10:09 vojkovic

how many /search requests the searxng instance has received from users and how many of these were successful/failed and their HTTP error codes?

That should both be doable, yes.

Bnyro avatar Sep 20 '24 14:09 Bnyro

I've implemented the total request count. The error stuff should probably be done with a different PR, since it's more complicated to display due to different error types and I want to get the basic OpenMetrics implementation merged first.

Bnyro avatar Sep 20 '24 14:09 Bnyro

how many /search requests the searxng instance has received from users and how many of these were successful/failed and their HTTP error codes?

That should both be doable, yes.

The /metrics should not readable for the public .. not sure what the best is; a token option tokens: [ 'my-secret-token' ] like we have in the engine section .. or a HTTP realm authenticate .. or should we add OAuth to SearXNG?

return42 avatar Sep 28 '24 09:09 return42

The /metrics should not readable for the public ..

/stats is public anyways, so people can scrape the stats no matter if you disable or enable Open Metrics?

Bnyro avatar Sep 28 '24 09:09 Bnyro

/stats is public anyways, so people can scrape the stats no matter if you disable or enable Open Metrics?

Yeah, but I had the request from https://github.com/searxng/searxng/pull/3835#issuecomment-2363810790 in mind .. if we intend to include more data into the /metrics than we should consider to restrict access ..

I remember that we had a lot of discussions in the past regarding /stats and to what extent this information could violate the privacy of the users (e.g. if there is only one user on the instance, then /stats may already say a lot about the user behavior).

I would like to avoid or at least reduce these discussions for the /metrics URL in the future by subjecting access to certain restrictions (this was, what I had in mind).

return42 avatar Sep 28 '24 10:09 return42

This is awesome! Was just looking for this sort of functionality. Anything you guys would like me to test so this can be merged?

cohenchris avatar Sep 30 '24 01:09 cohenchris

not sure what the best is; a token option tokens: [ 'my-secret-token' ] like we have in the engine section .. or a HTTP realm authenticate .. or should we add OAuth to SearXNG?

OAuth is by far too complicated for our use case. I think that the best way would be a token that must be set in settings.yml and passed as Authentication header when requesting /metrics, however Prometheus does not support that at the moment.

The alternative would be to limit access to that path via the Reverse Proxy used by the server admin, HAProxy and Nginx allow you to restrict access to certain paths to require HTTP authentication for instance.

Or we could just use Basic Auth directly as per https://prometheus.io/docs/guides/basic-auth/, that's probably easy to implement too.

Bnyro avatar Oct 02 '24 14:10 Bnyro

imho: it's a common pattern for prometheus metrics uri to be "open", it's an admin duty to make the "let's limit that" call, so I agree (from my humble point of view) with @Bnyro (great job btw).

Another (common) option could be to expose the metrics uri in another port, for instance 8081

brokenpip3 avatar Oct 02 '24 17:10 brokenpip3

imho: it's a common pattern for prometheus metrics uri to be "open", it's an admin duty to make the "let's limit that" call, so I agree (from my humble point of view) with @Bnyro (great job btw).

A +1 for public metrics. Let's not think too hard about this and overcomplicate a very simple feature.

We should just have a setting in settings.yml to enable metrics with default: off (as has already been done by Bnyro).

vojkovic avatar Oct 02 '24 17:10 vojkovic

A +1 for public metrics. Let's not think too hard about this and overcomplicate a very simple feature.

I have been involved in many discussions in the past about how much data can be collected, where it can be stored and whether this data can be made available. There is always a conflict here: from a privacy perspective, the best data is data that is not collected in the first place ... but a minimum amount of data is required for the reliable operation of an instance.

With these experiences in the background and the promise to our users to protect their privacy to the best of our knowledge and belief, it is difficult for me to simply give analytics data to the public.

We go to great lengths to live up to our promise ... both technically and organizationally. To name two examples: Even the data we store in the valkey (aka Redis) database is encrypted because we cannot know who is hosting this database and we want to avoid a hoster having access to the data in this database. Another example are the conditions to be accepted at searx.space, where one of the access conditions is: “I control the final webserver (software) that is serving the requests to the users of my instance”.

As a technician, I have great sympathy for simple solutions ... but when it comes to protecting privacy, we have to be prepared to go the extra mile.

It won't be long before we receive PRs that want to release even more data via this interface and then we are always back to weighing up privacy against technical necessity. We can avoid these discussions if we decide now to limit access to metrics to authentication.

I would therefore suggest that we implement HTTP Basic Auth and the password is stored in the settings.yml:


general:

  open_metrics: <password>

return42 avatar Oct 03 '24 08:10 return42

Just an FYI, we have an open bug for documenting the metrics system. Do we want to document this as well while we are working on it?

https://github.com/searxng/searxng/issues/1747

glanham-jr avatar Oct 04 '24 03:10 glanham-jr

Regarding privacy, an additional safeguard may be to have a configuration to enforce that the the metrics endpoint must be queried from the same network (I.e. localhost)? Unsure if this is possible from SearxNG itself.

I imagine this would solve the use case for open access of a Prometheus instance on the same machine / network, where I imagine most simple servers would have. This would block all other attempts to access with a 403. Again, this may be a configuration better suited for a reverse proxy or something else, just throwing ideas out.

As an aside, how does searx space get the necessary information? Does that not scrape any metrics and only general information?

glanham-jr avatar Oct 04 '24 03:10 glanham-jr

As an aside, how does searx space get the necessary information? Does that not scrape any metrics and only general information?

searx.space requests /config and it also sends a few requests to /search. The IPs to check.searx.space are whitelisted in the limiter (within searxng's source).

vojkovic avatar Oct 04 '24 05:10 vojkovic

to have a configuration to enforce that the the metrics endpoint must be queried from the same network (I.e. localhost)?

I do not expect the data collection point to be on localhost or in the subnet.

return42 avatar Oct 04 '24 06:10 return42

Added basic auth now.

For viewing the metrics via basic auth, you must pass:

  • the username (can be anything, doesn't matter)
  • the password (set in settings.yml)

For example, this can be tested by using a HTTP client such as xh with xh --auth <randomusername>:<password> http://127.0.0.1:8888/metrics.

Bnyro avatar Oct 15 '24 09:10 Bnyro

Added some basic documentation about it now, should be sufficient to understand how to use it.

Ready to merge from my side, anything left to do from your side @return42 ?

Bnyro avatar Nov 24 '24 12:11 Bnyro

CI failure seems unrelated to this PR.

Bnyro avatar Nov 24 '24 12:11 Bnyro