clusterfuzz

Add options to log to stackdriver from non appengine/k8s bots

Open ParisMeuleman opened this issue 1 year ago • 5 comments

On some of our Android fuzzing hosts, we don't have the fluentd server that takes care of propagating the Python logs to Stackdriver. I suggest this change, which adds the option to log there directly. It was a bit of a mess to add the additional fields, given that we are stuck with a relatively old version of the Google Cloud Logging API. I am also in a situation where three GCP projects are involved (one for the service account, one for the queues and one for the logging). I also added a bunch of variables to make it customizable (to choose which of the console/file/fluentd/GCP logs to activate). Alternatively, we might be better off using an input dictConfig or fileConfig (https://docs.python.org/3/library/logging.config.html).
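To illustrate the dictConfig alternative mentioned above, here is a minimal sketch, not the code in this PR. It assumes boolean environment variables select the handlers: `LOG_TO_GCP` and `LOG_TO_FLUENTD` are the names used later in this thread, while `LOG_TO_CONSOLE`, `build_logging_config` and the `NullHandler` stand-ins for the fluentd and GCP handlers are hypothetical and only there to keep the example self-contained.

```python
import logging
import logging.config
import os


def _enabled(env, name, default="FALSE"):
    """Return True when the variable is set to TRUE (case-insensitive)."""
    return env.get(name, default).strip().upper() == "TRUE"


def build_logging_config(env=os.environ):
    """Build a logging dictConfig with one handler per enabled destination."""
    handlers = {}
    if _enabled(env, "LOG_TO_CONSOLE", default="TRUE"):  # hypothetical variable
        handlers["console"] = {
            "class": "logging.StreamHandler",
            "formatter": "plain",
        }
    if _enabled(env, "LOG_TO_FLUENTD"):
        # Stand-in: a real setup would name a fluentd handler class here.
        handlers["fluentd"] = {"class": "logging.NullHandler"}
    if _enabled(env, "LOG_TO_GCP"):
        # Stand-in: a real setup would name a Cloud Logging handler class here.
        handlers["gcp"] = {"class": "logging.NullHandler"}
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "plain": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
        },
        "handlers": handlers,
        "root": {"level": "INFO", "handlers": sorted(handlers)},
    }


if __name__ == "__main__":
    logging.config.dictConfig(build_logging_config())
    logging.getLogger("android_heartbeat").info("logging configured")
```

With this shape, enabling or disabling a destination is a configuration change rather than a code change, which is the main appeal of dictConfig over a set of hand-written `if` branches around `addHandler` calls.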

ParisMeuleman avatar Jan 16 '24 09:01 ParisMeuleman

/gcbrun

marktefftech avatar Jan 30 '24 19:01 marktefftech

@oliverchang I just realized that you had #3422. I am not sure how relevant this will be once you land it, but meanwhile, we have limited logs on our Chrome Android hosts. Given the specifics of those hosts (logs go to a different project than the service account and the cloud_project_id), we would still need some of the configurability we have here.

ParisMeuleman avatar Feb 01 '24 14:02 ParisMeuleman

@oliverchang gentle ping on this one :)

ParisMeuleman avatar Feb 08 '24 09:02 ParisMeuleman

Hmm, I opened https://github.com/google/clusterfuzz/pull/3422 for exactly this :) But we are blocked on a Python upgrade for that.

I couldn't get structured logging to work with the older client libraries. Does your PR make that work?

oliverchang avatar Feb 08 '24 22:02 oliverchang

Not really. I worked around this issue by adding labels; see the example below. That's ugly TBH, but as mentioned, we're missing out on Python logs on several of our bots (the ones hosted by Android's fuzzing infra). I also kept the existing (fluentd) behavior, as I wasn't sure whether we wanted to remove it. Long term (i.e. when Python is updated), I believe we should have only the content of your PR.

log:
{
  "insertId": "5tvjopf2hctof",
  "jsonPayload": {
    "python_logger": "android_heartbeat",
    "message": "Android device 2C011FDH3007VB state: device"
  },
  "resource": {
    "type": "global",
    "labels": {
      "project_id": "google.com:clusterfuzz"
    }
  },
  "timestamp": "2024-01-15T13:57:36.444799Z",
  "severity": "INFO",
  "labels": {
    "compute.googleapis.com/resource_name": "pmeuleman1.roam.corp.google.com",
    "fuzz_target": "null",
    "location": "{\"path\": \"/mnt/scratch0/cf/clusterfuzz/src/python/bot/startup/android_heartbeat.py\", \"line\": 53, \"method\": \"main\"}",
    "task_payload": "null",
    "bot_name": "android-haiku-chrome-test",
    "worker_bot_name": "null",
    "extra": "{}"
  },
  "logName": "projects/google.com:clusterfuzz/logs/python",
  "receiveTimestamp": "2024-01-15T13:57:36.465772799Z"
}
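The label workaround visible in the log entry above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's actual code: `to_labels` is a hypothetical helper, and the premise is only that label values must be strings, so anything that is not already a string is JSON-encoded (which is how `None` becomes `"null"` and an empty dict becomes `"{}"`).

```python
import json


def to_labels(fields):
    """Flatten structured log fields into a string-to-string label mapping.

    Strings are kept as-is; every other value is JSON-encoded so that it
    fits in a label, at the cost of losing real structure in the log entry.
    """
    labels = {}
    for key, value in fields.items():
        if isinstance(value, str):
            labels[key] = value
        else:
            labels[key] = json.dumps(value)
    return labels


if __name__ == "__main__":
    # Field names are taken from the log entry above; the path is shortened.
    print(to_labels({
        "bot_name": "android-haiku-chrome-test",
        "fuzz_target": None,
        "location": {"path": "android_heartbeat.py", "line": 53, "method": "main"},
        "extra": {},
    }))
```

The downside, as noted, is that nested values such as `location` end up as escaped JSON strings inside a label rather than as structured fields in the payload.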

ParisMeuleman avatar Feb 09 '24 16:02 ParisMeuleman

LGTM, but are you able to check this doesn't break existing fluentd logging?

You can test this by following https://github.com/google/clusterfuzz/blob/master/local/README.md#running-a-bot-locally to run a Docker container locally that closely replicates a GCE bot.

I tested locally, but at that time I did not know how to start a local fluentd, so for that part I used netcat to check that the output was similar to what our production clients send.

I managed to have it running now that I understood how it was installed, and got the following logs (I used both LOG_TO_GCP=TRUE and LOG_TO_FLUENTD=TRUE, hence the dupes): https://pantheon.corp.google.com/logs/query;query=SEARCH%2528%222C011FDH3007VB%22%2529%0A-protoPayload.methodName%3D%22google.logging.v2.LoggingServiceV2.ReadLogEntriesLegacy%22%0A-protoPayload.methodName%3D%22google.logging.v2.LoggingServiceV2.AggregateLogs%22%0Aseverity%3E%3DINFO;storageScope=storage,projects%2Fgoogle.com:clusterfuzz%2Flocations%2Fglobal%2Fbuckets%2F_Required%2Fviews%2F_AllLogs,projects%2Fgoogle.com:clusterfuzz%2Flocations%2Fglobal%2Fbuckets%2F_Default%2Fviews%2F_AllLogs,projects%2Fgoogle.com:clusterfuzz%2Flocations%2Fglobal%2Fbuckets%2F_Default%2Fviews%2F_Default,projects%2Fcluster-fuzz%2Flocations%2Fglobal%2Fbuckets%2F_Default%2Fviews%2F_AllLogs,projects%2Fcluster-fuzz%2Flocations%2Fglobal%2Fbuckets%2F_Default%2Fviews%2F_Default,projects%2Fcluster-fuzz%2Flocations%2Fglobal%2Fbuckets%2F_Required%2Fviews%2F_AllLogs;summaryFields=:false:32:beginning;cursorTimestamp=2024-02-19T16:31:38.310903463Z;startTime=2024-02-19T16:30:00.000Z;endTime=2024-02-19T16:32:00.000Z?project=google.com:clusterfuzz&e=-13802955&mods=logs_tg_prod&pli=1

ParisMeuleman avatar Feb 19 '24 16:02 ParisMeuleman

/gcbrun

oliverchang avatar Feb 19 '24 22:02 oliverchang