structlog-gcp
LogFieldSanitizer for BigQuery
I can't recall whether I've mentioned this already, but at some point I ran into errors when a log field's value was a list of lists. The following processor works around it in my particular setup:
"""Sanitize log-fields for backends"""
from typing import Any

from structlog.typing import EventDict, Processor, WrappedLogger


def _flatten(items: list[Any]) -> list[Any]:
    """Recursively flatten nested lists; non-list items are kept as-is."""
    flat: list[Any] = []
    for item in items:
        if isinstance(item, list):
            flat.extend(_flatten(item))
        else:
            flat.append(item)
    return flat


class LogFieldSanitizer:
    """
    Google Cloud Logging can route entries through log sinks into BigQuery.
    BigQuery has at least one limitation: it cannot store lists of lists.
    Reference: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#array_type
    This structlog processor flattens any log field that is a list containing lists.
    """

    def setup(self) -> list[Processor]:
        return [self]

    def __call__(self, logger: WrappedLogger, method_name: str, event_dict: EventDict) -> EventDict:
        del logger, method_name  # unused
        for key, value in event_dict.items():
            # Only rewrite list fields that contain at least one nested list.
            # Mixed lists such as [1, [2]] are handled; deeper nesting is flattened too.
            if isinstance(value, list) and any(isinstance(x, list) for x in value):
                event_dict[key] = _flatten(value)
        return event_dict
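For anyone who wants to check the behaviour without a GCP setup, here is a minimal stand-alone sketch. It relies only on the fact that a structlog processor is a callable taking `(logger, method_name, event_dict)`, so it does not import structlog at all. The function name `flatten_nested_lists` and the sample field names are made up for illustration and are not part of structlog-gcp.

```python
from typing import Any, MutableMapping


def flatten_nested_lists(
    logger: Any, method_name: str, event_dict: MutableMapping[str, Any]
) -> MutableMapping[str, Any]:
    """Hypothetical function-style processor: flatten list fields that contain lists."""
    del logger, method_name  # unused, kept for the processor call signature

    def flatten(items: list) -> list:
        flat = []
        for item in items:
            if isinstance(item, list):
                flat.extend(flatten(item))  # recurse into nested lists
            else:
                flat.append(item)  # scalars, strings, dicts stay untouched
        return flat

    for key, value in event_dict.items():
        # Replacing the value of an existing key is safe while iterating.
        if isinstance(value, list) and any(isinstance(x, list) for x in value):
            event_dict[key] = flatten(value)
    return event_dict


# Sample event with made-up field names.
event = {
    "event": "batch processed",
    "shards": [[1, 2], [3]],
    "mixed": ["a", ["b", ["c"]]],
    "tags": ["x", "y"],
}
result = flatten_nested_lists(None, "info", event)
print(result["shards"])  # [1, 2, 3]
print(result["mixed"])   # ['a', 'b', 'c']
print(result["tags"])    # ['x', 'y']
```

In a real configuration the processor would presumably sit in the `processors` list passed to `structlog.configure`, somewhere before the final renderer. Note that this sketch only looks at top-level fields; lists nested inside dict values are not reached.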
@petemounce thanks for the report! Would you be interested in contributing a PR to add this filter to the library?