[GDPR] IP Anonymisation

Open rolebi opened this issue 4 years ago • 22 comments

Hi,

Is there a way to anonymise the IP address that is stored with each log? This is required to comply with privacy regulations in Europe (GDPR/ePrivacy).

The full IP shouldn't be available at all in the source data and shouldn't be used for GeoIP resolution. Only the "pseudo-anonymized" IP can be stored and used for GeoIP resolution.

For lack of an official source, I refer you to the GA documentation: https://support.google.com/analytics/answer/2763052?hl=en

This feature is designed to help site owners comply with their own privacy policies or, in some countries, recommendations from local data protection authorities, which may prevent the storage of full IP address information.

The IP anonymization feature in Analytics sets the last octet of IPv4 user IP addresses and the last 80 bits of IPv6 addresses to zeros in memory shortly after being sent to the Analytics Collection Network. The full IP address is never written to disk in this case.
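For illustration, the truncation described in the quoted GA documentation (zeroing the last octet of an IPv4 address and the last 80 bits of an IPv6 address) could be sketched as follows. This is a minimal sketch using Python's standard `ipaddress` module; the function name `anonymize_ip` is hypothetical and is not part of any Datadog or Google API.

```python
import ipaddress


def anonymize_ip(address: str) -> str:
    """Zero the last 8 bits of an IPv4 address or the last 80 bits of an IPv6 address."""
    ip = ipaddress.ip_address(address)
    # Bits to clear, following the GA description quoted above.
    cleared_bits = 8 if ip.version == 4 else 80
    # Mask with ones in the kept (leading) bits and zeros in the cleared (trailing) bits.
    mask = ((1 << ip.max_prefixlen) - 1) ^ ((1 << cleared_bits) - 1)
    # Rebuild with the same address class so small IPv6 values are not read as IPv4.
    return str(type(ip)(int(ip) & mask))


print(anonymize_ip("203.0.113.42"))
print(anonymize_ip("2001:db8:85a3:8d3:1319:8a2e:370:7348"))
```

To match the requirement stated in this issue, such a truncation would have to run before the address is stored or passed to GeoIP resolution.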

rolebi avatar Mar 30 '20 13:03 rolebi

Hi @rolebi

This feature is coming very soon. The choice between all these options will be available in Datadog's interface.

hdelaby avatar Mar 30 '20 14:03 hdelaby

Great : )

rolebi avatar Mar 30 '20 14:03 rolebi

Very interested in this as well. We would love to use RUM and Logs, but because we can't anonymise IPs to be GDPR compliant, we are not using them. Great that you are working on it :)

hereismass avatar Jul 03 '20 09:07 hereismass

Is there any news on this? We're also looking to anonymize network details or have an option to remove them altogether.

chrys-unito avatar Jul 27 '20 19:07 chrys-unito

We would also love to have this feature. @hdelaby is there any update regarding this?

What we did to make it work right now is:

  • We cloned the browser logs pipeline (to be able to manipulate it)
  • We disabled the geoip processor
  • We added a new string builder processor on the network.client.ip attribute path and replaced its value with [removed]
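The steps above could be expressed roughly as the following processor definitions for the cloned pipeline. This is only a sketch: the helper `build_redacting_processors` is hypothetical, and the field names (`type`, `is_enabled`, `sources`, `target`, `template`, `is_replace_missing`) are assumed to follow Datadog's Logs Pipelines API, so they should be checked against the current API reference before use. In the thread, the same configuration is done through the Datadog UI.

```python
import json


def build_redacting_processors(ip_attribute="network.client.ip", placeholder="[removed]"):
    """Hypothetical helper: processors for a cloned browser logs pipeline."""
    # GeoIP processor kept in the pipeline but disabled, so no geo data is derived.
    geoip = {
        "type": "geo-ip-parser",  # assumed processor type name
        "name": "GeoIP (disabled)",
        "is_enabled": False,
        "sources": [ip_attribute],
        "target": "network.client.geoip",
    }
    # String builder processor that writes a fixed placeholder over the IP attribute.
    redact_ip = {
        "type": "string-builder-processor",  # assumed processor type name
        "name": "Redact client IP",
        "is_enabled": True,
        "template": placeholder,
        "target": ip_attribute,
        "is_replace_missing": False,
    }
    return [geoip, redact_ip]


print(json.dumps(build_redacting_processors(), indent=2))
```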

Hope this is helpful for others, but we need a proper solution for this. As it stands, this makes the feature unusable for companies in the EU.

tchock avatar Nov 30 '20 00:11 tchock

Hi @tchock thanks a lot for raising this. Apologies for the delay! Here is the situation for now:

RUM: It is possible to keep the geoip data (country, city, etc.) while getting rid of the IP address. A configuration option will be available in a settings page in the UI in Q1. For now, these requests will need to go through [email protected].

Browser Logs: The workaround suggested above is the right one if you want to remove all geoip information and will be documented appropriately. We will also document an alternate version of this workaround in order to keep all geoip information without storing the IP address. Our support will also be able to help configure it.

The vast majority of users actually need the IP address and geoIP data, which is why both are enabled by default. On logs specifically, we are constrained by how integration pipelines work: there's no simpler way to customize them. Once again, thank you for your patience here. I will reply with the appropriate documentation links once they're live.

hdelaby avatar Dec 01 '20 15:12 hdelaby

Any updates on this? :D

henningms avatar Apr 27 '21 09:04 henningms

+1 for updates. There's mention of the workaround above being documented. Did this ever happen? Many thanks

willhowlett avatar Aug 02 '21 19:08 willhowlett

any news ?

omaratpxt avatar Aug 24 '21 05:08 omaratpxt

Update?

alexander-schneider avatar Oct 01 '21 11:10 alexander-schneider

> This feature is coming very soon. The choice between all these options will be available in Datadog's interface.

Hi, @hdelaby Any updates on the feature?

AdelUnito avatar Sep 29 '22 17:09 AdelUnito

Hello,

The situation is still the same: to remove IP addresses from RUM data, you need to go through [email protected]. We still want to build something in the UI and have planned work around that, but there is no ETA to share yet.

We'll let you know here if we have any update on the topic.

bcaudan avatar Sep 30 '22 08:09 bcaudan

@bcaudan, we don't use RUM, but we still want to avoid logging IP/geo in the regular browser intake. We've contacted support, and they've only linked us to beforeSend etc., which ofc does not work. AFAIK, the network part is not added to the logs in the SDK here, but at the ingestion level (or similar, outside of our control).

Any recommendation? Is that something support can solve, as it does for RUM?

johnkors avatar Oct 24 '22 09:10 johnkors

@johnkors for browser logs, did you try the mentioned workaround?

bcaudan avatar Oct 24 '22 10:10 bcaudan

@bcaudan Not sure how that would work for browser logs. We're never sending anything related to network. It's appended on Datadog's servers.

johnkors avatar Oct 24 '22 10:10 johnkors

> @bcaudan, we don't use RUM, but we still want to avoid logging IP/geo in the regular browser intake. We've contacted support, and they've only linked us to beforeSend etc., which ofc does not work. AFAIK, the network part is not added to the logs in the SDK here, but at the ingestion level (or similar, outside of our control).
>
> Any recommendation? Is that something support can solve, as it does for RUM?

One workaround that ensures the IP/geo information is never forwarded from the clients to Datadog, regardless of whether it's stored or not (it would otherwise still show up in access logs, etc.), is to set up a simple HTTP proxy between your clients and Datadog.
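As a rough illustration of the proxy idea, the sketch below shows the forwarding step of such a proxy: it drops headers that commonly carry the original client address before relaying the request body upstream, so the upstream only sees the proxy's own address. The names `scrub_headers` and `forward`, the header list, and the `upstream_url` parameter are all assumptions made for this example. The actual intake URL and the browser SDK option used to point it at a proxy are not covered in this thread and should be taken from Datadog's documentation.

```python
import urllib.request

# Headers that commonly carry the original client address through proxies (assumed list).
CLIENT_IP_HEADERS = {
    "x-forwarded-for",
    "x-real-ip",
    "forwarded",
    "true-client-ip",
    "cf-connecting-ip",
}
# Headers that urllib recomputes for the upstream request.
RECOMPUTED_HEADERS = {"host", "content-length", "connection"}


def scrub_headers(headers: dict) -> dict:
    """Return a copy of the headers without client-IP or recomputed headers."""
    dropped = CLIENT_IP_HEADERS | RECOMPUTED_HEADERS
    return {name: value for name, value in headers.items() if name.lower() not in dropped}


def forward(upstream_url: str, body: bytes, headers: dict, timeout: float = 5.0) -> int:
    """Relay a POST body to the upstream intake and return the HTTP status code."""
    request = urllib.request.Request(
        upstream_url, data=body, headers=scrub_headers(headers), method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Note that the proxy's own access logs would still contain client IPs unless they are configured otherwise.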

henningms avatar Oct 24 '22 10:10 henningms

@henningms Hi ;) Yeah, that's our last resort.

johnkors avatar Oct 24 '22 10:10 johnkors

> @henningms Hi ;) Yeah, that's our last resort.

Hi! 😂

It's quickly becoming my default in projects 😅 It allows us to control what is sent and eases the minds of the legal/GDPR team.

henningms avatar Oct 24 '22 10:10 henningms

> @johnkors for browser logs, did you try the mentioned workaround?

> @bcaudan Not sure how that would work for browser logs. We're never sending anything related to network. It's appended on Datadog's servers.

The mentioned workaround allows you to customize what is done by Datadog's servers.

bcaudan avatar Oct 24 '22 11:10 bcaudan

> The mentioned workaround allows you to customize what is done by Datadog's servers.

Sorry, I misread "cloning" as a code change in this repo (as in a fork). My fault. I'll try out the pipeline mods. Thanks.

johnkors avatar Oct 24 '22 11:10 johnkors

Any update on this request? Seems like a feature that many would find useful. The workaround mentioned above might not be viable for everyone.

JacquesDoubell avatar Jul 18 '23 13:07 JacquesDoubell

Hello,

Here is the current state:

RUM: You can choose whether or not to include IP or geolocation data from the Datadog UI; more details are in the doc.

Logs: You can remove geolocation data by:

  • cloning the browser logs pipeline (to be able to manipulate it)
  • disabling the geoip processor

You can anonymise the IP by:

  • creating a new pipeline after the browser logs pipeline
  • adding a string builder processor to replace the network.client.ip attribute value with [removed]

bcaudan avatar Jul 25 '23 09:07 bcaudan