
Provide a Rake task to delete local/remote profile data as per GDPR requirements

Open jaywink opened this issue 5 years ago • 13 comments

Received my first GDPR user data deletion request for the pod I run. Currently Diaspora doesn't allow deleting all the data related to a local and/or remote profile. Under GDPR, users have the right to be forgotten.

As per #6221 for example when a user deletes their account, their comments will not be deleted, which AFAICT is a direct violation of what GDPR requires. Additionally podmins don't have the possibility to delete local / remote profiles.

At the minimum, it would be nice to have here a Ruby snippet to delete all contents/posts/replies/etc of a remote user. Preferably before September 23rd, when I will be sued if I don't comply.

jaywink avatar Aug 23 '18 20:08 jaywink

We do have such a script, which we used when all the EI accounts arrived. @SuperTux88 wrote it if I remember correctly. The question I'm wondering about is: is breaking the link between users and their comments by making them anonymous enough regarding GDPR, or should they be removed completely?

Flaburgan avatar Aug 24 '18 07:08 Flaburgan

You would need a lawyer to comment on that. But breaking the link between the user and their content will hardly hold up in any courtroom. Why delete anything? Just break the link between the user's posts and photos and you don't need to delete anything then. I'm sure if GDPR allowed that it would say "you don't need to remove the data, just make sure it can't be found through the user". Instead it does say "you must remove the data" - and also from internal databases, not just public serving databases.

I'm not sure that, as a platform that advertises using the word "privacy", it is a good idea to deny internet users the right to remove their content. Even Facebook allows you to remove data, but they are GDPR compliant.

Any idea where I could find that script? I had some old script but it didn't work any more due to some internal changes I guess.

Thanks

jaywink avatar Aug 25 '18 17:08 jaywink

might be this one? https://gist.github.com/jhass/9104029

Waithamai avatar Aug 30 '18 19:08 Waithamai

Thanks @Waithamai will try it!

In this case it might be enough since the user using their GDPR rights is not on my server. But the script as far as I understand doesn't delete local comments, so it wouldn't be sufficient if the user was local.

jaywink avatar Sep 01 '18 17:09 jaywink

It works partially. What of course isn't removed are the person and profile attributes. So I did some console data mangling. One interesting thing is the diaspora handle, which had to be changed to something unidentifiable too, since it can certainly be considered PII. Probably will cause interesting exceptions since the person guid is taken (and public key blocked), but handle is not found in the db :P
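For illustration only, the handle scrubbing described above could look roughly like the sketch below. `anonymized_handle` is a hypothetical helper, not part of diaspora, and it assumes the pod host may be kept while the username part is replaced with random hex.

```ruby
require "securerandom"

# Hypothetical helper, not part of diaspora: builds a replacement handle that
# carries no trace of the original username. Assumption: keeping the pod host
# is acceptable; only the user part is treated as PII here.
def anonymized_handle(handle)
  _user, host = handle.split("@", 2)
  raise ArgumentError, "expected user@host, got #{handle.inspect}" if host.nil? || host.empty?

  # 8 random bytes rendered as 16 hex characters
  "deleted-#{SecureRandom.hex(8)}@#{host}"
end
```

On a single-user pod the host itself may identify the person, in which case it would need replacing as well.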

jaywink avatar Sep 01 '18 18:09 jaywink

ActivityPub was not designed with GDPR in mind.

I don't understand how that could be possible. You simply cannot force the recipient of a delete request to carry out your request. This is not a problem of ActivityPub, it's a problem of the federated model. And not even that. The problem exists on centralized platforms too. Say my app uses a commercial logger system to store logs, and they claim they will delete them, thus satisfying GDPR requirements to my customers (if I tell them logs are stored there). If I then tell them to delete the logs and they don't, there is nothing I can do. Federated platforms just "suffer" from this problem more, since they by design push data around.

Saying this, the Diaspora and ActivityPub protocols offer the same possibilities for applications to implement GDPR, if such a thing can even be said, since implementing GDPR is not really a protocol thing, it should go through application design as a whole.

Some good reading: https://www.smashingmagazine.com/2017/07/privacy-by-design-framework/

jaywink avatar Sep 02 '18 07:09 jaywink

Btw, with my project, Socialhome, I went with the Matrix GDPR implementation, ie doing what can be done locally but ensuring that users understand and accept that data is pushed to an unknown amount of other servers which are out of my control.

See https://matrix.org/blog/2018/05/08/gdpr-compliance-in-matrix/ . Personally I think this is the best model I've seen so far.

jaywink avatar Sep 02 '18 07:09 jaywink

How does GDPR handle email? If you email me from a GDPR country, there’s no deleting that content from my inbox, server, etc.

if a receiver is not ready to follow EU or Californian rules, then they should not accept my content.

This is undermining the whole purpose of federation, where pods shouldn’t be beholden to one government’s rules about e.g., censorship. If you don’t want my server handling your data, don’t send it.

koehn avatar Sep 02 '18 18:09 koehn

Yes, as it stands, the whole federation is one step from being dismantled since most servers are in Europe. Obviously one way would be to move the server to a jurisdiction that doesn't care about human rights.

That is simply not true. I think you misunderstand what GDPR means. Server location means nothing in GDPR. GDPR and federated servers can be friends as long as the users are given the right tools and the policies are clearly documented. This is not the case with Diaspora, but that doesn't mean you can't have a federated model with GDPR built into it. I believe Matrix have done a good job here, please read the post linked before.

Anyway, this is the Diaspora issue tracker and this issue is not about GDPR vs federated models. Before the project admins jump on this and lock it, maybe better move the discussion to another more relevant place.

And btw, he/she who controls the relay servers on the federation is the master of it all.

This is totally off-topic, even to the GDPR discussion, since the relay servers don't store PII at all - they just push stuff around. If the podmins sending stuff to the relay don't highlight this in their privacy policy, they are failing their users. That is not the issue with the relay system - no server has to send stuff to them.

I am the admin of one relay server, currently I know of two. I'm not sure how that should make me feel a master of it all, but it doesn't really excite me that much. TBH, mostly it has been a nuisance :D

jaywink avatar Sep 02 '18 19:09 jaywink

Besides, unless you haven't noticed, there are very few developers on the federation and diaspora software.

Actually quite the contrary, the federated platforms have more developers than ever before.

If these developers get sued, I doubt the federation will continue to exist as it stands if they don't comply with existing laws in their countries

So far I've only heard of one person intending to sue the developers - you.

( most if not all core dev people are in Europe ).

Please don't confuse Diaspora with all of the federated networks. Is your disagreement with Diaspora or federated networks as a whole? In the first case, yes I believe you are probably right. In the latter case, no.

+1 this feature is a legal nightmare, people don't even understand or know it exists. I don't share with anyone on joindiaspora.com ( located in Germany ) and yet, my posts end up there :/

This could happen through reshares, which is basically all the relay does. It's not a problem of the relay software if the privacy policy of your server doesn't indicate it. It's like blaming Sentry for having PII when a platform integrating with it sends it.

I have never intended to really sue you all

So you just go around the internet threatening to sue people who spend their free time trying to make tools for other people without the actual intention?

I am just a security QA guy, that's what I do for a living.

Privacy != Security.

I happen to also work in an area where GDPR is taken very, very seriously. I'm still not sure what your agenda is, but the way you have been bringing it out to people is not helping your case. Maybe try keeping all the other discussion as civilized as it has been on this thread. Thank you for that.

Still, this is, again, horribly off-topic. We both agree that Diaspora is not GDPR compliant (I've said that many times here too). Let's stop spamming the developers with off-topic generic discussion in a place where it doesn't belong.

jaywink avatar Sep 02 '18 20:09 jaywink

comments by making them anonymous is enough regarding GDPR, or should them be removed completely [in later comment by Jason:] You would need a lawyer to comment on that

Well. I asked one, and the answer is... who knows. The thing with GDPR is that it is one of those pieces of legislation written so horribly vague that it might be perfectly fine, or it might not, probably depending on the precedent's judge's mood. So... hah. Fun.

As @koehn noticed with the email example, GDPR totally lacks an understanding of how data distribution in complex and connected applications actually works. Even more hilarious, there is zero way for any legal entity to actually verify that user data actually was deleted, besides a letter from the application's legal team claiming so. As much as I am in support of stricter privacy laws, they have to be implemented in a way that actually works. Anyway, this is not a legal forum, and I should probably stop here.

As for the original issue, I took the freedom to split it into two. #7855 is now tracking the option for end-users to delete and retract comments and likes on closing their accounts. I renamed this task to be about a rake task that can be used to delete user data. It should probably take a diaspora ID as parameter, check if it's a local user, delete and retract if so [insert sarcastic remark about how retracting content from remote pods isn't even needed according to GDPR], and just delete otherwise. This could also be used as an embedded tool for spam control, so it should probably be something like @jhass's script, which should have been integrated and maintained by the core project a while ago. :)
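The control flow proposed above can be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not diaspora code: `Person`, `Pod` and `purge_profile` are hypothetical in-memory stand-ins, and a "retraction" is modelled as a queued entry rather than a real federation message.

```ruby
# Hypothetical sketch of the proposed task's control flow. Nothing here is
# diaspora's real API: Person, Pod and purge_profile are illustrative stand-ins.
Person = Struct.new(:diaspora_id, :local, :items)

class Pod
  attr_reader :retractions

  def initialize(people)
    @people = people.to_h { |person| [person.diaspora_id, person] }
    @retractions = []
  end

  def find(diaspora_id)
    @people[diaspora_id]
  end

  # Delete everything a person authored and return how many items were removed.
  # For local users, first queue one retraction per item so remote pods are
  # asked to drop their copies; for remote users, only delete locally.
  def purge_profile(diaspora_id)
    person = find(diaspora_id)
    raise ArgumentError, "unknown diaspora ID: #{diaspora_id}" if person.nil?

    @retractions.concat(person.items.map { |item| [:retract, item] }) if person.local
    deleted = person.items.size
    person.items.clear
    deleted
  end
end

pod = Pod.new([
  Person.new("alice@pod.example", true, %w[post-1 comment-7]),
  Person.new("bob@remote.example", false, %w[comment-9])
])
pod.purge_profile("alice@pod.example")  # local: retract, then delete
pod.purge_profile("bob@remote.example") # remote: delete only
```

In an actual Rakefile this logic would presumably sit behind a task that receives the diaspora ID as its argument. The real work of deleting attached metadata and dispatching retractions would have to go through the application's own models rather than anything shown here.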

I have marked this issue as blocking-release without a target release, just to make sure someone feels compelled to work on this with higher-than-average priority, without actually blocking any specific release.


Janitor work.

I took the liberty of hiding the shell script in the comment above to avoid people even considering running it. Manually manipulating the database is a bad idea in general, but it's even worse here. Deleting rows in the database does not delete some attached metadata, and it does not submit retractions for local content. Using the provided script for anything is just a bad idea.

Also, @indyfractal I deleted three of your comments as they were completely off-topic and beyond the edge of friendliness. I'd probably delete more if people hadn't responded. In any case, this is not a place for you to discuss your legal opinions, threaten other people, make wild assumptions, or start philosophical discussions. This is a bug tracker, and diaspora*'s bug tracker for that matter. Before contributing or commenting any further, please read this and this.

denschub avatar Sep 04 '18 03:09 denschub

So.... shouldn't we move the GDPR discussion to discourse? IMO that should be the place for discussion. The operational/technical part can be discussed here.

robbosch avatar Sep 11 '18 06:09 robbosch

With PR https://github.com/diaspora/diaspora/pull/8249 it will be possible to delete all of a user's data from an admin view.

tclaus avatar Jun 14 '21 13:06 tclaus