lemmy
Implement a way for a user to export their account data.
Most social media platforms have a way to export all of your account data into a downloadable zip file, including all your posts, comments and uploaded files. This is good for privacy, as it lets users easily check what information the website holds on them, and it also lets them back up or archive their own data offline. I believe GDPR and CCPA also require this feature, at least for the user's personal information (along with a way to permanently delete that information), so we should add it to the list of things to implement before Lemmy gets out of beta.
Okay so what kind of data does this need to include? And what should be the goal of this export, to import it somewhere else, or to view it locally?
I would try to export anything that can be attributed to a user, but it should at least include posts, comments, votes and settings. Also images uploaded to pictshare and maybe things like subscriptions. At the very least, anything that contains the user's personally identifiable information should be exportable in compliance with privacy laws.
As for what it's for, I think the main goal would be to have an easy way to back it up at first. However, a restore system would make a lot of sense when Lemmy becomes federated to allow users to switch instances without losing their content.
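To make the backup idea concrete, here is a minimal sketch of bundling the data listed above into a zip file, one JSON file per category. The function name `build_export_zip` and the category names are assumptions for illustration only, not Lemmy's actual export format.

```python
import io
import json
import zipfile


def build_export_zip(user_data):
    """Pack each category of user data into its own JSON file inside a zip.

    `user_data` maps a hypothetical category name (e.g. "posts") to a
    JSON-serializable list or dict. Returns the zip archive as bytes.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for category, records in user_data.items():
            zf.writestr(f"{category}.json", json.dumps(records, indent=2))
    return buf.getvalue()


# Example with made-up data; real records would come from the database.
archive = build_export_zip({
    "posts": [{"id": 1, "name": "Hello"}],
    "comments": [{"id": 10, "post_id": 1, "content": "First"}],
    "votes": [{"post_id": 1, "score": 1}],
    "settings": {"theme": "dark"},
})
```

Uploaded images would need to be added as binary files rather than JSON; that part is left out here.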
This isn't too difficult, as a lot of the back-end DB calls can already filter by user. So this is kind of just creating a wrapper for a lot of that data. It's just way down in priority for me.
With the current surge of users migrating from Reddit, this feature would allow us to relieve some pressure on lemmy.ml, whose number of active users has grown 683% since Reddit's API announcement.
I'd say exporting and importing subscriptions would be a good first step in easing users' ability to migrate instances.
@TheYang see #3040 for just importing a list of community subscriptions with a single-click
There's a bookmarklet workaround to get a list of subscribed communities, useful if you're moving to another instance: https://feddit.de/post/808717
I'd like to recommend that this be implemented in a way that it can be offloaded to a cron job or something, possibly performed on a different service entirely. From experience, database exports can be expensive for accounts with a large amount of data, and if there's ever a rush of multiple at once it can bog down the server or database.
My initial approach for a problem like this:
- Make an export queue DB table
- Make a script/program that polls the table periodically and grabs the newest row(s?)
- Lock the row(s) it's processing (with a timeout)
- Do the work and upload the zip file somewhere
- Email a link to the user (and/or put it on a web page accessible from the UI)
If an export fails or a row lock times out (i.e., because the script crashed), unlock the row, increment a retry counter, and retry the export after some backoff interval. Once some max number of retries is reached, report the failure to the user (and admins?).
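The steps above can be sketched roughly as follows. This is only an illustration using an in-memory-friendly SQLite table; the table name `export_queue`, its columns, the constants, and the `do_export` callback are all assumptions, and it claims the oldest due row first. With several workers on PostgreSQL, the claim step would likely need something like `SELECT ... FOR UPDATE SKIP LOCKED` to be safe.

```python
import sqlite3
import time

LOCK_TIMEOUT = 600   # seconds before a held lock is treated as a crashed worker
MAX_RETRIES = 3      # attempts before the export is marked as failed
BASE_BACKOFF = 60    # seconds; doubled after each failed attempt

SCHEMA = """
CREATE TABLE export_queue (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    retries INTEGER NOT NULL DEFAULT 0,
    locked_at REAL,
    next_attempt_at REAL NOT NULL DEFAULT 0
)
"""


def record_failure(conn, row_id, now):
    """Unlock the row, bump its retry counter, and schedule a retry or give up."""
    with conn:
        (retries,) = conn.execute(
            "SELECT retries FROM export_queue WHERE id = ?", (row_id,)
        ).fetchone()
        retries += 1
        if retries >= MAX_RETRIES:
            # Out of retries: this is where the user (and admins?) would be told.
            conn.execute(
                "UPDATE export_queue SET status = 'failed', retries = ?, "
                "locked_at = NULL WHERE id = ?",
                (retries, row_id),
            )
        else:
            delay = BASE_BACKOFF * 2 ** (retries - 1)
            conn.execute(
                "UPDATE export_queue SET retries = ?, locked_at = NULL, "
                "next_attempt_at = ? WHERE id = ?",
                (retries, now + delay, row_id),
            )


def claim_next(conn, now):
    """Lock and return (id, user_id) for one due row, or None if nothing is due."""
    expired = conn.execute(
        "SELECT id FROM export_queue WHERE status = 'pending' "
        "AND locked_at IS NOT NULL AND locked_at < ?",
        (now - LOCK_TIMEOUT,),
    ).fetchall()
    for (row_id,) in expired:
        record_failure(conn, row_id, now)  # a timed-out lock counts as a failure
    with conn:
        row = conn.execute(
            "SELECT id, user_id FROM export_queue WHERE status = 'pending' "
            "AND locked_at IS NULL AND next_attempt_at <= ? ORDER BY id LIMIT 1",
            (now,),
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE export_queue SET locked_at = ? WHERE id = ?", (now, row[0])
        )
        return row


def process_one(conn, do_export, now=None):
    """Run one export if any is due. Returns True if a row was processed."""
    now = time.time() if now is None else now
    row = claim_next(conn, now)
    if row is None:
        return False
    row_id, user_id = row
    try:
        do_export(user_id)  # build the zip, upload it, email the link, etc.
    except Exception:
        record_failure(conn, row_id, now)
    else:
        with conn:
            conn.execute(
                "UPDATE export_queue SET status = 'done', locked_at = NULL "
                "WHERE id = ?",
                (row_id,),
            )
    return True
```

A cron job or separate service would call `process_one` in a loop until it returns `False`, so the web server never does the heavy lifting itself.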
Should probably also only keep the last N exports and only for M days. This can be configured in something like an S3 bucket.
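As a rough sketch of that retention rule, assuming each export is tracked as a `(key, created_at)` pair: keep an export only if it is among the newest N and no older than M days, and delete the rest. The function name and data shape are hypothetical. Age-based expiry can often be handled by the storage bucket's own lifecycle rules, in which case only the count limit would need logic like this.

```python
from datetime import datetime, timedelta, timezone


def exports_to_delete(exports, keep_last, max_age_days, now):
    """Return the keys of exports that fall outside the retention policy.

    `exports` is a list of (key, created_at) tuples. An export is kept only
    if it is among the `keep_last` newest AND at most `max_age_days` old.
    """
    cutoff = now - timedelta(days=max_age_days)
    newest_first = sorted(exports, key=lambda e: e[1], reverse=True)
    kept = {key for key, created in newest_first[:keep_last] if created >= cutoff}
    return [key for key, _ in exports if key not in kept]


# Example with made-up export keys and dates.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
exports = [
    ("a.zip", datetime(2024, 1, 30, tzinfo=timezone.utc)),
    ("b.zip", datetime(2024, 1, 29, tzinfo=timezone.utc)),
    ("c.zip", datetime(2024, 1, 20, tzinfo=timezone.utc)),
    ("d.zip", datetime(2024, 1, 1, tzinfo=timezone.utc)),
]
stale = exports_to_delete(exports, keep_last=3, max_age_days=14, now=now)
```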
I'd like to be able to back up my saved posts/comments as well.
I elaborated more in #3040 (https://github.com/LemmyNet/lemmy/issues/3040#issuecomment-1631776285), but besides what has already been requested here, being able to migrate user and community blocks is also super important.
Nice-to-haves are things like being able to migrate profile settings such as theme, default sort, etc. These are more problematic, though, because they are still changing a lot.
https://github.com/LemmyNet/lemmy/pull/3976
Not sure if I should make a new issue for this, but the Import/Export Settings section should explain how it handles adding or overwriting data. I read on a third-party site that Lemmy's import is additive and doesn't overwrite anything, and that's what gave me the confidence to try it. But that information should be in the Lemmy settings area, and I also don't want certain things added to blank entries. For example, nearly everything in the first section of the JSON, before followed_communities starts:
"display_name":xxx,"bio":xxx,"avatar":xxx,"banner":xxx,"matrix_id":xxx,"bot_account":false,"settings":{"id":xxx,"person_id":xxx,"email":"xxx",
Etc.
@MaximilianKohler You can remove parts of the export file so they don't get imported. As for adding text to the user interface, please open an issue in lemmy-ui.
You can remove parts of the export file so they don't get imported.
I tried that and it caused errors. https://github.com/LemmyNet/lemmy/issues/4307#issuecomment-2205888648
New lemmy-ui issue: https://github.com/LemmyNet/lemmy-ui/issues/2580