import / export of user data

Open leuc opened this issue 7 years ago • 32 comments

A user wants to import/export data between (pixelfed) instances

Included data:

  • User profile
  • Uploaded images
  • Image descriptions
  • Image likes
  • Comments on images
  • Followers / Following

Open questions about the implementation

What intermediate format should be used?

  • [ ] JSON for direct migration between Pixelfed instances?
  • [ ] What JSON namespace?
  • [ ] Also export an existing XML namespace for a more generic export?
  • [ ] Follower / Following data compatible with other Fediverse Software?

Federated data

  • [ ] Exported comments and like data must include a full reference to the source instance user id (in the form @user@instance) of the person who wrote the comment or liked the post.
  • [ ] However, the exporting user might want to change identity between instances. Exported data must clearly define what the user id was, so it can be mapped to a new id on import.
  • [ ] Import should not trigger federation?
  • [ ] Like/comment data could easily be faked? Verification on import? How?
  • [ ] Direct transfer between instances without the need of manual export / import? (Requires that both instances are available, which might not be the case in such situations.)
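
The identity-mapping question above could be handled on import roughly like this. It is only a minimal sketch, assuming exported activities are ActivityStreams-style dicts with a string `actor` field; `remap_actor`, `id_map` and the `originalActor` field are hypothetical names for illustration, not something Pixelfed or ActivityStreams defines.

```python
from copy import deepcopy


def remap_actor(activity: dict, id_map: dict[str, str]) -> dict:
    """Return a copy of `activity` with a known old actor id replaced by its new id.

    Ids that are missing from `id_map` (e.g. remote commenters) are left untouched.
    """
    remapped = deepcopy(activity)
    old_id = remapped.get("actor")
    if isinstance(old_id, str) and old_id in id_map:
        remapped["actor"] = id_map[old_id]
        # Hypothetical field: keep the old id so the mapping stays auditable.
        remapped["originalActor"] = old_id
    return remapped
```

Keeping the old id next to the new one means the import can be checked or undone later; whether that belongs in the activity itself or in a separate mapping file is an open design choice.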

Images

User does not want to import/export images individually

  • [ ] Images could be packed into multi-volume archives?
  • [ ] Destination instance needs to know about the existence of all images before the upload?
  • [ ] Maintain or renew all image URL hashes on import?
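
One way to think about the first two image questions: ship a manifest ahead of the archives, so the destination knows every image before the upload starts, and group the images into size-limited volumes. The sketch below is an assumption about how that might look, not an existing Pixelfed format; `build_manifest`, `split_into_volumes` and the manifest keys are hypothetical.

```python
import hashlib


def build_manifest(images: dict[str, bytes]) -> list[dict]:
    """Describe every image up front so the importer knows what to expect."""
    return [
        {"name": name, "size": len(data), "sha256": hashlib.sha256(data).hexdigest()}
        for name, data in sorted(images.items())
    ]


def split_into_volumes(manifest: list[dict], max_bytes: int) -> list[list[str]]:
    """Greedily group image names into volumes of at most `max_bytes` each.

    An image larger than `max_bytes` ends up in a volume of its own.
    """
    volumes: list[list[str]] = []
    current: list[str] = []
    used = 0
    for entry in manifest:
        # Start a new volume when the next image would not fit.
        if current and used + entry["size"] > max_bytes:
            volumes.append(current)
            current, used = [], 0
        current.append(entry["name"])
        used += entry["size"]
    if current:
        volumes.append(current)
    return volumes
```

The checksums also let the destination detect missing or corrupted files after each volume arrives.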

leuc avatar Feb 06 '19 01:02 leuc

1) Format

1.1-1.3) JSON for direct migration between Pixelfed instances? namespace? XML for generic export?

Export and import should be in ActivityPub JSON format, to allow possible import into formats other than Pixelfed's internal database representation.

  • The namespace would be W3C's activitystreams
  • XML is not required at all for generic import/export -- ActivityPub is a standardized format already.
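
To make that concrete, here is a rough sketch of what an export in the activitystreams namespace could look like. The input shape (`id`, `caption`, `published`, `images`) and the function name `export_outbox` are assumptions for illustration, and the choice of `Note` with `Image` attachments is one plausible mapping, not a confirmed Pixelfed format.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def export_outbox(actor_id: str, posts: list[dict]) -> dict:
    """Wrap posts as Create activities inside an OrderedCollection."""
    items = [
        {
            "type": "Create",
            "actor": actor_id,
            "published": post["published"],
            "object": {
                "id": post["id"],
                "type": "Note",
                "attributedTo": actor_id,
                "content": post["caption"],
                "published": post["published"],
                # `name` carries the image description (alt text).
                "attachment": [
                    {"type": "Image", "url": url, "name": description}
                    for url, description in post["images"]
                ],
            },
        }
        for post in posts
    ]
    return {
        "@context": AS_CONTEXT,
        "type": "OrderedCollection",
        "totalItems": len(items),
        "orderedItems": items,
    }
```

Calling `json.dumps(export_outbox(actor_id, posts), indent=2)` would then give the file to put in the archive.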

1.4) Follower/following compatibility?

Follower/following data is not compatible with any other software right now, because Actor identities are dereferenced internally and the ID system is up to each individual project to implement.

It is theoretically possible to reconstruct data perfectly with an offline import if you have a) the private key (to establish ownership of the content/account), b) some globally unique identifier that is signed by the private key (to establish that it is the same content/account), and c) the following/follower lists, so that you can send an Update/Move of your Actor (and content?) to them with your new location. That last step requires both implementations to support Update/Move of the necessary fields, like id and url.

It is also possible to do an online import with just a) and b), if the original server is still online -- the original server can provide c) for you. It's also possible to drop b), since no agreed-upon global UID standard exists within ActivityPub; the expectation is that each implementation will choose its own internal system for identifying/deduplicating accounts. But this makes it harder to identify/deduplicate accounts consistently across softwares or servers.
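
For illustration, the Move step could look roughly like the sketch below. It assumes the common shape where `actor` and `object` are both the old account and `target` is the new one; `build_move` and the inbox list are hypothetical. Actual delivery, which would need an HTTP request signed with the private key from a), is deliberately left out.

```python
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def build_move(old_actor: str, new_actor: str, follower_inboxes: list[str]) -> list[tuple[str, dict]]:
    """Pair each distinct follower inbox with the Move activity to deliver there."""
    move = {
        "@context": AS_CONTEXT,
        "type": "Move",
        "actor": old_actor,
        "object": old_actor,  # the account being moved
        "target": new_actor,  # where it lives now
    }
    # Shared inboxes repeat across followers, so deduplicate while keeping order.
    unique_inboxes = list(dict.fromkeys(follower_inboxes))
    return [(inbox, move) for inbox in unique_inboxes]
```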

The last 3 paragraphs are explained a bit more in-depth at #216 but this is what you should know, at least.

At minimum, you can export following list only, and have a new account re-follow all old accounts that you were following.
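
A minimal sketch of that fallback, assuming the exported following list is just a list of actor ids; `refollow_activities` is a hypothetical helper, and sending the activities is not shown.

```python
def refollow_activities(new_actor: str, following: list[str]) -> list[dict]:
    """Create one Follow activity per previously followed account."""
    return [
        {
            "@context": "https://www.w3.org/ns/activitystreams",
            "type": "Follow",
            "actor": new_actor,
            "object": target,
        }
        for target in following
        if target != new_actor  # never follow yourself
    ]
```

Each remote server still has to accept the Follow, so accounts that approve followers manually will need to approve the new account again.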

2) Federated data

2.1) Exported comments/likes

This can also be in ActivityPub JSON documents, but the expectation is that those activities are stored by the server of the Actor, not necessarily by your own server. You can still export a list of account names at the time of export -- there is no way to perfectly maintain the identities of each commenter or liker, and no way to import at all; the Actor's server must Update its own records or Move your posts to point to your posts' new locations.
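
The "list of account names at the time of export" could be collected along these lines. This is a sketch under assumptions: it takes the Like and Create activities your server has received as plain dicts, and `interactions_by_post` is a hypothetical name.

```python
from collections import defaultdict


def interactions_by_post(activities: list[dict]) -> dict[str, dict[str, list[str]]]:
    """Group liker and commenter ids by the post they reacted to."""
    result: dict = defaultdict(lambda: {"likes": [], "comments": []})
    for activity in activities:
        kind = activity.get("type")
        obj = activity.get("object")
        if kind == "Like" and isinstance(obj, str):
            result[obj]["likes"].append(activity["actor"])
        elif kind == "Create" and isinstance(obj, dict) and obj.get("inReplyTo"):
            # A comment is a Create whose object replies to one of our posts.
            result[obj["inReplyTo"]]["comments"].append(activity["actor"])
    return dict(result)
```

The result is only a snapshot for the user's own records; as noted above, it cannot be imported as real likes or comments.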

2.2) Changing identity between instances

See #216

2.3) Import should not trigger federation

This is open to interpretation; the data and activities can be imported but with prior dates. Each software must negotiate data transfer of older content -- either your software redelivers new copies of the content, or you send Update activities of the new locations, or the remote software backfills statuses based on its own fetching strategies -- see https://github.com/tootsuite/mastodon/issues/34 for more.

2.4) Verification on import

See 2.1 -- this would be done by checking signatures on your own content (to ensure the private key matches the content), and then sending signed Update/Move to every other Actor involved.

2.5) Direct transport between instances

See #216 or the "online import" strategy described in 1.4

3) Images

3.1) Multi-volume archives

I don't see a particular reason why archives should specify a type of volume. This does however add the implication that software must support both partial import (append) and full import (overwrite).
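
The difference between the two import modes can be sketched as follows, assuming each archive volume is a mapping from post id to post data; `import_archive` and its `mode` values are hypothetical names.

```python
def import_archive(existing: dict[str, dict], volume: dict[str, dict], mode: str = "append") -> dict[str, dict]:
    """Merge one archive volume into already imported posts, keyed by post id.

    "append" keeps existing posts and only adds new ones (partial import);
    "overwrite" discards existing data and keeps only the volume (full import).
    """
    if mode == "append":
        merged = dict(existing)
        for post_id, post in volume.items():
            merged.setdefault(post_id, post)
        return merged
    if mode == "overwrite":
        return dict(volume)
    raise ValueError(f"unknown import mode: {mode!r}")
```

With append, importing the same volume twice is harmless, which matters if an upload is interrupted and retried.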

3.2) Establish existence of images

Same as 2.4

3.3) Maintain image URLs

Same as #216 or 2.3/2.4.

trwnh avatar Feb 06 '19 03:02 trwnh

Adding my vote to this: this is kind of a big deal. I just realized we can import from an Instagram backup, but Pixelfed doesn't offer a way to back up our own data.

Having this is a pretty big step towards "allow users to own their data", not to mention easy data migration between instances.

bitinn avatar May 31 '20 16:05 bitinn

What about the idea of being able to sync from Instagram to Pixelfed? That way I can still use Hootsuite. The idea is to get Hootsuite to support Pixelfed for those wanting Instagram alternatives.

Goldmaster avatar Jun 26 '20 11:06 Goldmaster

Hello, sorry for dropping in out of the blue, but…

is this feature planned to be developed anytime soon? I am really looking forward to it.

xplosionmind avatar Jun 11 '21 05:06 xplosionmind

Extremely important request for user freedom, so just doubling down.

cloo avatar Oct 24 '22 21:10 cloo

Hello @dansup, any news on this? Thanks

KaKi87 avatar Jun 24 '23 17:06 KaKi87

Wait, do you mean that I can import from Instagram but not from another Pixelfed instance? o_O

h-2 avatar Jun 26 '23 00:06 h-2

I think devs have a different idea about priorities. I don't even notice them reacting much to github issues.


I see two signs of a short-term project: one man band and feature-creep (bells'n'whistles often prioritized against strategic strength). I hope @dansup will find ways to attract more collaborators before it becomes unmanageable.

cloo avatar Jun 26 '23 09:06 cloo

I'm working on IG Import right now. It may seem like I don't respond to issues immediately; usually that's because the reporter mentioned it in our Matrix or Discord and I shipped a fix without closing or responding to the relevant issues.

Will work on that, and I'm trying to make the project more encouraging to contributors.

dansup avatar Jun 26 '23 09:06 dansup

Thanks @dansup! Don't get me wrong, you are a hero, and I'm really rooting for the prosperity of Pixelfed. This is why I'm wasting my time reporting issues and also the mood on the user side. Be careful not to burn out too, this is a security vulnerability for the whole community!

cloo avatar Jun 26 '23 09:06 cloo

My two cents: I think a feature to export all the data (including photos, comments, likes, etc.) should have a higher priority, because it could help all Pixelfed instances to be GDPR compliant: https://gdpr-info.eu/art-20-gdpr/

In particular:

The data subject shall have the right to receive the personal data concerning him or her, which he or she has provided to a controller, in a structured, commonly used and machine-readable format

Also: Pixelfed instances are run by volunteers and they may disappear overnight. Being able to do a backup of all the data is quite important.

andreagrandi avatar Jun 30 '23 06:06 andreagrandi

Any news about exporting data for users with more than 500 statuses? Any tips while waiting for a new feature? 🫶

pylapp avatar Nov 21 '23 12:11 pylapp

I'd like the import/export feature for the reasons mentioned above. Additionally, it might be useful for bulk uploading.

gidTT avatar Nov 27 '23 16:11 gidTT

I'm currently facing the problem of having to switch instances. I have backed up all the data from my old instance and can only find the option to import data from Instagram on the new instance. Is there actually no way to move existing data between two Pixelfed instances?

MatKlein avatar Dec 12 '23 08:12 MatKlein

Bump, because I'd really like to see this happen. I need to migrate from one instance to another because my old instance is stuck on an old version.

bigjdunham avatar Dec 25 '23 19:12 bigjdunham

We do support IG Import currently, but we need to improve our data export and fix the account migration bug.

I will keep this issue open until it is fixed.

dansup avatar Jan 03 '24 10:01 dansup

Is there any update on this feature? I'm looking to migrate to another instance as my current server is becoming increasingly unstable. It would be great to take my archive of photos with me.

grsharp avatar Jan 29 '24 09:01 grsharp

Yeah, I've more or less stopped using Pixelfed because not having a functional data export is a bit "shades of Meta / Instagram."

GittingAnAccount avatar May 31 '24 13:05 GittingAnAccount

Any news about this feature @dansup @shleeable?

I would like to have a backup of my statuses and collections, but it seems the feature only works for fewer than 500 publications and I have many more. If this feature is not ready or planned yet, could you share some documentation, if possible, to help me write a script for making such backups?

Thank you a lot :)

pylapp avatar Jun 24 '24 20:06 pylapp