
forum content migration to discourse

Open cquest opened this issue 2 years ago • 16 comments

Steps identified:

  • [x] dump or access to fluxBB mysql
  • [x] initial import test -> https://community.cquest.org/ (on most recent topics only for a first test)
  • [ ] retrieve osm_id and generate OAuth2 linkage
  • [ ] avoid duplicate accounts for existing Discourse accounts (dedup based on username)
  • [x] retrieve sticky topics flag ?
  • [x] fix quotes in posts
  • [ ] fix external links in posts
  • [ ] fix internal forum links in posts, recreate Discourse internal links ?
  • [x] recreate mapping between old topic/post id to new ones
  • [ ] link redirect to new URLs
  • [ ] automatic language detection on posts to add language tag to them
  • [ ] retrieve avatars ?

In order to minimize work time on your side, I propose to test the content migration on a fresh Discourse setup I can install temporarily on OSM-FR servers (I can give you access to it if you want).

For this I’ll need a read only access to the fluxBB mysql (or do a mysql_dump of it).

Regarding user private data, the script only needs these fields in the “users” table (fluxBB column, followed by the name the script maps it to where the two differ): id, username, realname → name, url → website, email, registered → created_at, registration_ip → registration_ip_address, last_visit → last_visit_time, last_email_sent → last_emailed_at, location, group_id

If you prefer, you can limit the read only access to these fields on that table and, of course, I guarantee not to use personal data in any way outside of this migration process.
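To make the restricted access concrete, here is a rough sketch of the only query the user import would need, built from the field list above. The table name `users` (fluxBB installs often add a prefix) and the helper name `build_users_query` are assumptions for illustration, not the script's actual code.

```python
# fluxBB column -> name used on the Discourse side (taken from the list above).
USER_FIELDS = {
    "id": "id",
    "username": "username",
    "realname": "name",
    "url": "website",
    "email": "email",
    "registered": "created_at",
    "registration_ip": "registration_ip_address",
    "last_visit": "last_visit_time",
    "last_email_sent": "last_emailed_at",
    "location": "location",
    "group_id": "group_id",
}


def build_users_query(table="users"):
    """Return a SELECT limited to the columns the migration needs.

    `table` is an assumption: adjust it if the fluxBB tables use a prefix.
    """
    columns = [
        src if src == dst else f"{src} AS {dst}"
        for src, dst in USER_FIELDS.items()
    ]
    return f"SELECT {', '.join(columns)} FROM {table}"
```

A read-only grant covering just these columns would be enough for this query to run.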

Script improvements

  • I’ve seen that sticky topics are not handled by the script, I’ll check if it is possible to improve that as well as getting the avatar for the users and some additional user preferences.
  • I’ll also check how to keep the mapping between the ids of old topics and posts and the new ones, in order to create the URL redirections.
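As a sketch of how such a mapping could drive the redirections: the function below turns an old forum URL into a new Discourse path. It assumes the old URLs look like `viewtopic.php?id=<topic>` and `viewtopic.php?pid=<post>`, and that Discourse accepts `/t/<topic_id>` and `/t/<topic_id>/<post_number>`. The names `redirect_target`, `topic_map` and `post_map` are made up for the example.

```python
from urllib.parse import urlparse, parse_qs


def redirect_target(old_url, topic_map, post_map):
    """Map an old forum URL to a new Discourse path, or None if unknown.

    topic_map: old topic id -> new Discourse topic id
    post_map:  old post id  -> (new topic id, post number inside that topic)
    Both maps are assumed to have been recorded during the import.
    """
    parsed = urlparse(old_url)
    if not parsed.path.endswith("viewtopic.php"):
        return None
    query = parse_qs(parsed.query)
    try:
        if "pid" in query:
            target = post_map.get(int(query["pid"][0]))
            if target is None:
                return None
            topic_id, post_number = target
            return f"/t/{topic_id}/{post_number}"
        if "id" in query:
            topic_id = topic_map.get(int(query["id"][0]))
            return None if topic_id is None else f"/t/{topic_id}"
    except ValueError:
        # Non-numeric id in the old URL: nothing to redirect to.
        return None
    return None
```

The same lookup could also be reused to rewrite internal forum links inside post bodies.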

Once a first migration test looks fine, we can share the URL of the temporary discourse to have more eyes looking at the result to fix residual problems and iterate if required.

Finally, if the migration looks ok and is globally approved, you’ll simply have to run the updated migration script on the real instance.

Is this process ok for you ?

Regarding timing, I can spend my next week-end on that.


Regarding login management, I didn't know the OSM account was mandatory for the fluxBB forum and also for Discourse login.

As @tomhughes mentioned on Discourse, this will require additional access to login details.

Was fluxBB modified to deal with that ?

cquest avatar Mar 23 '22 14:03 cquest

fluxBB has a few modifications to validate user names + password against osm.org. I think the following branch / repo reflects the latest status: https://github.com/openstreetmap/openstreetmap-forum/commits/openstreetmap-1.5.10 - about the latest 10 commits should cover the relevant bits.

mmd-osm avatar Mar 24 '22 10:03 mmd-osm

From the point of view of a conversion I think the important thing is that the user table has an extra osm_id column which contains the numeric user ID from the main site - virtually all the users have that set.

Obviously that needs to be populated to wherever discourse stores the OAuth account link so that if the user logs into Discourse it will match them to the existing account.

There's also the issue that forum usernames and emails may not match the OSM ones - probably more important for username though it should correct if the user logs in. Obviously conflicts might result though if a forum name matches a different OSM user!
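A small sketch of how the conversion could tell these cases apart before creating accounts, assuming we can build a lookup of existing Discourse usernames and the osm_id each one is linked to. The function name, the three labels and the case-insensitive comparison are assumptions for illustration.

```python
def classify_forum_user(username, osm_id, discourse_accounts):
    """Decide what to do with one forum user during the import.

    discourse_accounts maps a lowercased Discourse username to the osm_id
    linked to that account (None if the account has no OSM link).
    Returns "create", "merge" or "conflict".
    """
    key = username.lower()
    if key not in discourse_accounts:
        return "create"    # no account with that name yet
    if osm_id is not None and discourse_accounts[key] == osm_id:
        return "merge"     # same name and same OSM user: reuse the account
    return "conflict"      # same name but a different or unknown OSM user
```

Anything flagged as "conflict" would need a manual decision or a renamed account rather than an automatic merge.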

tomhughes avatar Mar 24 '22 11:03 tomhughes

I've looked at how the OAuth2 plugin is storing its stuff in Discourse PG database.

Everything seems to be in the user_associated_accounts table establishing a link between the local user id (user_id) and the external one (provider_uid).

On my instance, provider_uid contains the osm_id for the user. No emails are stored in that table, and I think they are not needed for the migration.


id            | 2
provider_name | oauth2_basic
provider_uid  | 158826
user_id       | 17
last_used     | 2022-01-20 09:36:17.226735
info          | {"name": "cquest", "image": "https://www.openstreetmap.org/rails/active_storage/representations/redirect/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBOEJLQkE9PSIsImV4cCI6bnVsbCwicHVyIjoiYmxvYl9pZCJ9fQ==--b429b64e9fb72817d1459d7303d4543aa51db905/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaDdCem9MWm05eWJXRjBTU0lJU2xCSEJqb0dSVlE2RkhKbGMybDZaVjkwYjE5c2FXMXBkRnNIYVdscGFRPT0iLCJleHAiOm51bGwsInB1ciI6InZhcmlhdGlvbiJ9fQ==--61dff5aaf2a9b9c1cf991c0b0a0ca7f92e6a9766/cquest-hackathon-anfr-portrait.JPG", "nickname": "cquest"}
credentials   | {"token": "dslksdjslkdj", "expires": false}
extra         | {}
created_at    | 2021-10-20 12:32:00.601371
updated_at    | 2022-01-20 09:36:17.227308

The OAuth2 plugin can update the email address when someone logs in; this is one of the plugin's options.

If I'm not wrong in all the above, access to email addresses in the main OSM db does not seem necessary: the one from fluxBB should be enough to set up the account, and one additional record in this table should recreate the link with OSM OAuth.
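Something like the following could build that additional record for each imported user. It is only a sketch based on the columns shown in the dump above: storing provider_uid as text, leaving credentials empty and filling info with just the username are assumptions that would need checking against a real login.

```python
import json
from datetime import datetime, timezone


def associated_account_row(user_id, osm_id, username, now=None):
    """Build one user_associated_accounts row linking a Discourse user to OSM.

    user_id  : id of the Discourse user created by the import
    osm_id   : value of the osm_id column in the fluxBB user table
    username : forum username, used only to fill the info field (assumption)
    """
    now = now or datetime.now(timezone.utc)
    return {
        "provider_name": "oauth2_basic",    # as seen in the dump above
        "provider_uid": str(osm_id),        # assumed to be stored as text
        "user_id": user_id,
        "info": json.dumps({"name": username, "nickname": username}),
        "credentials": json.dumps({}),      # assumed to be refreshed on first login
        "extra": json.dumps({}),
        "created_at": now,
        "updated_at": now,
    }
```

For example, `associated_account_row(17, 158826, "cquest")` gives a row with the same user_id and provider_uid as the record shown above, ready to be inserted with whatever PG client the import uses.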

cquest avatar Mar 25 '22 11:03 cquest

A first test on a subset of posts (the 100 most recent topics) is visible on https://community.cquest.org/. It is a DEV install of Discourse; I'll switch to a Docker-based one to be closer to the actual setup. No problems creating users, categories, topics, or posts.

Users:

  • a few of them have no osm_id, none logged on the forum since 2016
  • a lot of emails missing, I hope the OAuth2 email sync will work as expected

Categories:

  • they are all created at the first level, I've reordered a few as subcategories manually to see what it could look like

cquest avatar Mar 31 '22 22:03 cquest

https://api.openstreetmap.org/api/0.6/user/details.json does not seem to return the user email even if logged-in.

Is it possible to add it ? This will allow the email sync/update on the first login.

cquest avatar Mar 31 '22 22:03 cquest

There's a special oauth scope that the application needs to have to get that and only administrators can create applications with that scope - it was added specially to support the discourse instance.

If it manages to update the username then I'm pretty sure it will manage to update the email as well if it's available.

tomhughes avatar Mar 31 '22 23:03 tomhughes

Looks like read_email scope, right ?

cquest avatar Apr 01 '22 09:04 cquest

Correct, but we already know that updating the name and email works because we tested it.

The key question is when a user with forum posts logs in for the first time, does it manage to find and use the account the conversion created or does it create a duplicate.

tomhughes avatar Apr 01 '22 10:04 tomhughes

@cquest I will have some time available soon. Can I pick some of this up from you, or is there anything you'd like help with? Would it be possible to share any migration code you have?

Firefishy avatar Jul 14 '22 15:07 Firefishy

I am behind in the migration. Recent illness has put me behind schedule.

Firefishy avatar Sep 30 '22 15:09 Firefishy

Would it be possible to give more time with the closing of the forum and extend the time by a month or two? You announced it just a month earlier and time is going slow on OSM. I haven't had time to get acquainted with the "community" and test or make any comments.

maro-21 avatar Sep 30 '22 16:09 maro-21

Don't worry - it's probably at least a month away still if I had to guess.

tomhughes avatar Sep 30 '22 16:09 tomhughes

Could you please update the banner on the forum with the new schedule? I worried a bit as the deadline passed and not many members have moved over from the forum to the new community site.

stephankn avatar Oct 02 '22 14:10 stephankn

> Could you please update the banner on the forum with the new schedule? I worried a bit as the deadline passed and not many members have moved over from the forum to the new community site.

I have now updated the banner.

Firefishy avatar Oct 05 '22 14:10 Firefishy

As of today, we have disabled thread creation on the old forum.

https://community.openstreetmap.org/t/forum-osm-org-transition-announcement/2361/53

Firefishy avatar Oct 06 '22 17:10 Firefishy

The migration has been postponed for good reasons, but when will the actual migration take place?

Commodoortje avatar Oct 19 '22 09:10 Commodoortje

I don't think migration is good - but it's too late anyway for that :)

So, the forum will stay open one more month with disabled thread-creation and enabled replies. Can't we update the banner with this information? Maybe with a "Replies will stay opened until 30.11.2022, then the forum will be read-only."

It's kinda weird to have to say "Well, the old forum is actually closed, but you can still write there."

natrius avatar Oct 28 '22 09:10 natrius