Mlem HTTP Server

Open Sjmarf opened this issue 1 year ago • 7 comments

We need this for a number of features (list here).

Doing this will involve storing a user's auth token on the server. We should probably have it be a separate auth token from the Mlem app. Do we need to update the privacy policy if we want to do this?

I haven't looked that deeply into which languages or frameworks we could use for this. Using Swift might be a good idea, because we're already using it. Some brief research suggests https://vapor.codes could be good? They've got built-in Apple Push Notification support.

Memmy's server is open source and written in TypeScript.

Sep 08 '24 20:09 Sjmarf

Please ping me when you get around to the cost analysis, as I'd like to offer my support as a donor.

Sep 14 '24 17:09 d42ohpaz

We've got a donation page here and are grateful for any contributions :) We haven't spent any of it yet; in future we'll be spending it on the server described here. We haven't looked into how much a server would actually cost just yet - there's a good chance we'll end up buying a cheap option and then upsizing as needed.

Sep 14 '24 19:09 Sjmarf

@Sjmarf Depending on the costs, I was thinking I might prepay you all one year of hosting.

Sep 14 '24 19:09 d42ohpaz

The current plan is to begin implementation once 2.0 leaves beta. We'll perform a cost analysis once we have that work and the associated server requirements properly scoped and a server solution identified. @d42ohpaz I'll update this issue when that's all done. Thank you for your generosity!

Sep 14 '24 22:09 EricBAndrews

Some thoughts on how the basic implementation could work.

IMO we should have a concept of an "account" on the server. In the server's database, each "Mlem account" would be tied to one or more Lemmy accounts.

The user wouldn't need to explicitly create an Mlem Account by setting a username and password - we could abstract that away to keep things as simple as possible. Instead of having credentials for the Mlem Account itself, the user would be able to "log in" to their Mlem Account using the credentials of any of the Lemmy accounts tied to their Mlem Account. Once logged into the Mlem Account, they would be able to add/remove Lemmy accounts from the Mlem Account.

We would need a way for the server to verify that you have logged in to a Lemmy account successfully. We could do this by sending the session token to the server, and having the server check that it's valid by sending an api/v3/site request to the Lemmy instance and checking that the response matches what it expects.
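A rough sketch of that check, in Python purely for illustration (the server language is still undecided). It assumes the instance accepts the token as a Bearer header and that an authenticated api/v3/site response includes a my_user object with the person's name; both are assumptions about the Lemmy API rather than something settled in this thread, and the function names are made up.

```python
import json
import urllib.request


def build_site_request(instance_url: str, jwt: str) -> urllib.request.Request:
    """Build a GET request for the instance's api/v3/site endpoint."""
    url = instance_url.rstrip("/") + "/api/v3/site"
    # Assumption: the instance reads the session token from a Bearer header.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {jwt}"})


def token_matches_account(site_response: dict, expected_username: str) -> bool:
    """Return True if the response describes a logged-in user with the expected name."""
    # Assumption: an authenticated response carries a "my_user" object;
    # an unauthenticated or invalid-token response lacks it.
    my_user = site_response.get("my_user")
    if not my_user:
        return False
    person = my_user.get("local_user_view", {}).get("person", {})
    return person.get("name") == expected_username


def verify_token(instance_url: str, jwt: str, expected_username: str) -> bool:
    """Ask the Lemmy instance whether the token belongs to the expected account."""
    request = build_site_request(instance_url, jwt)
    with urllib.request.urlopen(request, timeout=10) as response:
        return token_matches_account(json.load(response), expected_username)
```

Keeping the response check separate from the network call means the matching logic can be tested without contacting a real instance.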

I think that tying all of the user's Lemmy accounts together in this way is better than keeping them separate in the long run. We wouldn't need to do this for basic features such as #72, but for more complex features (such as shared blocklists) this would be useful.

One tricky implementation detail is how to create an Mlem account without sending a token to the server. We need to offer this option because some features such as #1547 don't require that the server has the account token. We shouldn't force the user to store the token on the server if they don't want to use features that require doing so, because that would be needlessly unsafe. I submitted an issue to Lemmy asking for a "read-only token" feature, which would largely resolve this. If that doesn't get implemented, some alternatives would be:

  • Having the user prove that they own an account by sending a private message to a certain account in Lemmy (I'm not sure whether this method is 100% trustworthy)
  • Allowing the user to create a username/password for the Mlem Account itself.

Jan 24 '25 10:01 Sjmarf

Some thoughts on this front:

Timing

We should start building this when Lemmy 1.0 releases. Push notifications should be supported in Lemmy 1.1, so if we build some simpler functionality first, we can work out the major infrastructure kinks ahead of time and hit the ground running with push notifications.

User Identification/Account Grouping

> the user would be able to "log in" to their Mlem Account using the credentials of any of the Lemmy accounts tied to their Mlem Account

I disagree. This rests on the assumption that one Lemmy account will only be used by one Mlem user, which should hold for the majority of cases but is not universal enough to build an architecture around; shared social media accounts are unusual but not nonexistent.

The dependencies of the scoped feature set are somewhat complicated:

| Feature | Issues | Device Token | Lemmy Account | Mlem Account |
|---|---|---|---|---|
| Message notifications | #72 | | | |
| Subscribe to user-relevant mod activity | #1548 | | | |
| Subscribe to post/comment/community | #1547, #1551 | | | |
| Trusted hosts | #1302 | | | |
| Info sync | #1650 | | | |
| Mlem Developer flairs | #1549 | | | |
| Search categories | #1196 | | | |

The no/no/no rows all indicate general data that the server provides at the client's request. These are trivial to handle.

For the other cases, the server needs to be able to do the following:

  • Store Lemmy Accounts
  • Store Device Tokens
  • Store Mlem Accounts
  • Associate Device Tokens with Lemmy Accounts to push account-relevant information to that device. This is a many-to-many mapping, since many accounts may exist on one device and an account may be signed in on many devices.
  • Associate Device Tokens with subscriptions to determine which information to push to the device
  • Associate Mlem Accounts with stored information
  • (optional, but valuable) Associate Mlem Accounts with Device Tokens and Lemmy Accounts. This would let us offer features such as quick setup on new devices, so a user can simply sign into their Mlem Account and the server would provide the list of Lemmy Accounts so the user doesn't need to add each one individually.

I would propose a database schema along the following lines:

DEVICE_TOKENS

  • TOKEN: PK; the device token

LEMMY_ACCOUNTS

  • ACCOUNT_ID: PK; actor id of the account
  • JWT: JWT for that account (encrypted, salted, etc.)

MLEM_ACCOUNTS

  • ID: PK; user-provided unique name
  • EMAIL: optional, needed for password reset etc.
  • PASSWORD_HASH: securely one-way hashed value of user password, used to back login
  • JWT: auth token
  • DISPLAY_NAME: optional, user-provided display name
  • etc. for account information

POST_SUBSCRIPTIONS

  • DEVICE_TOKEN -> FK into DEVICE_TOKENS
  • POST_ID: actor id of the subscribed post
  • LAST_STATE: stores, in some form, the last state of the post. Enables checking whether the post has changed.

COMMENT_SUBSCRIPTIONS

  • DEVICE_TOKEN -> FK into DEVICE_TOKENS
  • COMMENT_ID: actor id of the subscribed comment
  • LAST_STATE

INBOX_SUBSCRIPTIONS (this could be split further into message/reply/mod mail specific tables. Irrelevant once Lemmy implements push notifications; at that time, this can be replaced by relay server functionality.)

  • DEVICE_TOKEN -> FK into DEVICE_TOKENS
  • ACCOUNT_ID -> FK into LEMMY_ACCOUNTS
  • LAST_STATE

MODERATION_SUBSCRIPTIONS

  • DEVICE_TOKEN -> FK into DEVICE_TOKENS
  • ACCOUNT_ID -> FK into LEMMY_ACCOUNTS
  • INSTANCE: instance to query for moderation info
  • LAST_STATE

OWNED_ACCOUNTS

  • MLEM_ACCOUNT -> FK into MLEM_ACCOUNTS
  • LEMMY_ACCOUNT -> FK into LEMMY_ACCOUNTS

OWNED_DEVICES

  • MLEM_ACCOUNT -> FK into MLEM_ACCOUNTS
  • DEVICE_TOKEN -> FK into DEVICE_TOKENS
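To make the relationships concrete, here is a minimal sketch of part of this schema using SQLite from Python. It is only an illustration: the column types, the composite primary keys on the join tables, and the helper names are assumptions, not decisions made in this thread, and the real database may differ.

```python
import sqlite3

# A subset of the proposed tables; types and composite keys are assumptions.
SCHEMA = """
CREATE TABLE device_tokens (token TEXT PRIMARY KEY);
CREATE TABLE lemmy_accounts (account_id TEXT PRIMARY KEY, jwt TEXT);
CREATE TABLE mlem_accounts (
    id TEXT PRIMARY KEY,
    email TEXT,
    password_hash TEXT NOT NULL,
    display_name TEXT
);
CREATE TABLE inbox_subscriptions (
    device_token TEXT NOT NULL REFERENCES device_tokens(token),
    account_id TEXT NOT NULL REFERENCES lemmy_accounts(account_id),
    last_state TEXT,
    PRIMARY KEY (device_token, account_id)
);
CREATE TABLE owned_accounts (
    mlem_account TEXT NOT NULL REFERENCES mlem_accounts(id),
    lemmy_account TEXT NOT NULL REFERENCES lemmy_accounts(account_id),
    PRIMARY KEY (mlem_account, lemmy_account)
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def devices_for_account(conn: sqlite3.Connection, account_id: str) -> list:
    """Device tokens that should receive inbox pushes for a Lemmy account."""
    rows = conn.execute(
        "SELECT device_token FROM inbox_subscriptions "
        "WHERE account_id = ? ORDER BY device_token",
        (account_id,),
    )
    return [row[0] for row in rows]


def lemmy_accounts_for(conn: sqlite3.Connection, mlem_id: str) -> list:
    """Lemmy accounts tied to a Mlem Account, e.g. for quick setup on a new device."""
    rows = conn.execute(
        "SELECT lemmy_account FROM owned_accounts "
        "WHERE mlem_account = ? ORDER BY lemmy_account",
        (mlem_id,),
    )
    return [row[0] for row in rows]
```

The join tables are what carry the many-to-many mappings: one row per (device, account) or (Mlem Account, Lemmy account) pair.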

To serve push notifications, the server runs a scheduled job:

  • For each unique subscription target in the relevant _SUBSCRIPTIONS table:
    • Fetch the state
    • If the state has changed, push a notification to all associated DEVICE_TOKENs
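One pass of that job could look roughly like the sketch below. It is Python for illustration only; the row shape (device token, target id, last state) and the fetch_state and push callables are hypothetical stand-ins for the real Lemmy query and the push notification call.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# (device_token, target_id, last_state) as read from a _SUBSCRIPTIONS table.
SubscriptionRow = Tuple[str, str, Optional[str]]


def run_poll_cycle(
    subscriptions: Iterable[SubscriptionRow],
    fetch_state: Callable[[str], str],
    push: Callable[[str, str], None],
) -> Dict[str, str]:
    """Run one pass of the job and return the new state of every target that changed."""
    current: Dict[str, str] = {}
    changed: Dict[str, str] = {}
    for device_token, target_id, last_state in subscriptions:
        # Fetch each unique target only once, however many devices follow it.
        if target_id not in current:
            current[target_id] = fetch_state(target_id)
        # Notify the device if the stored state no longer matches.
        if current[target_id] != last_state:
            push(device_token, target_id)
            changed[target_id] = current[target_id]
    return changed
```

The returned mapping is what the caller would write back to LAST_STATE so the same change isn't pushed again on the next pass.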

To support user-fetched data, the server just needs to expose a relevant endpoint. Account/data associations can be handled either with JSON CLOBs in a column of MLEM_ACCOUNTS or in their own tables that FK into MLEM_ACCOUNTS.

Apr 13 '25 01:04 EricBAndrews

> This rests on the assumption that one Lemmy account will only be used by one Mlem user, which should hold for the majority of cases but is not universal enough to build an architecture around; shared social media accounts are unusual but not nonexistent.

Ah, I didn't think of that. That database schema looks good to me 👍

Apr 13 '25 08:04 Sjmarf

We have our server set up. The recurring costs are:

  • Server hosting (Hetzner): $14.49/mo
  • DNS (Cloudflare): $10.11/yr

Our DB hosting is Supabase, which has a robust free plan that should cover our needs.
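For anyone budgeting a year at a time, the figures above can be combined as in this small Python sketch. It assumes the prices stay fixed and the database stays on the free plan; the function name is just for illustration.

```python
from decimal import Decimal

HOSTING_PER_MONTH = Decimal("14.49")  # Hetzner, from the figures above
DNS_PER_YEAR = Decimal("10.11")       # Cloudflare, from the figures above


def yearly_cost(hosting_per_month: Decimal, dns_per_year: Decimal) -> Decimal:
    """Total recurring cost for one year: twelve months of hosting plus DNS."""
    return hosting_per_month * 12 + dns_per_year


print(yearly_cost(HOSTING_PER_MONTH, DNS_PER_YEAR))  # 183.99
```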

#2068 implements some of the easy backend wins; once it goes live we should have a good baseline for the server's production performance, though I'm confident that it's got enough horsepower to keep us going for a while.

@d42ohpaz

Jun 07 '25 19:06 EricBAndrews

@EricBAndrews Thank you for the update. :)

Jun 07 '25 22:06 d42ohpaz