
Dependency on Google

Open Amustache opened this issue 2 years ago • 4 comments

Would it be possible to have an alternative to the Google Cloud Platform™?

Thank you for your time!

Amustache avatar Jul 01 '22 14:07 Amustache

+1 to this. I would love to self-host without having to rely on Google to use this.

kimimaru4000 avatar Aug 06 '22 20:08 kimimaru4000

Currently the backend is closely integrated with Google datastore, but I will look into abstracting that logic so it can be configured for any database.

I'll also look into providing some more advanced documentation for self-hosting Crab Fit after I make those changes. Thanks for the request.

GRA0007 avatar Aug 19 '22 08:08 GRA0007

++ would also love a version without google cloud

desinox avatar Aug 19 '22 17:08 desinox

Would it be possible to have an alternative to the Google Cloud Platform™?

+1

I'll also look into providing some more advanced documentation for self-hosting

A Dockerfile (and ideally prebuilding images in CI) and docker-compose would be really handy!

mayel avatar Aug 20 '22 23:08 mayel

@GRA0007 Do you need any assistance with this? I would really love to self-host Crab Fit, and if there is a way I can help to make it possible, I'll be happy to provide assistance.

I've never worked with Crab Fit before though, so I am not sure if I'd be more of a help or a hindrance. 😅

TuringTux avatar Oct 12 '22 08:10 TuringTux

There are about 10 places in the code where this would need to happen. I don't see any particularly sophisticated queries, so pretty much any drop-in KVS or ORM should work. I'd recommend Sequelize for this project.

loleg avatar Oct 14 '22 18:10 loleg

We'd then probably also need to change some configuration for deployment on arbitrary URLs, e.g. the CORS headers: https://github.com/GRA0007/crab.fit/blob/main/crabfit-backend/index.js#L25

Sequelize does look incredibly cool.

Do I see this right that with Google datastore, the data types (e.g. something like Event) are never explicitly defined? So that would probably be something one had to model explicitly.

I just went ahead and forked this repo, maybe this will give me enough incentive to start developing.

TuringTux avatar Oct 18 '22 20:10 TuringTux

Updated (2022-10-19T17:56Z) with the latest state of development:

  • I believe the application has exactly three data types: Event, Person and Stats.
  • I've started creating an additional abstraction layer, so you can switch between datastore and Sequelize (I strongly assume the original developer will want that feature).
  • The base classes of the abstraction layer are contained in a file model/base.js. They might come in handy to resolve issue #230 in the future.
  • model/drivers/ contains subclasses of those base classes that implement the specific logic to work with datastore, Sequelize (or any other "driver", as I'll call them for now).
  • I've also extracted all datastore calls into one single file: model/methods.js. The idea is to now move them step-by-step to the drivers/datastore.js file and, while doing so, define the API the abstraction layer needs to have (likely all CRUD options for all three types plus some queries).

If anyone wants to join the efforts, please feel free to message me, we can work on the fork together :)

TuringTux avatar Oct 19 '22 12:10 TuringTux

Or maybe scrap all the above: @GRA0007 just mentioned Keyv and it seems to be pretty much exactly what I was just about to (poorly) implement manually.

Update: Well, the commit history is already a massive mess (I suppose I will squash them all), but here is my latest state of migrating to keyv: https://github.com/TuringTux/crab.fit/tree/use-keyv

Second update: keyv is nice (I already managed to use it to store an Event in an SQLite database), but it does not have built-in support for querying (other than "get object by id"). This is probably not that surprising (it says it is a key-value store and it is exactly that), but there are a few places in the application where we need to e.g. find all people participating in a certain event. We could do this only by iterating over all people, and that will probably not scale.
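To illustrate the scaling concern, here is a minimal sketch. A plain Map stands in for a key-value store that only supports lookup by key; this is not the keyv API, and the key layout and the field names `name` and `eventId` are assumptions made up for the example.

```javascript
// A plain Map stands in for a store that can only "get object by id".
const store = new Map()

store.set('person:1', { name: 'Alice', eventId: 'party-123' })
store.set('person:2', { name: 'Bob', eventId: 'meeting-456' })
store.set('person:3', { name: 'Carol', eventId: 'party-123' })

// Without query support, every stored person has to be visited: O(n) per lookup.
function findPeopleByEventScan(eventId) {
  const people = []
  for (const [key, value] of store) {
    if (key.startsWith('person:') && value.eventId === eventId) {
      people.push(value)
    }
  }
  return people
}

// One workaround: maintain a secondary index (event id -> person keys) by hand.
const peopleByEvent = new Map()
for (const [key, value] of store) {
  if (!key.startsWith('person:')) continue
  if (!peopleByEvent.has(value.eventId)) peopleByEvent.set(value.eventId, [])
  peopleByEvent.get(value.eventId).push(key)
}

function findPeopleByEventIndexed(eventId) {
  const keys = peopleByEvent.get(eventId) ?? []
  return keys.map((key) => store.get(key))
}
```

The hand-made index avoids the full scan, but it has to be kept in sync on every write and delete, which is the kind of bookkeeping a database with query support would otherwise handle.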

So, maybe we do need the additional abstraction layer after all (to switch between datastore, which supports queries, and an ORM, which also supports queries). Alternatively, we could find some library that essentially emulates a NoSQL database on top of an arbitrary "thing that stores data" (e.g. datastore or an RDBMS), though I would be genuinely surprised if that existed.

TuringTux avatar Oct 19 '22 18:10 TuringTux

@TuringTux something like https://www.npmjs.com/package/@nano-sql/core ? (this is not a recommendation, I haven't used it before)

mayel avatar Oct 19 '22 20:10 mayel

@mayel Yes, something like this. Consider me genuinely surprised 😀

For now, I've nevertheless decided to create my own abstraction layer first and change all occurrences in the code to use it, which would then look e.g. like this.

Basically, we'd then have Event.get or Event.create to get existing or create new events, respectively, and event.save (called on an existing event instance) to update that event in the database. As soon as I've fleshed out that API, I will probably document it, clean up the commits, etc.
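As a rough sketch of how such an API could behave, assuming an in-memory Map in place of a real database. Only the names Event.get, Event.create and event.save come from the description above; the argument lists, the `data` property and the null return for a missing event are assumptions, not the fork's actual code.

```javascript
// In-memory stand-in for a storage driver; a real driver would talk to a database.
const rows = new Map()

class Event {
  constructor(id, data) {
    this.id = id
    this.data = data
  }

  // Create and persist a new event; rejects if the id is already taken.
  static async create(id, data) {
    if (rows.has(id)) throw new Error(`Event ${id} already exists`)
    rows.set(id, { ...data })
    return new Event(id, { ...data })
  }

  // Load an existing event, or resolve to null if there is none.
  static async get(id) {
    const data = rows.get(id)
    return data ? new Event(id, { ...data }) : null
  }

  // Write the current state of this instance back to storage.
  async save() {
    rows.set(this.id, { ...this.data })
  }
}

// Usage: create an event, change it, save it, and read it back.
async function demo() {
  const event = await Event.create('party-123', { name: 'Party' })
  event.data.name = 'Birthday party'
  await event.save()
  const reloaded = await Event.get('party-123')
  return reloaded.data.name
}
```

Keeping the methods async even for the in-memory case means that callers would not need to change when a driver backed by a real database is swapped in.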

Then, I'll start choosing one or more fitting drivers (so maybe nano-sql, or Sequelize, or firestore, or keyv, or ...) and start working on the concrete implementations.

Update: I've started cleaning up my commits to sort my thoughts:

I think I could now work by rebasing the abstraction layer branch on the one where all datastore calls are moved to model/methods.js, and then start replacing all calls there with calls to the abstraction layer and at the same time migrating the business logic.

I'm also slowly getting the feeling that writing some unit tests might be helpful...

TuringTux avatar Oct 20 '22 08:10 TuringTux

Do any of you know how to get the full key out of Datastore? My code basically looks like this:

const entityData = { 
  name: name.trim(), 
  password
  // ...
}

const entity = {
  key: datastore.key("Person"),
  data: entityData
}
await datastore.insert(entity)

// Update (2022-10-30): The following should work, according to https://cloud.google.com/datastore/docs/datastore-api-tutorial?hl=en:
const id = entity.key.id

I create an entity and insert it with an incomplete key, to let Datastore generate an id for me. How do I find out which id has been generated? Does insert return anything useful?

(I would test it, but I currently don't have access to a datastore instance)

Update (2022-10-30): Buried deep down in a tutorial, I found an example where the id property of the generated key is accessed after insert is called. So I suppose I'll go with this for now.

TuringTux avatar Oct 25 '22 19:10 TuringTux

Status report: A lot of functionality has been moved to drivers/datastore.js already. What still needs to be done:

  • There still is a findEvent method that finds an event with a given id using a query. I wonder if that could be replaced by using Event.get (which relies on datastore.get, instead of a query). However, maybe that is not possible (findEvent is currently only used when creating events, to check if an event with a certain id exists).
  • The abstractions should be tested by someone who has access to a Google Cloud Datastore. Ideally, everything still works exactly as before.

TuringTux avatar Oct 30 '22 16:10 TuringTux

Updated status report: While the above still holds true, it should now be possible to implement your own storage driver!

To do so, you have to create a new file in model/drivers implementing the BaseEvent, BasePerson and BaseStat classes from model/base.js.

Then, you can change which driver to use in model/index.js.
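A minimal sketch of what those two steps might look like, collapsed into one file for brevity. The class name BaseEvent comes from the description above, but its methods, the `memory` driver and the `loadDriver` helper are hypothetical; the real signatures in model/base.js may differ. BasePerson and BaseStat would follow the same pattern.

```javascript
// Stand-in for model/base.js; the real method names and signatures may differ.
class BaseEvent {
  static async get(id) {
    throw new Error('not implemented')
  }
  static async create(id, data) {
    throw new Error('not implemented')
  }
}

// Hypothetical model/drivers/memory.js: keeps events in a Map.
const events = new Map()

class MemoryEvent extends BaseEvent {
  static async get(id) {
    return events.get(id) ?? null
  }
  static async create(id, data) {
    events.set(id, data)
    return data
  }
}

// Hypothetical model/index.js: the one place where the driver is chosen.
const drivers = {
  memory: { Event: MemoryEvent },
}

function loadDriver(name) {
  const driver = drivers[name]
  if (!driver) throw new Error(`Unknown storage driver: ${name}`)
  return driver
}
```

The rest of the application would then only use whatever `loadDriver` returns, so switching storage backends would touch a single line.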

TuringTux avatar Oct 30 '22 19:10 TuringTux

Thanks everyone who's been following along with this, and thanks especially to @TuringTux who put in the effort to try and make it a reality ⭐

I've been working on rewriting the API in Rust (#257) and the frontend in Next.js (#259), which will bring the option to choose a custom database when self-deploying Crab Fit. Full documentation on spinning up your own instance is coming soon!

As for addressing this issue: I would like to move off Google Datastore, but finding an alternative database that is fast and costs a similar amount is tricky. Because of this, the production site at https://crab.fit will likely continue to use Google Datastore until I can find a suitable replacement.

GRA0007 avatar Jun 09 '23 14:06 GRA0007

Update: I've released an initial guide on self-hosting here: https://github.com/GRA0007/crab.fit/wiki/Self%E2%80%90hosting

Please let me know if it's missing any info!

GRA0007 avatar Jun 11 '23 07:06 GRA0007

The way you describe licensing at the top applies to AGPL rather than GPL, so you may want to consider relicensing as AGPL?

mayel avatar Jun 11 '23 08:06 mayel

@mayel thanks for pointing that out! I didn't realise the difference, but I'm honestly not too worried about that case, so I've edited the wiki page :)

GRA0007 avatar Jun 11 '23 15:06 GRA0007