Add Heroku support

Open nijel opened this issue 10 years ago • 18 comments

It should not be hard to implement automated deployment to Heroku; somebody has even started on that: https://github.com/blockgiven/weblate/commit/e155371a932a09c4ffca91d798a5d232866b7b75

nijel avatar Apr 27 '15 13:04 nijel

I've just tried running it on Heroku, but found it impossible to do so because Weblate expects a persistent filesystem (data/vcs/{project}, for instance), and Heroku doesn't provide that. It runs, but every time you deploy a new version the data folder is erased.

ravishi avatar Oct 07 '15 15:10 ravishi

Isn't there an option for persistent file storage? Indeed, Weblate will not work without it (the data/ directory is used for storing repositories, the fulltext search index, and some other stuff).

nijel avatar Oct 07 '15 18:10 nijel

Yeah, have just figured that out. I'm sorry for the noise.

ravishi avatar Oct 07 '15 19:10 ravishi

Can you please share something from your testing? Either as documentation or as code?

nijel avatar Oct 08 '15 07:10 nijel

Of course! I'm currently running it on Dokku[1], which is pretty much like Heroku (it uses Heroku buildpacks, at least), so some of the findings can be useful.

There is a branch at labhackercd/weblate/tree/dokku.

The installation is pretty much like running any other Django project on Heroku, with only three issues so far:

  • First, I had to mount a volume at /app/data. It's probably unnecessary to mount the whole folder, but I didn't have the time to identify exactly what needs to be persistent and what doesn't.
  • I had issues with the whole boolean sum on postgres thing. I've reported it at #869. (Which seems to have been fixed while I was writing this, lol!)
  • Also, you must define DJANGO_SETTINGS_MODULE before even pushing to Heroku. There are some issues that arise when you import some stuff from Django from inside your module's __init__.py. I don't know exactly where the problem is, so we usually either avoid putting code in __init__.py or set the DJANGO_SETTINGS_MODULE environment var.
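
A minimal sketch of the second workaround (setting the variable from code rather than from the platform config). The dotted path weblate.settings is an assumption and the helper name ensure_settings_module is hypothetical; adjust both to your checkout:

```python
import os


def ensure_settings_module(default="weblate.settings", env=os.environ):
    """Set DJANGO_SETTINGS_MODULE if it is not already defined and return its value.

    The default dotted path is an assumption; use your own settings module.
    """
    return env.setdefault("DJANGO_SETTINGS_MODULE", default)


# Call this at the very top of the entry point (e.g. wsgi.py),
# before anything imports Django models or settings.
ensure_settings_module()
```

A value already present in the environment wins over the default, so setting the variable on the platform side still takes precedence.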

I'll give our setup a little more time to detect any other issues, since I'm not that used to Heroku yet. If anything pops up I'll report here.

[1] http://progrium.viewdocs.io/dokku/

ravishi avatar Oct 08 '15 12:10 ravishi

We have tried to run Weblate on Heroku too, but hit the same problem (it needs a persistent filesystem). We thought about it and came up with some ideas, but after reading the previous messages, in which nijel says that the data folder is used for fulltext search too, only one of them (maybe) makes some sense.

  • Change the way storage is managed in Weblate to make the "data" folder independent of the local filesystem, so that we can store it on another machine (for example on Amazon AWS) and access the data as if it were a local folder.

In fact, this idea comes from a workmate who knows Django better than I do, so I'll investigate it and give more information if anyone thinks it makes sense. As I said, I'm not a Django expert, and I don't have enough time to become one and then understand how Weblate works internally (at least not in a short period of time).

I hope this helps. We are actually using oneskyapp to manage our translations, but for me Weblate is better and I want to use Weblate again (it just needs to work on Heroku...)

(One of the older ideas was to try to free Weblate from the internal repositories, but that no longer makes sense now that I know the "data" folder has more functions.)

acamara7es avatar Oct 20 '16 16:10 acamara7es

I'm missing a bit what needs to be changed. The major thing data is used for is storing VCS repositories, which is pretty much an essential feature on which Weblate is built (there is currently no other way to get translations out of it), so this really has to be a filesystem. There is fulltext data as well, but that is not that important (and we need to look for another implementation anyway, see #800).

nijel avatar Oct 20 '16 17:10 nijel

Having dependencies on the local file system causes several difficulties:

  • Making backups: database backups are easy enough to some extent, but file storage backups are not that straightforward.
  • Deployment with Docker makes backups much more painful and confusing.
  • Read/write permission errors when deploying (right now I have a permission error for data_dir under docker-compose).

Since data_dir is required for VCS files, there are two possible solutions:

  1. Use a FUSE mount to attach external/cloud storage such as S3 or Ceph.
  2. Compress the VCS directory into a single file and upload it to S3 via a Django FileField backed by django-storages.

How option 1 would work:

You need to build S3 FUSE support with a custom buildpack and connect the bucket to your app as DATA_DIR. Heroku allows pushing Docker images as well, so if you don't want to do buildpacks, you can build a Docker image with S3 FUSE support and deploy that to Heroku.

How option 2 would work:

Every time the VCS changes, compress it and save it to S3. If there's a high load, this can be done each hour or each day instead.

Whenever the app starts, if data_dir is empty, the compressed VCS files get downloaded and extracted into data_dir.
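
A rough sketch of the compress and restore-if-empty steps, using only the standard library. The S3 upload/download is left out on purpose; the function names archive_data_dir and restore_if_empty are hypothetical, and the archive is assumed to come from a trusted source:

```python
import tarfile
from pathlib import Path


def archive_data_dir(data_dir: Path, archive_path: Path) -> Path:
    """Compress data_dir into a single gzip tarball at archive_path.

    archive_path is assumed to live outside data_dir.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname="." stores member paths relative to data_dir.
        tar.add(data_dir, arcname=".")
    return archive_path


def restore_if_empty(data_dir: Path, archive_path: Path) -> bool:
    """Extract the archive into data_dir only when data_dir has no entries.

    Returns True if a restore happened, False if data_dir already had content.
    """
    data_dir.mkdir(parents=True, exist_ok=True)
    if any(data_dir.iterdir()):
        return False
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(data_dir)
    return True
```

In this sketch the tarball would be pushed to and pulled from the bucket by whatever storage layer is in use (django-storages in the option above).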

However, such a solution will not be that reliable, because a real file system still needs to be present, and always uploading and downloading the VCS from external storage would be painful and hacky.

Keep in mind that Heroku apps also have a size limit, and having such stuff on Heroku can exceed the storage space as well.

Overall, not a good idea.

Alir3z4 avatar Mar 30 '19 18:03 Alir3z4

@Alir3z4 For https://devcenter.heroku.com/changelog-items/1145 or otherwise? Is it a better idea to use https://www.heroku.com/postgres in a different deployment or not? Not every use-case is the heaviest one.

comradekingu avatar Apr 13 '20 19:04 comradekingu

@comradekingu Slug size is not the main issue; the issue is that on each deployment the Heroku dyno/instance is completely wiped. That's why FUSE to S3 would be the better option here: the data is no longer on the dyno file system but kept on S3, and regardless of how many deployments, restarts, or crashes happen, the files stay untouched.

Alir3z4 avatar Apr 13 '20 19:04 Alir3z4

fwiw I've had some success getting an alternative open source "continuous localization" project called https://mojito.global working on Heroku: https://github.com/patcon/polis-translations

It's developed by a coder on staff at Pinterest (formerly Box) and stores all data in MySQL (though other datastores should be possible)

patcon avatar Jun 19 '20 19:06 patcon

I would really like to see Weblate become available with Heroku. If it would be possible to move the persistent storage to postgres I think it would be a great solution.

tvb avatar Oct 14 '20 05:10 tvb

I don't think PostgreSQL is the solution for all our needs; fonts and images are not really the kind of content you would like to store in PostgreSQL. The same probably applies to Git repositories.

nijel avatar Oct 14 '20 08:10 nijel

Well, fonts and images could be easily hosted on a CDN/Object Store right? Git is something else, hmm.

tvb avatar Oct 14 '20 08:10 tvb

Yes, I'm just saying that PostgreSQL will not address everything and this is not a simple task.

nijel avatar Oct 14 '20 08:10 nijel

S3 should work as a repository filesystem. All other persistent files could be managed by S3, but you should rewrite everything related to file access (to be pluggable) with s3boto storages or similar.

mpachas avatar Jan 18 '23 15:01 mpachas

An S3 remote might address sharing the Git repository data; thanks for sharing the link.

you should rewrite everything related to file access (to be pluggable) with s3boto storages or similar

The most problematic part will probably be the pango/cairo/fontconfig stack used for rendering images with text. I really don't see an alternative here, so the solution might be to sync this from S3 before calling these.
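
To illustrate the "sync before calling" idea, here is a small sketch under stated assumptions: ensure_local is a hypothetical helper, and the download callable stands in for whatever S3 client would actually be used (nothing here is existing Weblate code):

```python
from pathlib import Path
from typing import Callable


def ensure_local(key: str, cache_dir: Path, download: Callable[[str, Path], None]) -> Path:
    """Return a local path for key, calling download only if the file is missing.

    download(key, target) is assumed to fetch the object named key
    from remote storage and write it to target.
    """
    target = cache_dir / key
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        download(key, target)
    return target
```

The rendering code would then be handed the returned local path, so the libraries only ever see ordinary files. This sketch does not handle invalidation when the remote copy changes.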

nijel avatar Jan 18 '23 16:01 nijel

Let's keep this issue focused on the actual Heroku integration – there is another issue for the storage: https://github.com/WeblateOrg/weblate/issues/2984. I've just posted an up-to-date summary there.

nijel avatar Jan 19 '23 13:01 nijel