WIP: javascript/typescript node convict configuration integration
- Currently, configuration variables live in many different files.
- This pull request moves the default configuration variables and their documentation into a single `schema.yaml` file.
- These values can be overridden by editing `development.yaml`, `production.yaml`, or `test.yaml` as appropriate.
- We can discuss the pros and cons of whether and how the configuration settings should be coupled to the `NODE_ENV` variable.
node-convict allows configuration variables to be set (1) in a schema, (2) in a file, (3) from an environment variable, or (4) from a command-line argument; (4) has the highest precedence and (1) the lowest.
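For concreteness, here is a minimal sketch of how that precedence can be expressed with convict. This is not the actual polis schema; the option names, file layout, and the use of js-yaml are illustrative assumptions.

```js
// Sketch only: illustrative schema, not the polis one.
const convict = require('convict');
const yaml = require('js-yaml');

// convict 6 no longer bundles a YAML parser, so one has to be registered.
convict.addParser({ extension: ['yml', 'yaml'], parse: yaml.load });

const config = convict({
  env: {
    doc: 'The application environment.',
    format: ['development', 'production', 'test'],
    default: 'development',   // 1) schema default (lowest precedence)
    env: 'NODE_ENV',          // 3) environment variable
    arg: 'node-env',          // 4) command-line flag (highest precedence)
  },
  fb_app_id: {
    doc: 'Facebook app id used by the client.',
    format: String,
    default: '',
    env: 'FB_APP_ID',
  },
});

// 2) file: load the environment-specific overrides, then validate.
config.loadFile(`config/${config.get('env')}.yaml`);
config.validate({ allowed: 'strict' });

module.exports = config;
```

With a schema like this, a value passed as a command-line flag overrides the environment variable, which overrides anything in the loaded YAML file, which overrides the schema default.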
There are a lot of edits in this pull request, but they fall into several categories:
- Adding the new `convict` code and schema into the `config` directory.
- Modifications to `dockerfiles`, `.github/workflows`, and `package.json` to support `convict`:
  - Removing `cp polis.config.template.js polis.config.js` from `.github/workflows/bundlewatch.yml`
  - Adding `COPY --from=polis_config /config/config.js /app/config/` to `client-admin/Dockerfile`
  - Adding `"convict": "6.0.0"` to `client-admin/package.json`
- Importing the `convict` `config.js` code where needed:
  - For example, in `client-admin/dev-server.js`:

    ```js
    let POLIS_ROOT = process.env.POLIS_ROOT
    var config = require(POLIS_ROOT + 'config/config.js');
    ```
- Replacing `process.env` and `polisConfig` code (a short sketch of this pattern follows this list):
  - For example, `html = html.replace("<%= fbAppId %>", config.get('fb_app_id'));` in `client-admin/dev-server.js`.
- Modifications to `.github/workflows` to allow efficient testing:
  - Adding `workflow-dispatch` triggers.
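For illustration, here is the short sketch of the `process.env`/`polisConfig` replacement pattern mentioned above. The helper name is hypothetical, and the exact keys and paths vary from file to file:

```js
// Sketch only: `injectFbAppId` is a hypothetical helper name for illustration.
let POLIS_ROOT = process.env.POLIS_ROOT;
var config = require(POLIS_ROOT + 'config/config.js');

// Previously this value came from process.env / polisConfig; now it is read
// from the single convict-backed config object.
function injectFbAppId(html) {
  return html.replace("<%= fbAppId %>", config.get('fb_app_id'));
}
```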
@metasoarous and @patcon (cc @tamas-soos-toptal),
We think that this pull request is ready for review and merge. As I wrote above, the conflict in `client-report/package.json` is due to the following change:

```json
"knox": "github:caremerge/knox.git#b1e031c209b3c17cab622a458e87728c1ee88cbd",
```

to

```json
"knox": "https://github.com/caremerge/knox.git#b1e031c209b3c17cab622a458e87728c1ee88cbd",
```
We found this to be necessary to avoid an error in testing. It was the recommended solution to avoid a security/spoofing problem.
Changing only the knox URLs on top of the dev branch passes all CI tests: https://github.com/crkrenn/polis/actions?query=branch%3Aknox_https_update.
And, do you have a recommended way to update the `package-lock.json` file from the `package.json` file? When I ran `npm i --package-lock-only`, the `package-lock.json` file doubled in size.
Please let us know if you have any questions.
Thanks very much!
-Chris
Hi Chris,
I was planning to look into auto-scaling container setups soon.
I've looked into both Amazon AWS and Google Cloud. I'm leaning towards AWS because it supports serverless Postgres, which is attractive for very small and cheap deployments.
I was also thinking of using Terraform to minimize the danger of lock-in: https://registry.terraform.io/browse/providers.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-auto-scaling.html
Could we schedule a voice or video call to discuss?
Chris
@metasoarous (cc @tamas-soos-toptal)
Tamas and I are currently planning to pause this pull request and to start work on AWS deployment (which will take advantage of the convict configuration).
By providing a production-ready solution that could replace Heroku, we hope to lower the barrier for merging convict into dev.
Our long-term goals are:
- to eventually deploy polis on AWS Fargate (a serverless container service)
- to use Node.js, Docker, Terraform, and some kind of build tool (make, etc.)
- There are several infrastructure-as-code solutions. I believe that Terraform is a good choice because it supports multiple cloud platforms (AWS, GCP, Azure, etc.), it seems well documented, and it seems to have decent training resources.
Our short-term goals, to get spun up on deploying Node.js apps with Fargate, Docker, and Terraform, are to complete the following:
- a hello-world app on AWS Fargate with Docker
- a hello-world app on AWS Fargate with Docker and Terraform
- a hello-world app with more than one task and a load balancer
- a simple app with two communicating containers and load balancers
  - one container returns "hello var"
  - the other container returns a random var (Chris, Chris, Tamas, Patrick)
After that, we hope to move on to deploying polis on Fargate.
Feedback and suggestions would be very welcome!
-Chris
Hey @crkrenn. Thanks for the update.
It's great to hear that you're keen to work on scaling infrastructure. That having been said, I think it would be great if we could get this configuration work merged in first, before the PR diverges significantly from the main branch.
The main obstacle I see to that at the moment (again) is this issue of orchestrating dependencies around the separate configuration container. I just submitted a ticket to the folks at Heroku to see if they can help us out with this. It's possible that if we aren't able to use their services to actually build all the images, we can get around this by building images locally and then pushing to their container registry, but this moves us away from a single-command deploy flow and introduces complications with respect to a clean release/rollback cycle. It may be worth biting the bullet on this, though, if it gets us a saner configuration story, and we may be able to compensate via some thoughtful GitHub Actions & CI. In any case, I'll report back on what I hear from Heroku support.
As for how we should approach scaling infrastructure itself, a few thoughts come to mind:
- I certainly approve of trying to stay platform-agnostic where possible, and would approve of something like Terraform (or Pulumi). But to my knowledge, while these tools support different platforms, the code you write with them ends up having to target one platform or another; it's pretty hard in general to abstract over platform design, and I'd imagine attempts at doing so would not be without their leaks. Which ultimately is not a huge deal; if we have duplicate infrastructure specifications for different target platforms, that's fine, as it still gives people options, and there's still some advantage to having those specifications in a shared language so that pieces can be reused where appropriate. I just want to point this out for context so that we don't have false expectations (and so that you can disagree with me if you have a different sense of the situation).
- I would love to be able to support a fully serverless infrastructure, but:
  - From my experience working with serverless offerings on AWS, it is unlikely to be straightforward to reverse-engineer a system as mature and involved as ours into a serverless architecture. You generally have to design with it in mind.
  - In particular, there are constraints around the math worker which may be hard to shoehorn (performantly and consistently) into a serverless setup, even if it seems like a perfect fit at first. Specifically, you have to ensure, as votes and moderation data pile up while a conversation update is running, that you don't start new conversation updates before the first one has completed, or you could end up doing a lot more work than necessary at best, or producing inconsistent results at worst. There may be ways around this with thoughtful queuing (a rough sketch of one possible coalescing guard follows this list), but having thought about this a bit in the past, I'm not so sure there's an easy solution.
  - If we did manage to serverless-ify even just parts of the system, I'd want to make sure that we still support running on more traditional infrastructure, as this may be important for certain applications where being able to run on-prem with full data sovereignty is required. (It's also potentially a sticking point for some developers who might want to contribute.) So we'd likely still have to maintain a lot of the non-serverless code anyway. This might be worth it if it means that those of us who don't have such constraints can cut down significantly on infrastructure costs, but we shouldn't underestimate the person-hour costs of maintaining more complex code and infrastructure as a result of trying to support both paradigms.
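For concreteness, here is a rough sketch of the kind of coalescing guard I have in mind. The names are illustrative and not from the polis codebase:

```js
// Sketch only: coalesce conversation updates so at most one runs at a time
// per conversation, with at most one follow-up queued behind it.
const running = new Set(); // conversation ids with an update in flight
const dirty = new Set();   // conversation ids that received new data mid-update

async function requestUpdate(conversationId, runUpdate) {
  if (running.has(conversationId)) {
    // An update is already running; remember that more work arrived and return.
    dirty.add(conversationId);
    return;
  }
  running.add(conversationId);
  try {
    await runUpdate(conversationId); // e.g. recompute the math for this conversation
  } catch (err) {
    console.error('conversation update failed', conversationId, err);
  } finally {
    running.delete(conversationId);
    if (dirty.delete(conversationId)) {
      // New votes/moderations arrived while we were computing; run exactly once more.
      requestUpdate(conversationId, runUpdate);
    }
  }
}
```

Of course, this only works inside a single long-lived worker process; in a serverless setup the running/dirty state would have to live somewhere shared (a queue or a database row), which is exactly where it gets tricky.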
All that having been said, I'm not opposed to you investigating a serverless setup further, and I'd be happy to be proven wrong (or assuaged) on any or all of these points! But those are my concerns.
In any case, let's continue to discuss.
Thanks again!
Closing in favor of #1617.
Thanks for your work on this!