Monorepo icon indicating copy to clipboard operation
Monorepo copied to clipboard

Various small DX improvements

Open SachaG opened this issue 2 years ago • 3 comments

As I'm working on the codebase, a couple notes of stuff we could improve to discuss later.

  • [ ] We should get rid of all the .eslintrc, .prettierrc, etc. files to ensure every project follows the same formatting guidelines; and ideally also find a way to make sure every subdirectory uses a similar tsconfig.json
  • [ ] Except for really simple or single-use helper functions, I prefer using a singleton argument for functions (const f = ({ arg1, arg2 }) => ... instead of const f = (arg1, arg2) => ... because it avoids any risk of messing up the order of the arguments. Maybe we can start applying that pattern throughout the whole monorepo?
  • [ ] We need to make sure we use the same types throughout the monorepo. We have a few duplicates for the same kind of objects currently.
  • [x] I still haven't managed to import shared runtime code in the api2 project.
  • [ ] We should decide once and for all where each app loads its data from. For me the source of truth is the GraphQL API, or if that's too slow the same API data cached in Redis. Only the API should talk to GitHub or load files from the filesystem.
  • [ ] Still not super happy with how env variables are managed, especially since all the projects share the same ones. But I don't have a great solution…
  • [ ] We need to make sure that /surveyform uses surveyId and editionId when saving new responses.
  • [ ] Current responses document follow a js2021__features__nullish_coalescing__experience naming pattern due to the fact that we only had a single schema for all surveys, and wanted to avoid naming collisions. Could we drop this requirement and just have e.g. nullish_coalescing__experience now? (at least we should get rid of the __experience suffix which should be replaced by looking up the proper template object corresponding to the question)
  • [ ] we can probably simplify surveyadmin a lot since we don't really need any of the Vulcan models/permissions/forms/etc. It would work better as a "normal" Next.js/Node app I think, closer to api but with the admin GUI part added.

SachaG avatar Apr 13 '23 01:04 SachaG

Quick feedback:

  • ok for eslint feel free to drop whatever configs are irrelevant, I am also progressively deleting stuff from Vulcan (babel configs, unused tests, etc.)
  • ok for functions, I do that usually already, TS helps a lot anyway
  • for loading data, currently the surveyform is autonomous, which leads to more freedom in the data structures but also redundancies. Regarding perfs, there are 2 scenarios: loading data during build-time, and loading data per-request.

During build, performances are not a big deal, however I cannot yet load everything statically during build because Next doesn't support static layout (eg loading the survey definition) mixed with dynamic pages (eg loading the user response for the survey).

So, on some pages we may end up loading the survey on every request. I am not fond of using GraphQL because it's slower, and we do not benefit from GraphQL in this use case, since we have roughly 3 types of query (getting the survey definition, getting the question, and getting the list of surveys) that barely ever change. Calling Redis seems more efficient and simpler to me. I also use a bit of short-lived in-memory caching (eg a few minutes) to reduce the workload, since all users will load the same survey definitions there is no need to reload on each request.

I am not yet sure if we want to put this logic in API, maybe just in shared code instead? I think the API is more relevant for the result app which is highly dynamic than the surveyform we is simpler and can be optimized a bit more "aggressively" and thus need to stay independant.

  • for surveyId and editionId it should be done in surveyform, though there might be some old code left in some places. I haven't updated surveyadmin yet though

  • yes for response, I am currently working on the API endpoints to use Next 13.3 route handlers, so we can stop using GraphQL. For this to work, we must make sure that all responses have an editionId. We might need to migrate older responses to have this field too.

eric-burel avatar Apr 19 '23 10:04 eric-burel

So, on some pages we may end up loading the survey on every request. I am not fond of using GraphQL because it's slower, and we do not benefit from GraphQL in this use case, since we have roughly 3 types of query (getting the survey definition, getting the question, and getting the list of surveys) that barely ever change. Calling Redis seems more efficient and simpler to me.

Yes I agree, I just meant we should consider the API as the source of truth, but we can access a cached copy of its data, that's fine.

I also use a bit of short-lived in-memory caching (eg a few minutes) to reduce the workload, since all users will load the same survey definitions there is no need to reload on each request.

If that's possible then yes I think in-memory caching (nodeCache or similar) is the best solution. But I thought it wouldn't work for surveyform because of the serverless architecture?

SachaG avatar Apr 20 '23 00:04 SachaG

For the serverless architecture, that's a very good question and I was surprised by the answer:

  • in a serverless architecture in general, you will get at most 1 instance of node-cache per instance of each function. So if we have 10 pages, 2 instances for each, you have 20 node cache instances. But if each lambda serves 10k users, you will have only 20 queries instead of 200k for instance so still worth it.
  • it seems that Vercel will actually rebundle functions into a small number of lambdas, usually 1 for the API and 1 for the web pages (to my surprise). So you will get only 1 or 2 instances of node cache. Doc: https://vercel.com/docs/concepts/functions/serverless-functions#bundling-serverless-functions And I've asked more info on this thread: https://github.com/vercel/vercel/discussions/5458

You still have to make the code serverless-friendly, for instance each page should connect to the database independently from the others, because you can't tell which is requested first or which will actually get bundled. But in the end you will not have that many instances so caching works ok.

eric-burel avatar Apr 20 '23 20:04 eric-burel