
perf(build): reduce redundant OpenAPI generator executions in codegen

Open petermetz opened this issue 1 year ago • 2 comments

Description

The codegen scripts should only call the openapi-generator-cli if the hash of the openapi.json has changed.

In theory, the best way to achieve this is to save a hash of the openapi.json file alongside the generated sources, for example in a manifest.json file. That way ./src/main/typescript/generated/openapi/typescript-axios/manifest.json contains useful metadata about the code generation process: the hash of the openapi.json file, the openapi generator config, package.json, openapitools.json, etc. (that is, any file whose contents could change what the generated sources look like).

An example of the manifest.json file could look like this:

{
  "inputFiles": [
    "packages/cactus-plugin-keychain-memory-wasm/package.json",
    "packages/cactus-plugin-keychain-memory-wasm/openapitools.json",
    "packages/cactus-plugin-keychain-memory-wasm/src/main/json/openapi.json"
  ],
  "inputFilesMd5": "1abcb33beeb811dca15f0ac3e47b88d9",
  "createdAt": "2024-07-18T02:29:48.058Z",
  "commitSha": "497ea3226631fdcad763e6281ee058d91ca01988"
}

Then, before we run the code generation at build time, we check whether the hash of the files listed in inputFiles still matches inputFilesMd5. If it does, we can skip the code generation because the generated sources would be identical to the current state anyway.
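A minimal sketch of that check, in TypeScript. The function names (computeInputFilesMd5, canSkipCodegen) and the way the per-file contents are combined into a single digest are assumptions made for illustration; the issue only specifies that a combined MD5 of the input files is compared against the manifest. The file reader is injectable so the logic can be exercised without touching the disk.

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

/** Shape of the manifest.json example shown in this issue. */
export interface CodegenManifest {
  inputFiles: string[];
  inputFilesMd5: string;
  createdAt: string;
  commitSha: string;
}

type ReadFileFn = (filePath: string) => string | Buffer;

/**
 * Combines all input files into one MD5 digest.
 * Assumption: paths are sorted and mixed into the hash together with the
 * contents, so the digest does not depend on the listing order.
 */
export function computeInputFilesMd5(
  inputFiles: string[],
  readFile: ReadFileFn = (p) => readFileSync(p),
): string {
  const hash = createHash("md5");
  for (const filePath of [...inputFiles].sort()) {
    hash.update(filePath);
    hash.update("\0");
    hash.update(readFile(filePath));
    hash.update("\0");
  }
  return hash.digest("hex");
}

/** Returns true only if a manifest exists and its hash still matches. */
export function canSkipCodegen(
  manifest: CodegenManifest | undefined,
  readFile?: ReadFileFn,
): boolean {
  if (!manifest) {
    return false; // never generated before, so run the generator
  }
  try {
    const current = computeInputFilesMd5(manifest.inputFiles, readFile);
    return current === manifest.inputFilesMd5;
  } catch {
    return false; // an input file is missing or unreadable: regenerate
  }
}
```

MD5 is used here only because the example manifest names its field inputFilesMd5; this is change detection, not a security boundary, so any stable digest would do.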

For the hashing, we can write a custom script in the ./tools/ directory. The npm scripts of each package would call it before calling the generator, to check whether the codegen needs to be re-run for that specific package. A second script, also under the ./tools/ directory, can generate the manifest.json after each successful code generation, once for EACH invocation of the code generator. Note that some packages have multiple scripts invoking the code generator, so it's important to examine those separately to not miss out on performance gains.
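The manifest-writing side could look roughly like the following TypeScript sketch. All function names here (manifestPathFor, buildManifest, currentCommitSha, writeManifest) are hypothetical, and the combined hash is passed in as a parameter on the assumption that the hashing script produces it. Keeping one manifest per generator output directory is one way to treat packages with several generator invocations separately.

```typescript
import { execFileSync } from "node:child_process";
import { mkdirSync, writeFileSync } from "node:fs";
import * as path from "node:path";

/** Shape of the manifest.json example shown in this issue. */
export interface CodegenManifest {
  inputFiles: string[];
  inputFilesMd5: string;
  createdAt: string;
  commitSha: string;
}

/** One manifest per generator output directory (assumed layout). */
export function manifestPathFor(outputDir: string): string {
  return path.join(outputDir, "manifest.json");
}

/** Pure helper that assembles the manifest object. */
export function buildManifest(
  inputFiles: string[],
  inputFilesMd5: string,
  commitSha: string,
  now: Date = new Date(),
): CodegenManifest {
  return {
    inputFiles: [...inputFiles],
    inputFilesMd5,
    createdAt: now.toISOString(),
    commitSha,
  };
}

/** Reads the current commit via `git rev-parse HEAD`. */
export function currentCommitSha(): string {
  return execFileSync("git", ["rev-parse", "HEAD"], { encoding: "utf8" }).trim();
}

/** Writes the manifest next to the generated sources; returns its path. */
export function writeManifest(
  outputDir: string,
  manifest: CodegenManifest,
): string {
  const manifestPath = manifestPathFor(outputDir);
  mkdirSync(outputDir, { recursive: true });
  writeFileSync(manifestPath, JSON.stringify(manifest, null, 2) + "\n");
  return manifestPath;
}
```

A package script would call writeManifest only after the generator exits successfully, so a failed run never leaves behind a manifest that claims the sources are current.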

Another Idea: Batching

Look into doing the code generation with a single JVM process (this alone might speed things up enough that we no longer need to worry about execution times).

https://openapi-generator.tech/docs/usage#batch

petermetz avatar Jul 11 '24 17:07 petermetz

At a very high level, this means the yarn codegen script would run much faster. Right now it takes several minutes to run, but with this optimization we could bring that down to about 30 seconds to a minute, in some cases even less. The idea is to skip parts of the code generation when the generated code would be exactly the same as before.

petermetz avatar Jan 08 '25 18:01 petermetz

Peter Somogyvari — Today at 10:07 AM
https://github.com/hyperledger-cacti/cacti/issues/3403 @udhayakumari

"codegen": "run-s 'codegen:warmup-*' codegen:lerna codegen:cleanup", @udhayakumari
"generate-sdk:typescript-axios": "openapi-generator-cli generate -i ./src/main/json/openapi.json -g typescript-axios @udhayakumari
packages/cactus-core-api/src/main/typescript/generated/openapi/typescript-axios

Peter Somogyvari — Today at 10:14 AM
packages/cactus-core-api/src/main/typescript/generated/openapi/typescript-axios/api.ts
packages/cactus-core-api/src/main/json/openapi.json

petermetz avatar Jan 08 '25 18:01 petermetz