optimism icon indicating copy to clipboard operation
optimism copied to clipboard

feat(ci): add nx to monorepo for incremental cachable builds

Open roninjin10 opened this issue 2 years ago • 17 comments

What is nx

  • Incrementally builds and caches packages. Similar to bazel. Extremely powerful monorepo tooling
  • nice cli interface that allows anybody to target any package whether it's golang or ts
  • optional vs code plugin
  • robust ecosystem of nx boilerplates you can install with cli and other tooling
  • builds tests and lints golang, solidity, and typescript
  • provides a nice remote cache to speed up your local environment
  • provides an opt in nx cloud cache to speed up ci and provide more caching when building off of develop
  • easy to configure in simple case
  • Has extremely robust featureset if you would need it
  • Very mature tool has been around for over half a decade
  • Gateway repo has been using this with no issues for a while now.

Implementation

  • nx.json provides a general config that all typescript packages inherit from
  • you can also do config.json to granularly configure an individual package

How nx works You have build targets and file groups. You build a dependency tree by declaring what files packages and targets you depend on. This is exactly like bazel but lighter weight. It analyzes this to build a dependency graph. Then based on files changed it will incrementally build cache and test anything that needs to be rebuilt.

How much faster does nx make the build Depends on which caches you bust. In general if we want this repo to be fast, we need 3 things

  1. Smart building. Don't rebuild stuff twice.
  2. Pnpm. Build node modules 2x to 3x faster
  3. Esbuild. Build typescript 5x faster

In the case of a commit changing golang, no typescript or solidity needs to be built. Never having to build the old contracts is a nice win in speed too.

Why am I doing this pr now

I actually did most of this months ago but circled back because I was curious how well nx works with golang.

Video of me demoing nx

roninjin10 avatar Jan 24 '23 13:01 roninjin10

⚠️ No Changeset found

Latest commit: 8b5928e85d5233ce022151f3b10babfb1e90a97d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

changeset-bot[bot] avatar Jan 24 '23 13:01 changeset-bot[bot]

I put do-not-merge on this because I have had 2 prs in past that have broken the build while passing CI so I don't trust CI

roninjin10 avatar Jan 24 '23 13:01 roninjin10

Codecov Report

Merging #4778 (8b5928e) into develop (ee255c3) will decrease coverage by 0.01%. The diff coverage is n/a.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4778      +/-   ##
===========================================
- Coverage    45.71%   45.71%   -0.01%     
===========================================
  Files          436      436              
  Lines        28139    28139              
  Branches       688      688              
===========================================
- Hits         12864    12863       -1     
+ Misses       14237    14236       -1     
- Partials      1038     1040       +2     
Flag Coverage Δ
bedrock-go-tests 44.97% <ø> (-0.01%) :arrow_down:
cannon-go-tests 61.71% <ø> (ø)
common-ts-tests 26.82% <ø> (ø)
contracts-bedrock-tests 59.25% <ø> (ø)
core-utils-tests 48.47% <ø> (ø)
fault-detector-tests 29.51% <ø> (ø)
sdk-tests 39.83% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

see 1 file with indirect coverage changes

codecov[bot] avatar Jan 24 '23 13:01 codecov[bot]

putting in draft until I can confirm this doesn't in fact break the build

roninjin10 avatar Jan 24 '23 15:01 roninjin10

I think this is Gtg though

roninjin10 avatar Jan 24 '23 15:01 roninjin10

I've finally found a moment to take a look at this. Overall I feel in favor of it.

Pros I see:

  1. There is no change required to the developer workflow. All existing yarn scripts continue to work.
  2. It is faster, but not crazy faster. Bigger difference on yarn than yarn build.
# on willc/nx (rebased on develop) after yarn clean
time yarn # 9.36s
time yarn build #  130.37s

# on develop after yarn clean
time yarn # 37.10s
time yarn build  # 178.94s

Cons I see:

  1. Introduces another config file for everyone to learn and maintain.

maurelian avatar Feb 01 '23 17:02 maurelian

I am in favor of this, is there anything we need to do to get this over the line besides testing that all commands work still?

tynes avatar Feb 01 '23 21:02 tynes

Nx looks great for JS/TS, but I am opposed to the changes to the Go modules:

  • We've never had any significant performance issue with building them
  • nx serve makes no sense when the Go programs require configuration with flags etc. & we have full automation in different places already
  • Wrapping Go tools with javascript tools is a big anti-pattern IMO

Can we exclude the bedrock Go modules from this change please?

protolambda avatar Feb 01 '23 23:02 protolambda

Sounds like golang already has incremental builds and when testing them they were really fast anyways from scratch. What would make sense would be to remove golang as suggested.

Eventually though, you would want to wrap golang with nx as one giant build target. Golang still handles the build but nx can still do the simple task of asking "did a golang package change or did a typescript package change"?

There are other alternatives to still set this up that are potentially bette than nx so doesn't make sense to do that here.

roninjin10 avatar Feb 02 '23 01:02 roninjin10

I am in favor of this, is there anything we need to do to get this over the line besides testing that all commands work still?

I'd assume that if CI passes, then we're good, and any edge cases can get cleaned up later.

maurelian avatar Feb 02 '23 01:02 maurelian

I am in favor of this, is there anything we need to do to get this over the line besides testing that all commands work still?

Nothing really. I think just this:

  1. Remove the golang completely
  2. Do extra work to make sure we don't break ci build
  3. Remove the remote cache key for now. It's easy to add back now but better to incrementally slowly adopt a cache. This remote cache can really speed up ci later though especially for golang changes that don't touch typescript as well as typescript changes. Also would speed up a cold start pulling develop for anybody
  4. Consider doing a follow up pr to nuke lerna completely for nx

roninjin10 avatar Feb 02 '23 01:02 roninjin10

It is faster, but not crazy faster. Bigger difference on yarn than yarn build.

on willc/nx (rebased on develop) after yarn clean

time yarn # 9.36s time yarn build # 130.37s

on develop after yarn clean

time yarn # 37.10 time yarn build # 178.94s

Worth noting if we had develop branch pushing to the remote cache your yarn build would have been a few seconds just the time to download the build artifacts

roninjin10 avatar Feb 02 '23 01:02 roninjin10

A rebase will fix op-heartbeat tests

tynes avatar Feb 02 '23 17:02 tynes

Hey @roninjin10! This PR has merge conflicts. Please fix them before continuing review.

mergify[bot] avatar Feb 03 '23 17:02 mergify[bot]

I think it's ready going to put it in review monday but still being very careful.

The biggest issue I see is the cache gets busted sometimes when it should not be busted. This causes no real harm though other than the cache not being as smart as it could be and we can tweak that.

roninjin10 avatar Feb 03 '23 23:02 roninjin10

tests still failing

roninjin10 avatar Feb 06 '23 01:02 roninjin10

Hey @roninjin10! This PR has merge conflicts. Please fix them before continuing review.

mergify[bot] avatar Feb 14 '23 22:02 mergify[bot]

This PR is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Mar 01 '23 02:03 github-actions[bot]

Hey @roninjin10! This PR has merge conflicts. Please fix them before continuing review.

mergify[bot] avatar Mar 26 '23 17:03 mergify[bot]

Deploy Preview for opstack-docs canceled.

Name Link
Latest commit 8b5928e85d5233ce022151f3b10babfb1e90a97d
Latest deploy log https://app.netlify.com/sites/opstack-docs/deploys/6493aca0357acb0008956918

netlify[bot] avatar Mar 26 '23 18:03 netlify[bot]

If we merge this and we also merge https://github.com/ethereum-optimism/optimism/pull/5260

  • Any changes to golang code won't have to wait for ts to build ever
  • Any changes to js code will be significantly faster to build
  • If we turn on the cache even faster
  • Issues with falsely caching test results or builds that I have seen in past will go away. No longer need to manually set check_changed in circle ci in that brittle way

tldr I often hear client team complaining about slow ci and stuff and these 2 prs are the solution

roninjin10 avatar Mar 26 '23 18:03 roninjin10

Also if we turned on remote cache (trivial to do) everytime you pulled develop branch you wouldn't need to rebuild anything it woudl just pull the builds from the cache.

roninjin10 avatar Mar 26 '23 18:03 roninjin10

Hey @roninjin10! This PR has merge conflicts. Please fix them before continuing review.

mergify[bot] avatar Mar 30 '23 14:03 mergify[bot]

reopening post bedrock

roninjin10 avatar Apr 11 '23 02:04 roninjin10

Not a priority but going to reopen this and try to get it in in next few weeks

roninjin10 avatar Jun 07 '23 23:06 roninjin10

Hey @roninjin10! This PR has merge conflicts. Please fix them before continuing review.

mergify[bot] avatar Jun 08 '23 17:06 mergify[bot]

Now is a good time to get this merged @roninjin10

tynes avatar Jun 12 '23 23:06 tynes

@roninjin10 I've rebased this PR, my one question is why we have the cloud package installed + it configured, its going to be a noop?

tynes avatar Jun 22 '23 00:06 tynes

ok i broke something

tynes avatar Jun 22 '23 01:06 tynes

@tynes ideally we should start using it as it will make our builds way way way way way faster. There is a default runner we could use instead. I wonder if there is a nx cache that we could self host?

roninjin10 avatar Jun 22 '23 18:06 roninjin10