nearcore icon indicating copy to clipboard operation
nearcore copied to clipboard

[Epic] Restructure nearcore project

Open frol opened this issue 4 years ago • 22 comments

My developer experience with nearcore is not great, so I contemplated the improvements that can be made.

The very first thing that is missing for me is a clear project structure. I believe that the best code is no code and the same goes for documentation (this is why I don't want to introduce documentation of the project structure, but rather prefer having an obvious project structure that does not require a lot of time to understand and makes it easy to navigate without a need to memorize all the components).

I don't have enough experience with the existing infrastructure, but here is my first take on it.

The root folder:

 .dockerignore
 .gitattributes
 .github/
 .gitignore
 .gitlab-ci.yml
 .pre-commit-config.yaml

neard/
 libraries/
 tools/

ATTRIBUTIONS.md
 Cargo.lock
 Cargo.toml
 CODEOWNERS
 Dockerfile
 LICENSE
 Makefile
 METRICS.md
 README.md
 rustfmt.toml
 rust-toolchain

It is still crowded with various config files, but at least the folders have a clear structure. Just for the reference, here is the current project structure:

 .dockerignore
 .gitattributes
 .github/
 .gitignore
 .gitlab-ci.yml
 .pre-commit-config.yaml

async-utils/
 chain/
 conf/ # Would you guess that this is Grafana config? Do we actually need this config in this repository?
 core/
 docs/ # Nice! We have some documentation! Well, there is a single logo.svg file (let's move it to .github/).
 genesis-tools/
 near/
 nightly/ # Nightly what?
 pytest/ # This is a very popular testing framework in Python, right? Nope
 runtime/
 scripts/
 tests/ # Can we move them into `near` <span class="error">[`neard`]</span>?
 test-utils/

ATTRIBUTIONS.md
 Cargo.lock
 Cargo.toml
 CODEOWNERS
 Dockerfile
 LICENSE
 Makefile
 METRICS.md
 package-lock.json # ???!
 README.md
 rustfmt.toml
 rust-toolchain

So, I suggest collapsing those 13 folders into 3 top-level ones:

 neard/
 src/
 tests/
 tests_py/ # rename from /pytest (I have no idea where to put the nightly folder with those two .txt files, yet)
 libraries/
 chain/
 core/
 runtime/
 test-utils/
 utils/
 tools/
 near-loadtester
 near-state-viewer
 near-genesis
 near-deploy # rename from /scripts

The exact naming is yet to be discussed.

The crates structure and naming also needs some love.

frol avatar Dec 19 '19 12:12 frol

This is really great that someone is looking into this and raising these concerns, I really liked the comments.

pytest are not just stress tests they are very similar to nearcore/near/tests and nearcore/tests, just written in Python.

So I suggest we do the following renaming:

  • nearcore/pytest -> nearcore/neard/tests_py;
  • nearcore/nightly -> nearcore/neard/tests_nightly_py;
  • nearcore/tests -> nearcore/neard/tests_rs;
  • nearcore/near/tests -> nearcore/neard/tests_rs.

WDYT?

MaksymZavershynskyi avatar Dec 19 '19 20:12 MaksymZavershynskyi

So I suggest we do the following renaming

The suggested renaming sounds reasonable to me. I will apply the suggestion (update my post) tomorrow.

frol avatar Dec 19 '19 23:12 frol

I agree with @frol and @nearmax comment above.

nearcore/nightly -> nearcore/neard/tests_nightly_py;

This is a webapp that host and run tests_py and above neard_rs tests, i suggest to move it another repo or put it to tools/nightly_test_runner

ailisp avatar Dec 19 '19 23:12 ailisp

also note that nightly heavily relies on the relative paths, so if we change the dirs, nightly needs to be changed accordingly.

SkidanovAlex avatar Dec 19 '19 23:12 SkidanovAlex

This is a webapp that host and run tests_py and above neard_rs tests, i suggest to move it another repo or put it to tools/nightly_test_runner

Let's jut put in into tools directory.

MaksymZavershynskyi avatar Dec 20 '19 00:12 MaksymZavershynskyi

Assigning it to @frol so that we have only one assignee as per guidelines.

MaksymZavershynskyi avatar Feb 20 '20 20:02 MaksymZavershynskyi

The first steps in this direction are #2511 and #2520.

@SkidanovAlex @nearmax @ilblackdragon Do you have any objections against project refactoring? The code is not going to change, it should be just more clear folder naming towards better navigation around the project. I want to make the following steps (separate PR iterations):

  1. Rename the crate folders to match the crate names (chain/chain -> chain/near-chain, runtime/runtime -> runtime/near-runtime, core/store -> core/near-store), so you can easily resolve the use statement into the folder name.
  2. Move some of the folders to clean up the root of the project and also make it clear what our internal libraries/modules/components are (see the issue description).

My motivation mentioned in the issue description only grows with the time I spend with nearcore. I don't want to have lengthy conversations about this, but it hurts my productivity. It is somewhat related to #2433.

frol avatar Apr 24 '20 18:04 frol

I agree, this is the kinda of stuff that slows down everyone and makes our code less appealing to external contributors. We discussed with @fckt that having a folder named "libraries" is kinda weird, given most of our crates are libraries, but on the second looks it looks ok.

MaksymZavershynskyi avatar Apr 27 '20 00:04 MaksymZavershynskyi

Rename the crate folders to match the crate names (chain/chain -> chain/near-chain, runtime/runtime -> runtime/near-runtime, core/store -> core/near-store), so you can easily resolve the use statement into the folder name.

I'd suggest the following adjustment. We can have a near folder to hold all the near-* components.

neard
near
  /chain
  /primitives
  /store
  /runtime
  /json-rpc
utils
test-utils
tools
docs
scripts
...

lexfrl avatar Apr 29 '20 00:04 lexfrl

@fckt Given the project is nearcore, it implies that everything in the repository is related to near. Thus, "near" as a folder name does not introduce any clarity (the same way I don't really like chain/chain and runtime/runtime). It has its benefit of pulling some of the root folders inside, but utils and test-utils left behind.

frol avatar Apr 29 '20 02:04 frol

The idea is to associate crates which are the core components (direct dependencies) of neard process and their dependencies.

Given the project is nearcore, it implies that everything in the repository is related to near.

Right, but not all of crates are the direct dependencies of the neard (or dependencies of these crates), there is a lot of supplementary (or optional) crates. There is a category of crates which are essential for the node operation.

The following crates make up the system and these can be naturally form a list of the core components (approximated):

near-config
near-crypto
near-primitives
near-store
near-runtime
near-chain
near-chunks
near-client
near-pool
near-network
near-jsonrpc
near-telemetry
near-epoch-manager

However, don't really think that this list as an accurate and complete. For example, its questionable if near-jsonrpc or near-telemetry are the core components (since these could be optional for neard - I can imagine that node operators would want to disable it). But the most probably the near-chain, near-store, near-network, near-runtime and near-primitives are the essential ones (not optional in any mode). Note: I don't count artificial execution modes like testing environments.

lexfrl avatar Apr 29 '20 12:04 lexfrl

Let me list all the crates we have:

genesis-csv-to-json
genesis-populate
keypair-generator
loadtester
neard
near-actix-utils
near-chain
near-chain-configs
near-chunks
near-client
near-crypto
near-epoch-manager
near-jsonrpc
near-jsonrpc-client
near-logger-utils
near-metrics
near-network
near-pool
near-primitives
near-rpc-error-core
near-rpc-error-macro
near-runtime-configs
near-runtime-fees
near-store
near-telemetry
near-vm-errors
near-vm-logic
near-vm-runner
near-vm-runner-standalone
node-runtime
restaked
runtime-params-estimator
state-viewer
testlib

Let's extract tools:

genesis-csv-to-json
genesis-populate
keypair-generator
loadtester
restaked
runtime-params-estimator
state-viewer

Let's move neard out of scope and list the rest:

near-actix-utils
near-chain
near-chain-configs
near-chunks
near-client
near-crypto
near-epoch-manager
near-jsonrpc
near-jsonrpc-client
near-logger-utils
near-metrics
near-network
near-pool
near-primitives
near-rpc-error-core
near-rpc-error-macro
near-runtime-configs
near-runtime-fees
near-store
near-telemetry
near-vm-errors
near-vm-logic
near-vm-runner
near-vm-runner-standalone
node-runtime
testlib

Everyone is welcome to suggest a clear separation of the packages. Currently, we have:

  • core (I think "common" would be a better name describing that these are widely reusable components)
  • chain
  • runtime
  • test-utils (I would move testlib and near-logger-utils to "common" and move the rest to "tools")
  • utils (I hope to merge it somewhere)

I find this separation quite reasonable, but it is hard to find a place for things like near-jsonrpc, near-jsonrpc-client, and [hopefully extracted one day] neard-lib.

frol avatar Apr 29 '20 19:04 frol

Where does the neard go under this separation @frol ? Also I think we can merge utils and test-utils (those that are not test utils in utils should probably go to tools).

bowenwang1996 avatar Apr 29 '20 23:04 bowenwang1996

what is difference between util and tool?

On Wed, Apr 29, 2020 at 4:56 PM Bowen Wang [email protected] wrote:

Where does the neard go under this separation @frol https://github.com/frol ? Also I think we can merge utils and test-utils (those that are not test utils in utils should probably go to tools).

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/nearprotocol/nearcore/issues/1881#issuecomment-621530534, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADFFFCD2TWHOBH4WXHS2PQLRPC5CRANCNFSM4J5FE2PQ .

ailisp avatar Apr 29 '20 23:04 ailisp

I propose keeping neard in the root folder since it is the main entry point to the project.

"utils" was just recently introduced (#2511) and it has the only crate (near-actix-utils), and I can see we either extract it to a standalone helper (separate package and repo) or merge into neard-lib (extract neard/src/lib.rs and its dependencies into a separate crate).

I think we can get rid of both "utils" and "test-utils" (just move the relevant crates to "common" and "tools").

what is difference between util and tool?

I view them as follows: a tool is something that you use as a standalone piece (i.e. executable), while util is anything that is useful (e.g. a library, a file, a script, and even an executable).

frol avatar Apr 30 '20 00:04 frol

As a general methodology, I think if we want to define a category it should be valuable and not tautological. In other words the definition should add a value, be unique and be not ambiguous.

"utils" was just recently introduced (#2511) and it has the only crate (near-actix-utils), and I can see we either extract it to a standalone helper (separate package and repo) or merge into neard-lib (extract neard/src/lib.rs and its dependencies into a separate crate).

neard-lib - makes sense to me! Why it makes sense: besides of the neard itself, we have a lot of other entry points (tests) which would want to reuse some process initialization logic.

lexfrl avatar Apr 30 '20 11:04 lexfrl

The "utils" to me is a common code which exists because you don't want to repeat it (and this is a dominating characteristic of it). But in the same time, it's a small enough and don't deserve to evolve into a separate library (yet). In the same time, that code should be not exclusively specific to the project (e.g. it must not use custom types, defined in the project). For the exclusive common code and types of the components, primitives serves good.

Essential logic which covers dedicated pieces of the project spec (and at the same time are not useful separately (excluding artificial purposes like testing)), better to be colocated together. \

lexfrl avatar Apr 30 '20 12:04 lexfrl

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 01 '21 10:07 stale[bot]

@frol @matklad do we still want to proceed with this?

bowenwang1996 avatar Jul 02 '21 16:07 bowenwang1996

Yes, I believe there's still room for improvement for project structure, and that this is non-urgent, but important. @miraclx recently make great progress here as well! Let's keep this open.

matklad avatar Jul 06 '21 17:07 matklad

Yeah, we slowly move in this direction:

  • #4292
  • #4621
  • #4651

frol avatar Aug 09 '21 23:08 frol

One suggestion I have is to flatten out the directory layout and keep all rust crates in the one-level deep crates/ directory. In other words, the first list from https://github.com/near/nearcore/issues/1881#issuecomment-621414303 looks pretty reasonable to me, and it nicely solves the "where do I put a thing which doesn't belong anywhere else" problem.

See https://matklad.github.io/2021/08/22/large-rust-workspaces.html for an extended motivation.

(0.8 confidence the end state will be much better, 0.95 confidence that its a big and a somewhat painful change, 0.7 confidence that we should actually do that)

matklad avatar Aug 24 '21 11:08 matklad