Theseus icon indicating copy to clipboard operation
Theseus copied to clipboard

Using one toolset crate (theseus-builder) instead of multiple tool crates

Open NathanRoyer opened this issue 1 year ago • 7 comments

I began using theseus-builder in the original Makefile, in the "build" and "$(bootloader)" target, exploiting these stages:

  • directories
  • link-nanocore
  • serialize-nanocore-syms
  • relink-rlibs
  • copy-crate-objects
  • relink-objects
  • strip-objects
  • add-bootloader

The config file in cfg/builder.toml maps variables from the Makefile to configuration properties. To allow this to work, I had to prepend an "export" makefile keywork to some of its variables, so that the builder could access them.

I set the "-q" switch on each builder call, to make it silent.

I took care to leave the main cargo invocation as-is.

By the way, serialize_nanocore_syms depends on crate_metadata, mod_mgmt, memory, kernel_config which makes the builder compile pretty slowly, as these in turn have many dependencies. Could we create a "common-dependency" crate which would hold the definitions needed by both the builder and theseus in itself, to get rid of these four?

Should I replace other parts of the makefile? Is there anything you'd like me to do before removing previous tool crates?

NathanRoyer avatar Aug 23 '22 15:08 NathanRoyer

I cannot build your branch, errors below:

error: failed to get `crate_metadata` as a dependency of package `theseus-builder v0.1.0 (/tmp/Theseus_builder/tools/builder)`

Caused by:
  failed to load source for dependency `crate_metadata`

Caused by:
  Unable to update /tmp/Theseus_builder/tools/theseus/kernel/crate_metadata

Caused by:
  failed to read `/tmp/Theseus_builder/tools/theseus/kernel/crate_metadata/Cargo.toml`

Caused by:
  No such file or directory (os error 2)
make: *** [Makefile:259: /tmp/Theseus_builder/build/nano_core/nano_core-x86_64.bin] Error 101

I also have a sneaking suspicion that once this branch builds, it will be significantly slower than the current theseus_main. I see things like a trivially-parallel make loop being replaced with a call to cargo run, which will slow down that step from something like 200ms to a few seconds.

Another thing is that we can't just export all environment variables like that, it's dangerous and should generally be avoided since it makes it very difficult to track down problems with current/future programs that discreetly rely on env vars. Instead we should directly pass values to the tool invocation, ideally not using env vars, but if it's unavoidable, we should set those specific env vars directly just for that one command invocation.

I'll reserve my other comments for after this builds, but just one more thing -- I thought we were just combining all the build-related tools/ such that we don't have to pay the cost for compiling multiple separate tool crates? This is a much different PR changeset than what I was expecting after our conversation.

kevinaboos avatar Aug 23 '22 18:08 kevinaboos

I thought we were just combining all the build-related tools/ such that we don't have to pay the cost for compiling multiple separate tool crates?

Yep, that's exactly what this does :thinking:

Another thing is that we can't just export all environment variables like that, it's dangerous and should generally be avoided since it makes it very difficult to track down problems with current/future programs that discreetly rely on env vars. Instead we should directly pass values to the tool invocation, ideally not using env vars, but if it's unavoidable, we should set those specific env vars directly just for that one command invocation.

No problem, I can do it via command parameters, I'll do so if you prefer.

  • edit: done

I also have a sneaking suspicion that once this branch builds, it will be significantly slower than the current theseus_main. I see things like a trivially-parallel make loop being replaced with a call to cargo run, which will slow down that step from something like 200ms to a few seconds.

Ah, sorry I forgot to make the relink-rlibs stage parallel because I could never use it. I'll make it parallel.

  • edit: done

I cannot build your branch, errors below:

The fix for that depends on your answer to my "By the way" question, in the PR message.

  • edit: I pushed a temporary fix.

NathanRoyer avatar Aug 24 '22 07:08 NathanRoyer

Thanks, I'll try again.

The fix for that depends on your answer to my "By the way" question, in the PR message.

I think this is a great point, but I don't necessarily think the solution is to move them all into a separate crate; that won't improve the build time significantly. The solution there is to minimize the number of transitive dependencies that the various types needed for serialization have, such that we can avoid building tons of 3rd party crates. I will take a look at that now to see if it can be easily done.

kevinaboos avatar Aug 24 '22 16:08 kevinaboos

I thought we were just combining all the build-related tools/ such that we don't have to pay the cost for compiling multiple separate tool crates?

Yep, that's exactly what this does

Well, you also replaced some other steps in the Makefile, like the step that extracts and relinks multi-member .rlib archives. Also, on that note, the performance consideration there is that a shell invocation is going to be way way faster than building and running a Rust executable. My point here was that the goal of this PR was just to speed up the existing set of tools/ by combining them into one, not necessarily transforming every step in the Makefile into a separate Rust program. I think it makes sense to do that when the Makefile step itself becomes either very slow or very complex, but simple loops are probably best kept as shell commands in the Makefile.

kevinaboos avatar Aug 24 '22 17:08 kevinaboos

I just merged in #624, which significantly reduces the dependencies of tools/serialize_nano_core and thus reduces its build time. Thanks for pointing that out!

kevinaboos avatar Aug 24 '22 19:08 kevinaboos

[...] not necessarily transforming every step in the Makefile into a separate Rust program.

But theseus-builder is just one program, they're not separate programs? If you prefer, we can use cargo once, early, and then use the produced binary instead of a "cargo run" each time.

NathanRoyer avatar Aug 29 '22 07:08 NathanRoyer

[...] not necessarily transforming every step in the Makefile into a separate Rust program.

But theseus-builder is just one program, they're not separate programs? If you prefer, we can use cargo once, early, and then use the produced binary instead of a "cargo run" each time.

That shouldn't make any real difference in the build times, as the cargo build action will be cached after the first time and reused all future times.

What I was referring to is that in addition to the tool invocations, you're also migrating some non-tool Makefile actions to theseus_builder, for example the loop that unarchives and relinks .rlib files. No matter how you implement it, there's no way that building and running a separate Rust program for that is going to be faster than a parallelized loop of shell commands.

Thus, let's restrict this PR to changing only the invocations of tools/ programs (as previously agreed) and not changing any other parts of the Makefile, in order for us to fairly assess whether combining all the tools actually does improve build times.

kevinaboos avatar Aug 29 '22 17:08 kevinaboos