Support parallel compilation of all input files
Input files can be compiled in parallel.
- Implement this for solang compile
- Occurrences of parallel solang compile in our CI jobs are no longer needed
I think a few things are needed:
- The FileResolver should be wrapped in a std::sync::RwLock
- Optionally, the parse tree should be cached in the FileResolver, so we don't waste cycles reparsing the same file (e.g. files that are imported)
- Files should be processed in a thread worker pool fashion
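The three points above could be sketched roughly as follows. This is a minimal illustration, not Solang's code: `CachingResolver`, `ParseTree`, and `parse_all` are hypothetical names, the "parser" only records the source length, and holding the write lock while parsing is a simplification that serializes cache misses.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for a parse tree; not Solang's real type.
pub struct ParseTree {
    pub path: String,
    pub source_len: usize,
}

/// Hypothetical resolver with a parse-tree cache behind an RwLock.
pub struct CachingResolver {
    sources: HashMap<String, String>,
    cache: RwLock<HashMap<String, Arc<ParseTree>>>,
    parses: AtomicUsize,
}

impl CachingResolver {
    pub fn new(sources: HashMap<String, String>) -> Self {
        Self {
            sources,
            cache: RwLock::new(HashMap::new()),
            parses: AtomicUsize::new(0),
        }
    }

    /// Returns the cached tree, parsing the file only on the first request.
    pub fn parse(&self, path: &str) -> Option<Arc<ParseTree>> {
        // Fast path: many threads may hold the read lock at once.
        let cached = self.cache.read().unwrap().get(path).cloned();
        if let Some(tree) = cached {
            return Some(tree);
        }
        let source = self.sources.get(path)?;
        // Slow path: take the write lock and re-check, since another
        // thread may have parsed the file in the meantime.
        let mut cache = self.cache.write().unwrap();
        if let Some(tree) = cache.get(path) {
            return Some(Arc::clone(tree));
        }
        self.parses.fetch_add(1, Ordering::SeqCst);
        let tree = Arc::new(ParseTree {
            path: path.to_string(),
            source_len: source.len(),
        });
        cache.insert(path.to_string(), Arc::clone(&tree));
        Some(tree)
    }

    /// Number of times a file was actually parsed (cache misses).
    pub fn parse_count(&self) -> usize {
        self.parses.load(Ordering::SeqCst)
    }
}

/// Processes `files` with a fixed number of worker threads that pull
/// indices from a shared counter. Returns how many files were found.
pub fn parse_all(resolver: &CachingResolver, files: &[&str], workers: usize) -> usize {
    let next = AtomicUsize::new(0);
    let done = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..workers.max(1) {
            s.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::SeqCst);
                let Some(path) = files.get(i) else { break };
                if resolver.parse(path).is_some() {
                    done.fetch_add(1, Ordering::SeqCst);
                }
            });
        }
    });
    done.load(Ordering::SeqCst)
}
```

Because each tree is handed out as an `Arc`, several workers can walk the same parse tree at once without copying it.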
I think this issue is a bit more difficult than it looks. If one file depends on another, they cannot be built in parallel, due to dependency resolution in sema. At least the parser and the lexer can run in parallel.
I don't understand what you mean. What do you see as a problem?
@seanyoung Consider this case:
file A.sol:
contract A { ... }
file B.sol:
contract B is A { ... }
file C.sol:
contract C {
    A other;
    function foo(address addr) external {
        other = new A{address: addr}();
    }
}
I can invoke Solang using solang compile --target Solana A.sol B.sol C.sol
File B.sol depends on A.sol. The semantic analysis for B can only happen after contract A is fully resolved, even though they might generate different binaries. Parallel compilation of A and B is not possible.
For file C.sol, the contract needs to have contract A resolved. In addition, the Solana account collection in codegen expects the CFG from all contracts to be ready in order to collect accounts for function foo. Again, parallel compilation of C and A is not possible.
The way I see it, we can either enable parallel compilation and let the compiler do repeated work for these cases (e.g. resolve A.sol solely for B.sol in one thread to generate B's binary, while A.sol is built in another thread to generate A's binary), or we need to construct a dependency tree to identify what can be parallelized and use synchronization mechanisms throughout the code to make this work.
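For illustration only, the dependency-tree option could be sketched as below: files are grouped into batches so that every file in a batch depends only on files in earlier batches, and the files within one batch could then be built in parallel. `parallel_batches` and the shape of the `deps` map are hypothetical and are not part of Solang; the sketch assumes every file appears as a key in the map.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Groups files into batches where each file depends only on files in
/// earlier batches. `deps` maps every file to the files it depends on.
/// Returns None on a dependency cycle or a dependency on an unlisted file.
pub fn parallel_batches<'a>(
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
) -> Option<Vec<Vec<String>>> {
    // Unresolved dependencies still outstanding for each file.
    let mut remaining: BTreeMap<&'a str, BTreeSet<&'a str>> = deps
        .iter()
        .map(|(file, ds)| (*file, ds.iter().copied().collect()))
        .collect();

    let mut batches = Vec::new();
    while !remaining.is_empty() {
        // Files with no outstanding dependencies can be built now.
        let ready: Vec<&'a str> = remaining
            .iter()
            .filter(|(_, ds)| ds.is_empty())
            .map(|(file, _)| *file)
            .collect();
        if ready.is_empty() {
            return None; // nothing can make progress
        }
        for file in &ready {
            remaining.remove(file);
        }
        // Mark the files just scheduled as resolved for everyone else.
        for ds in remaining.values_mut() {
            for file in &ready {
                ds.remove(file);
            }
        }
        batches.push(ready.iter().map(|f| f.to_string()).collect());
    }
    Some(batches)
}
```

For the three files above, with B.sol and C.sol each depending on A.sol, this yields one batch containing A.sol followed by one batch containing B.sol and C.sol.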
> File B.sol depends on A.sol. The semantic analysis can only happen for B after that contract A is fully resolved, even though they might generate different binaries.
This is not how Solang works and it could never work that way.
Each file on the command line gets a new Namespace. When a file is imported, we call sema (recursively) with the existing namespace and then walk the parse tree of the imported file. So, the parse tree for the same file can be used concurrently in different threads.
You are suggesting that when B.sol imports A.sol, then it uses the Namespace of A.sol rather than the parse tree. That would be wrong and will lead to incorrect compilation. Each import needs to go through sema for its own Namespace.
There are global things like user-defined types which could have different definitions in different files. When you then import another file, that imported file needs to use the correct global definitions.
So, sema and the following stages can run in parallel. Since we're using an LALR grammar, the parser stage should be pretty fast, so I suspect parallelizing it will make little difference.
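A toy model of this description, under loud assumptions: `ParseTree`, `Namespace`, `sema`, and `compile_all` below are hypothetical stand-ins, not Solang's real types or functions, and the explicit `imports` list is assumed for the example. Each command-line file gets its own namespace on its own thread, and imported files are resolved recursively into that namespace from shared, read-only parse trees.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical parse tree: imported files and contracts defined here.
pub struct ParseTree {
    pub imports: Vec<&'static str>,
    pub contracts: Vec<&'static str>,
}

/// Hypothetical per-file namespace; the real one holds far more state.
#[derive(Default, Debug)]
pub struct Namespace {
    pub contracts: Vec<String>,
    resolved_files: Vec<String>,
}

/// Toy "sema": resolves `file` and, recursively, everything it imports
/// into the namespace that was passed in.
pub fn sema(file: &str, trees: &HashMap<&str, ParseTree>, ns: &mut Namespace) {
    if ns.resolved_files.iter().any(|f| f == file) {
        return; // already resolved within this namespace
    }
    ns.resolved_files.push(file.to_string());
    let Some(tree) = trees.get(file) else { return };
    for import in &tree.imports {
        sema(import, trees, ns);
    }
    ns.contracts.extend(tree.contracts.iter().map(|c| c.to_string()));
}

/// One thread and one fresh namespace per command-line file. The parse
/// trees are only read, so all threads can share them.
pub fn compile_all(files: &[&str], trees: &HashMap<&str, ParseTree>) -> Vec<Namespace> {
    thread::scope(|s| {
        let handles: Vec<_> = files
            .iter()
            .map(|file| {
                s.spawn(move || {
                    let mut ns = Namespace::default();
                    sema(file, trees, &mut ns);
                    ns
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

In this model, contract A is resolved twice when both A.sol and B.sol are given on the command line: once in A.sol's namespace and once in B.sol's.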
I apologize. I wasn't aware that Solang worked that way. By building both A.sol and B.sol, the compiler is doing repeated work resolving contract A, isn't it? Shouldn't we resolve A only once?