feat: add CASM serialization of Cairo programs
This PR adds a new command, sierra-compile-json, which outputs a JSON file containing everything needed to run a Cairo v2.6.4 program in any environment (i.e. outside of Rust).
Rationale: There is currently no straightforward way to execute a Cairo program outside of a Rust project. Other implementations of the Cairo VM need a way to run Cairo programs the same way they can run StarkNet contracts.
I believe that such a feature belongs in the Cairo compiler rather than in an external project (e.g. Universal-Sierra-Compiler).
The output follows the same structure as the *.compiledContractClass.json output of the command starknet-sierra-compile.
The fields of the *.casm.json output are:
- prime
- compiler_version
- bytecode
- hints
- pythonic_hints
- entrypoint
- builtins
The last two fields are exactly the offset and builtins fields of the elements of entry_points_by_type (everything except the selector).
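To make the field list above concrete, here is a minimal sketch of how a consumer might model the proposed *.casm.json in Rust. The field names come from the list above; the Rust types, the `CasmProgram` name, the `validate` helper and the `toy_program` values are all assumptions made for illustration, not the types used in this PR.

```rust
/// Hypothetical in-memory shape of a `*.casm.json` file.
/// Field names follow the PR description; the Rust types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct CasmProgram {
    /// Field prime, kept as a hex string in this sketch.
    pub prime: String,
    pub compiler_version: String,
    /// Bytecode felts, kept as hex strings in this sketch.
    pub bytecode: Vec<String>,
    /// (bytecode offset, hints at that offset); hints are opaque strings here.
    pub hints: Vec<(usize, Vec<String>)>,
    pub pythonic_hints: Vec<(usize, Vec<String>)>,
    /// Offset of the entry point in `bytecode`.
    pub entrypoint: usize,
    /// Builtins required by the entry point.
    pub builtins: Vec<String>,
}

impl CasmProgram {
    /// Sanity check a consumer might run: the entry point and every hint
    /// offset must fall inside the bytecode.
    pub fn validate(&self) -> Result<(), String> {
        let len = self.bytecode.len();
        if self.entrypoint >= len {
            return Err(format!(
                "entrypoint {} is outside bytecode of length {}",
                self.entrypoint, len
            ));
        }
        for (offset, _) in self.hints.iter().chain(self.pythonic_hints.iter()) {
            if *offset >= len {
                return Err(format!(
                    "hint offset {} is outside bytecode of length {}",
                    offset, len
                ));
            }
        }
        Ok(())
    }
}

/// Toy value for illustration only; not the output of a real compilation.
pub fn toy_program() -> CasmProgram {
    CasmProgram {
        prime: "0x11".to_string(), // placeholder, not the real field prime
        compiler_version: "2.6.4".to_string(),
        bytecode: vec!["0x1".into(), "0x2".into(), "0x3".into()],
        hints: vec![(1, vec!["placeholder hint".into()])],
        pythonic_hints: vec![(1, vec!["placeholder pythonic hint".into()])],
        entrypoint: 0,
        builtins: vec!["range_check".into()],
    }
}
```

A real implementation would presumably derive the serde traits and use the compiler's own felt and hint types instead of strings.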
When using the command sierra-compile-json, a flag gas-usage-check enables the GasBuiltin, as this builtin is not mandatory for Cairo programs(?)
I haven't added the bytecode_segment_lengths as it is only used in StarkNet contracts with multiple entrypoints (to be confirmed, might be a wrong assumption).
Example *.casm.json outputs can be found here
crates/bin/sierra-compile-json/src/main.rs line 22 at r1 (raw file):
/// Add pythonic hints.
#[arg(long, default_value_t = false)]
add_pythonic_hints: bool,
is this needed?
Code quote:
#[arg(long, default_value_t = false)]
add_pythonic_hints: bool,
No, it can be removed (at first I was thinking of making the pythonic hints optional).
crates/cairo-lang-casm/src/ap_change.rs line 12 at r1 (raw file):
mod test;
#[derive(Copy, Clone, Debug, Eq, Hash, PartialEq, Serialize, Deserialize)]
why is this needed?
Code quote:
, Serialize, Deserialize
Before returning the builtins and entrypoint fields, I was serializing the ::main Sierra Function, which required adding all these serde traits.
They are not needed anymore, so I cleaned them up.
I've added two fields, input_args_type and return_type:
- input_args_type: Array containing the types of the (explicit) arguments of the main function, if any.
- return_type: Array containing the return type of the main function, if any.
Otherwise, these fields are empty arrays.
Given the current schema:
Cairo > starknet
- prime
- compiler_version
- bytecode
- hints
- pythonic_hints
- entry_points_by_type
Cairo0
- attributes
- builtins
- compiler_version
- data
- debug_info
- hints
- identifiers
- main_scope
- prime
- reference_manager
The current proposal is:
- prime
- compiler_version
- bytecode
- hints
- pythonic_hints
- entrypoint
- builtins
- input_args_type
- return_type
I'm wondering if we could keep the full list of entry points rather than only the main one, so that one compiled file can have several entrypoints.
- It seems to me that you are trying to create a non-contract CASM format that would be exposed in public compiler API and then used by other project (mainly non-rust VMs), is that right? If so then I'm all for it, but:
Yes, the goal is to have a standardized format that would be used by the different VMs. It can also benefit the Rust VM, as it currently takes a .cairo or .sierra file and performs these compilation steps before running the program.
- did you consult developers of other projects about it? I think we don't want to end up with a format that is specific to your project and resides in the compiler repo
Totally agree that the format should be project-agnostic.
I've shared this PR with the other Cairo VM projects, and I've been talking about such artifacts with @TAdev0 and @rodrigo-pino from Nethermind (cairo-vm-go), but not about the actual format yet.
- are you sure the information in the format is sufficient and the format won't need breaking changes?
In its current state I'm not 100% convinced that it is sufficient. Open questions include the entrypoints, what data to return about the explicit input arguments and the return type (only their type, or also their size?), and whether the pythonic hints should be optional (defaulting to false).
I've opened this PR to start the discussion on a 'non-contract CASM' format standard, providing a basis to iterate over.
- This PR seems to contain some weirdly specific logic e.g. extracting "::main" function entrypoint (why this one? What about other functions?). Can you explain a rationale behind this?
At first I thought that the sole entrypoint of a Cairo program would be its main function, but this is inaccurate.
So every function should be included, grouped in a similar way to entry_points_by_type in starknet-sierra-compile.
We could have a similar field, such as entry_points:
{
  "hints": [],
  "entry_points": {
    "main": {
      "builtins": [],
      "offset": 0,
      "input_args_type": [],
      "return_type": []
    },
    "foo": {},
    "bar": {}
  }
}
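As a rough illustration of how a VM might consume the entry_points proposal above, the sketch below models it as a map from function name to entry point and looks one up by name. The struct names, the use of a `BTreeMap`, the string representation of types and the toy values are assumptions for illustration; only the JSON field names come from the proposal.

```rust
use std::collections::BTreeMap;

/// One entry of the proposed `entry_points` object (types are assumptions).
#[derive(Debug, Clone, PartialEq, Default)]
pub struct EntryPoint {
    pub builtins: Vec<String>,
    pub offset: usize,
    pub input_args_type: Vec<String>,
    pub return_type: Vec<String>,
}

/// Hypothetical container for the `entry_points` field, keyed by function name.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct EntryPoints {
    pub entry_points: BTreeMap<String, EntryPoint>,
}

impl EntryPoints {
    /// Looks up an entry point by function name.
    pub fn entry_point(&self, name: &str) -> Option<&EntryPoint> {
        self.entry_points.get(name)
    }

    /// Lists the available entry point names in sorted order.
    pub fn names(&self) -> Vec<&str> {
        self.entry_points.keys().map(String::as_str).collect()
    }
}

/// Toy data for illustration only; offsets and types are made up.
pub fn toy_entry_points() -> EntryPoints {
    let mut entry_points = BTreeMap::new();
    entry_points.insert("main".to_string(), EntryPoint::default());
    entry_points.insert(
        "foo".to_string(),
        EntryPoint {
            builtins: vec!["range_check".to_string()],
            offset: 10,
            input_args_type: vec!["felt252".to_string()],
            return_type: vec!["felt252".to_string()],
        },
    );
    EntryPoints { entry_points }
}
```

Keying by name keeps the lookup simple, but it is one possible design; a selector-style key, as in entry_points_by_type, would be an alternative.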
crates/cairo-lang-sierra/src/program.rs line 154 at r3 (raw file):
.iter()
.find(|f| {
    if let Some(name) = &f.id.debug_name {
Can we avoid using the debug_name here?
Code quote:
debug_name
crates/cairo-lang-sierra-to-casm/src/compiler.rs line 181 at r3 (raw file):
for type_id in main_func.signature.param_types.iter() {
    let debug_name = match type_id.debug_name.clone() {
This is supposed to be used for debugging; it is better not to rely on it.
Code quote:
debug_name
Definitely, I'll propose something similar to what is done in casm_contract_class.rs to extract the builtins.
I was also thinking about having a similar selector field that would encapsulate the signature params, but I still need to give it more thought.
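The idea of resolving builtins without debug_name can be sketched as follows: look up each parameter's type id in the program's type declarations and match on the generic type id. Everything here is a simplified stand-in written for this example (`TypeId`, `TypeDeclaration`, this `TypeResolver`, `builtin_name`, `leading_builtins`); it is not the code from casm_contract_class.rs. The sketch also assumes that builtins appear as the leading parameters of a function, and the name mapping is an illustrative subset.

```rust
use std::collections::HashMap;

/// Stand-in for a Sierra concrete type id: numeric id plus optional debug name.
#[derive(Debug, Clone)]
pub struct TypeId {
    pub id: u64,
    pub debug_name: Option<String>,
}

/// Stand-in for a type declaration: maps a concrete type id to its generic id.
pub struct TypeDeclaration {
    pub id: TypeId,
    pub generic_id: String,
}

/// Resolves type ids by numeric id only, never by `debug_name`.
pub struct TypeResolver {
    generic_ids: HashMap<u64, String>,
}

impl TypeResolver {
    pub fn new(declarations: &[TypeDeclaration]) -> Self {
        let generic_ids = declarations
            .iter()
            .map(|decl| (decl.id.id, decl.generic_id.clone()))
            .collect();
        Self { generic_ids }
    }

    pub fn generic_id(&self, ty: &TypeId) -> Option<&str> {
        self.generic_ids.get(&ty.id).map(String::as_str)
    }
}

/// Illustrative subset of a generic-type-id to builtin-name mapping.
pub fn builtin_name(generic_id: &str) -> Option<&'static str> {
    match generic_id {
        "RangeCheck" => Some("range_check"),
        "Pedersen" => Some("pedersen"),
        "Bitwise" => Some("bitwise"),
        "EcOp" => Some("ec_op"),
        "Poseidon" => Some("poseidon"),
        _ => None,
    }
}

/// Collects builtins from the leading parameters, stopping at the first
/// parameter whose type is not a known builtin (an assumption of this sketch).
pub fn leading_builtins(resolver: &TypeResolver, param_types: &[TypeId]) -> Vec<&'static str> {
    param_types
        .iter()
        .map_while(|ty| resolver.generic_id(ty).and_then(builtin_name))
        .collect()
}
```

Because the lookup goes through the numeric id, the result is the same whether or not the program was compiled with debug names.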
@zmalatrax please use Reviewable for responding to comments :)
crates/cairo-lang-starknet-classes/src/casm_contract_class.rs line 204 at r1 (raw file):
/// Context for resolving types.
pub struct TypeResolver<'a> {
Moving this to a new file can be a separate PR, right?
Code quote:
pub struct TypeResolver<'a> {
Hi @Arcticae, I work on the Cairo VM in Go being developed by Nethermind. From our side, we were hoping this PR would get merged so that we could use the serialized output for Cairo 1 programs as input for our VM. Currently, we are not using the output provided by USC because it lacks information required for execution, such as the builtins and the entry_points. For us, adding this functionality here makes sense from a usability perspective, especially since the data is already generated as part of the existing pipeline and this makes it possible to obtain all necessary components from a single source.