cs6120 icon indicating copy to clipboard operation
cs6120 copied to clipboard

Project Proposal: SCIF Compiler Improvement

Open calciiium opened this issue 6 months ago • 4 comments
trafficstars

What will you do? We will write an improved version of SCIF compiler that supports multiple contract compilation, provides user-friendly error messages, refactors and documents the inherited codebase, besides ensuring correctness on both execution and information flow labels.

How will you do it?

  • Supports multi-contract compilation, enabling inter-contract calls and interface definitions under a single file.
  • Improve the developer experience with more user-friendly error messages for information flow constraint violations, replacing missing/incomplete/misleading diagnostics with clear, actionable error messages. Work to reduce instances where the compiler crashes.
  • Raise the quality of the compiler to make it open source, building it for the public release. Create a compiled artifact that can be run without gradle and other dependencies. Refactor and document the existing codebase to improve maintainability and extensibility, by removing dead/commented code, enforcing module boundaries, and codifying explicit specifications on the fly.

How will you empirically measure success?

  • We will run test programs containing multiple contracts in one file and ensure its correctness.
  • We will show a comparison between old and new error messages and entail the effectiveness and usefulness of new diagnostics when error shows up.
  • We will use the number of IntelliJ warnings and number of specifications (currently < 10) to measure the maintainability and extensibility of the codebase.

Team members: @noschiff @KabirSamsi

calciiium avatar May 06 '25 03:05 calciiium

Overall, this seems like a relatively compact set of engineering tasks that you can feasibly get done in one week. It will not be as substantial a full-length project that started in March, so the outcomes are pretty limited, but that's life!

About this part:

Improve the developer experience with more user-friendly error messages for […]

Can you please give an example or two of the current error messages and the new error messages that you will emit? I would like to know the scope of the proposed changes here to get an idea of what actual technical work this will require.

Similarly:

We will run test programs containing multiple contracts in one file and ensure its correctness.

Can you elaborate on where you will get these programs? Will you need to write them yourself? If so, how many will you write, and how complex will they be?

sampsyo avatar May 06 '25 12:05 sampsyo

Consider the following SCIF code.

import "B.scif";

contract A {
    B{any} addr;
    constructor() { super(); }

    @public void foo() {
        addr.bar();
    }
}
interface B {
    @public void bar();
}

The output from SHErrLoc gives:

/Users/noah/Cornell/SCIF/compiler/./.scif/foo0.ifc | <---- Static type error. Code locations most likely to be wrong:

  1. main.scif, line 8, column 14: Target contract must be trusted to call this method. addr.bar(); ^

  2. main.scif, line 4, column 12: Variable addr may be labeled incorrectly. B{any} addr; ^

The constraint comes from this unfortunately poorly documented, code:

        if (externalCall) {
            env.cons.add(
                    new Constraint(
                            new Inequality(ifContRtn, ifFuncCallPcBefore.toSHErrLocFmt(dependentLabelMapping)),
                    env.hypothesis(), location, env.curContractSym().getName(),
                    "Target contract must be trusted to call this method"));
        }

The first diagnostic message is really confusing on which information flow constraint it represents. Error messages should somewhat correspond to the static typing rules in the SCIF core calculus, give insight into why it's a problem, and ideally specifically point out what it's referring to, like which contract and which method.

This is a bit of an extreme example because I'm still trying to fully understand this constraint, but that's more in the research scope than specifically compiler hacking. For this project, I imagine changing this error message to be more specific to align with the SHErrLoc constraints and what these variables actually represent.

noschiff avatar May 07 '25 02:05 noschiff

Thank you for the feedbacks!

We will run test programs containing multiple contracts in one file and ensure its correctness.

Can you elaborate on where you will get these programs? Will you need to write them yourself? If so, how many will you write, and how complex will they be?

We have ready-to-use test programs ranging from basic contracts to actual, useful contracts (~300 LOC). These can serve as test cases to guarantee program correctness while changing the code. For multi-contract compilations testing, we will write or modify > 10 test programs, which includes: basic two-contract under single file, inherited contracts under single file, and real-world Uniswap contract setup using ERC20 tokens. We will also generate negative test programs for every error message improvement we work on.

calciiium avatar May 07 '25 05:05 calciiium

OK, it's a plan.

sampsyo avatar May 07 '25 21:05 sampsyo