
Figure out design for Package

Open sunjay opened this issue 4 years ago • 1 comment

Notes:

  • DefId contains both PkgId and DefIndex
    • This is good because each package can have its own set of IDs
    • Trouble is figuring out how to define the values at those IDs
  • Need to figure out how the type checking IR will look
  • Need to figure out how translation into machine code will look
    • This will help us figure out how to store codegen data in Package
  • Need some way to structure Package so you can lookup by name and by DefId
    • Package needs to store mangled names, generated constructor functions, etc.
    • Problem: if Package stores a different type in DefStoreSync than nir, how do we create a Package at the end of compilation?
    • Package does NOT need to store names of variables, but nir does
  • The Packages struct needs to support looking up DefId
  • Noticed that since resolve::Scope has separate tables for variables, functions, and types, the DefData stored in those tables will always be of the same kind
    • That is, the variables table will only contain entries that map to DefData::Variable
    • At face value, that is fine, because the backing DefStore contains mixed values of DefData
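
A minimal sketch of the ID scheme and the two lookups described in these notes, under stated assumptions. Only `DefId`, `PkgId`, `DefIndex`, `Package`, and `Packages` are named in the notes; the `DefInfo` struct, every field, and every method below are hypothetical. It also uses a single name table, so it ignores that a name may be both a type and a function.

```rust
use std::collections::HashMap;

/// Identifies a package (assumed to be an index into `Packages`).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PkgId(pub usize);

/// Index of a definition within one package (assumed dense, starting at 0).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct DefIndex(pub usize);

/// A globally unique ID: the owning package plus a package-local index.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct DefId {
    pub pkg: PkgId,
    pub index: DefIndex,
}

/// Hypothetical per-definition data a package might keep for codegen.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DefInfo {
    pub name: String,
    pub mangled_name: String,
}

pub struct Package {
    id: PkgId,
    /// Indexed by `DefIndex`, so lookup by `DefId` is a slice access.
    defs: Vec<DefInfo>,
    /// Secondary index from name to `DefIndex`.
    by_name: HashMap<String, DefIndex>,
}

impl Package {
    pub fn new(id: PkgId) -> Self {
        Package { id, defs: Vec::new(), by_name: HashMap::new() }
    }

    /// Stores a definition and returns the ID minted for it.
    pub fn insert(&mut self, name: &str, mangled_name: &str) -> DefId {
        let index = DefIndex(self.defs.len());
        self.defs.push(DefInfo {
            name: name.to_string(),
            mangled_name: mangled_name.to_string(),
        });
        self.by_name.insert(name.to_string(), index);
        DefId { pkg: self.id, index }
    }

    /// Lookup by ID; IDs that belong to another package are rejected.
    pub fn lookup_id(&self, id: DefId) -> Option<&DefInfo> {
        if id.pkg != self.id {
            return None;
        }
        self.defs.get(id.index.0)
    }

    /// Lookup by name.
    pub fn lookup_name(&self, name: &str) -> Option<DefId> {
        self.by_name.get(name).map(|&index| DefId { pkg: self.id, index })
    }
}

/// All loaded packages; `PkgId` is an index into this list.
pub struct Packages {
    packages: Vec<Package>,
}

impl Packages {
    pub fn new() -> Self {
        Packages { packages: Vec::new() }
    }

    /// Creates an empty package with the next free `PkgId`.
    pub fn add_package(&mut self) -> &mut Package {
        let id = PkgId(self.packages.len());
        self.packages.push(Package::new(id));
        self.packages.last_mut().expect("a package was just pushed")
    }

    pub fn package(&self, id: PkgId) -> Option<&Package> {
        self.packages.get(id.0)
    }

    /// Resolves a `DefId` by first selecting the package, then the index.
    pub fn lookup(&self, id: DefId) -> Option<&DefInfo> {
        self.package(id.pkg)?.lookup_id(id)
    }
}
```

Because every `DefId` carries its `PkgId`, two packages can both hand out index 0 without colliding, which is the property the first bullet relies on.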

sunjay, Mar 29 '20 20:03

After a lot of thought and several attempts at this, I have concluded that the problem is not the design of Package itself, but the design of the overall scope data structure and the design of NIR.

Notes

struct Scope {
    id: ScId,
    // Root scope/module has id == parent
    parent: ScId,
    /// The kind of scope this is (used during traversal)
    kind: ScopeKind,
    /// The "level" of scope in the scope tree
    ///
    /// A name is shadowed if any parent scope at the same level has that name.
    ///
    /// Invariant: in a chain of scopes at the same level, only the top most scope may introduce imports, types or functions. Lower scopes may only introduce variables. This is necessary to represent the idea that declarations are implicitly hoisted up to the top of a given level of scope. 
    level: usize,
    /// A list of scopes from which any name may be used in this scope or any child scope
    wildcard_imports: Vec<ScId>,
    /// A submodule is simply a subscope
    modules: HashMap<Arc<str>, ScId>,
    /// Names visible in a type context
    types: HashMap<Arc<str>, DefId>,
    /// Names visible in a calling context and an expression context
    functions: HashMap<Arc<str>, DefId>,
    /// Names visible in an expression context
    variables: HashMap<Arc<str>, DefId>,
}

struct ScopeTree {
    /// Each ScId corresponds to an index into this list of scopes
    scopes: Vec<Scope>,
}

struct Package {
    id: PkgId,
    root: ScopeTree,
}
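
A rough sketch of how name lookup could walk such a tree, assuming the root convention from the comment above (`id == parent`). It is a simplification: `kind`, `level`, `wildcard_imports`, and `modules` are left out, `ScId` and `DefId` are reduced to plain index newtypes, and the `Namespace` enum and the `push`, `scope_mut`, and `lookup` helpers are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ScId(pub usize);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct DefId(pub usize);

/// The context a name is being resolved in (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Namespace {
    Type,
    Function,
    Variable,
}

pub struct Scope {
    pub id: ScId,
    /// The root scope has `id == parent`.
    pub parent: ScId,
    pub types: HashMap<Arc<str>, DefId>,
    pub functions: HashMap<Arc<str>, DefId>,
    pub variables: HashMap<Arc<str>, DefId>,
}

#[derive(Default)]
pub struct ScopeTree {
    /// Each `ScId` is an index into this list.
    scopes: Vec<Scope>,
}

impl ScopeTree {
    /// Adds an empty scope; `None` creates a root whose parent is itself.
    pub fn push(&mut self, parent: Option<ScId>) -> ScId {
        let id = ScId(self.scopes.len());
        self.scopes.push(Scope {
            id,
            parent: parent.unwrap_or(id),
            types: HashMap::new(),
            functions: HashMap::new(),
            variables: HashMap::new(),
        });
        id
    }

    /// Mutable access for filling in tables; panics on an unknown `ScId`.
    pub fn scope_mut(&mut self, id: ScId) -> &mut Scope {
        &mut self.scopes[id.0]
    }

    /// Searches `start` and then each ancestor, stopping at the root.
    pub fn lookup(&self, start: ScId, ns: Namespace, name: &str) -> Option<DefId> {
        let mut current = start;
        loop {
            let scope = self.scopes.get(current.0)?;
            let table = match ns {
                Namespace::Type => &scope.types,
                Namespace::Function => &scope.functions,
                Namespace::Variable => &scope.variables,
            };
            if let Some(&def) = table.get(name) {
                return Some(def);
            }
            if scope.parent == scope.id {
                return None; // reached the root without finding the name
            }
            current = scope.parent;
        }
    }
}
```

The innermost scope wins, so an inner `x` shadows an outer one, while the same spelling in another namespace is resolved independently.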

NIR needs to be updated to return something closer to HIR. Names are replaced with DefIds, but everything else is still preserved. All scopes should be labeled with their ScId.

Test: should be able to implement unused variable/struct lint on the resulting data structure.
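
To make the test above concrete, here is a sketch of an unused-variable lint over a heavily reduced, hypothetical IR in which names have already been replaced with `DefId`s. The `Expr` and `Stmt` types and both functions are assumptions for illustration, not the actual NIR, and the struct half of the lint is not covered.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct DefId(pub usize);

/// Hypothetical expression form: variables are referenced by ID, not by name.
pub enum Expr {
    Int(i64),
    Var(DefId),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical statement form.
pub enum Stmt {
    /// `let <def> = <value>;`
    Let { def: DefId, value: Expr },
    /// An expression statement.
    Expr(Expr),
}

/// Records every variable ID referenced inside `expr`.
fn collect_uses(expr: &Expr, used: &mut HashSet<DefId>) {
    match expr {
        Expr::Int(_) => {}
        Expr::Var(id) => {
            used.insert(*id);
        }
        Expr::Add(lhs, rhs) => {
            collect_uses(lhs, used);
            collect_uses(rhs, used);
        }
    }
}

/// Returns the declared variables that are never referenced, in declaration order.
pub fn unused_variables(body: &[Stmt]) -> Vec<DefId> {
    let mut declared = Vec::new();
    let mut used = HashSet::new();
    for stmt in body {
        match stmt {
            Stmt::Let { def, value } => {
                declared.push(*def);
                collect_uses(value, &mut used);
            }
            Stmt::Expr(expr) => collect_uses(expr, &mut used),
        }
    }
    declared.into_iter().filter(|def| !used.contains(def)).collect()
}
```

Since two variables that share a spelling get distinct `DefId`s, the lint needs no scope information to tell a shadowed `x` from the `x` that shadows it.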

The package format is simply a tree of scopes with only modules, types, and functions.

Question: should modules have DefIds?

Question: what does it mean to have mod in an inner scope (like inside a function)? Does Rust allow this?

Question: should we model that only certain scopes will contain certain kinds of definitions?

enum ScopeKind {
    Module { wildcard_imports, types, functions },
    /// Any type names (i.e. from generics) and the `Self` type introduced by the impl
    Impl { types, self_ty: DefId },
    /// Any type names (i.e. from generics) or variables (i.e. from parameters) introduced by the function signature
    Function { types, variables },
    /// The start of a block
    ///
    /// All declarations are hoisted to the top of a block. That makes them available for everything else in the block.
    Block { wildcard_imports, types, functions, variables },
    /// The continuation of a block after a subscope
    ///
    /// The parent of a scope with this kind may have either the same kind (`BlockVars`) or `Block`.
    BlockVars { variables },
}

This would remove the need for the level field. The code to search the scopes would be a little more complicated but perhaps in a way that increases expressiveness/readability?

Question: How does lookup work?
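
One possible answer, sketched under assumptions: give every untyped field in the enum the same table type that `Scope` uses, expose one accessor per kind of name, and walk the parent chain asking each scope only for the table it can actually have. The `Names` alias, the accessors, the `Scope` wrapper, and the `lookup_*` methods are all hypothetical, and wildcard imports are not consulted.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ScId(pub usize);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct DefId(pub usize);

/// Assumed table type for every field left untyped in the enum above.
pub type Names = HashMap<Arc<str>, DefId>;

pub enum ScopeKind {
    Module { wildcard_imports: Vec<ScId>, types: Names, functions: Names },
    Impl { types: Names, self_ty: DefId },
    Function { types: Names, variables: Names },
    Block { wildcard_imports: Vec<ScId>, types: Names, functions: Names, variables: Names },
    BlockVars { variables: Names },
}

impl ScopeKind {
    /// The variable table, if this kind of scope can introduce variables.
    pub fn variables(&self) -> Option<&Names> {
        match self {
            ScopeKind::Function { variables, .. }
            | ScopeKind::Block { variables, .. }
            | ScopeKind::BlockVars { variables } => Some(variables),
            ScopeKind::Module { .. } | ScopeKind::Impl { .. } => None,
        }
    }

    /// The function table, if this kind of scope can introduce functions.
    pub fn functions(&self) -> Option<&Names> {
        match self {
            ScopeKind::Module { functions, .. } | ScopeKind::Block { functions, .. } => {
                Some(functions)
            }
            _ => None,
        }
    }
}

pub struct Scope {
    pub id: ScId,
    /// The root scope has `id == parent`.
    pub parent: ScId,
    pub kind: ScopeKind,
}

#[derive(Default)]
pub struct ScopeTree {
    scopes: Vec<Scope>,
}

impl ScopeTree {
    /// Adds a scope; `None` creates a root whose parent is itself.
    pub fn push(&mut self, parent: Option<ScId>, kind: ScopeKind) -> ScId {
        let id = ScId(self.scopes.len());
        self.scopes.push(Scope { id, parent: parent.unwrap_or(id), kind });
        id
    }

    /// Walks from `start` to the root, checking the table chosen by `table`.
    fn lookup_in(
        &self,
        start: ScId,
        table: fn(&ScopeKind) -> Option<&Names>,
        name: &str,
    ) -> Option<DefId> {
        let mut current = start;
        loop {
            let scope = self.scopes.get(current.0)?;
            if let Some(&def) = table(&scope.kind).and_then(|names| names.get(name)) {
                return Some(def);
            }
            if scope.parent == scope.id {
                return None;
            }
            current = scope.parent;
        }
    }

    pub fn lookup_variable(&self, start: ScId, name: &str) -> Option<DefId> {
        self.lookup_in(start, ScopeKind::variables, name)
    }

    pub fn lookup_function(&self, start: ScId, name: &str) -> Option<DefId> {
        self.lookup_in(start, ScopeKind::functions, name)
    }
}
```

In this shape the `level` field is not needed: a `BlockVars` scope simply has no function table, so a function lookup falls through to the enclosing `Block` or `Module`, while a variable lookup stops at the nearest scope that declares the name.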


  • [ ] define primitives
    • [ ] define primitive methods (fn signatures in scope, self/no self)
    • [ ] might be that primitives and std need to be stored in a higher level format that defines modules in terms of DefIds (in that case, the current package module would mostly move to nir)
    let unit = ...; // unit def id
    let int = ...; // int def id
    
    Package {
        name: "".into(), // primitives package is unnamed and implicitly available in all scopes
        id: pkg_id,
        types: hash_map! {
            "()".into() => Type {
                def_id: unit,
                mangled_name: "DUnit".into(),
                methods: Vec::new(),
            },
            "int".into() => Type {
                def_id: int,
                mangled_name: "DInt".into(),
                methods: vec![
                    Function {
                        name: "add".into(),
                        mangled_name: "int__add".into(),
                        self_param: true,
                        params: vec![
                            FuncParam {
                                name: "other".into(),
                                ty: int, // `type` is a reserved keyword in Rust
                            },
                        ],
                    },
                ],
            },
            ...
        },
    }
    
    • [ ] this format can then be transformed into a scope tree as needed by name resolution and also transformed into the info needed by type checking and code generation
    • [x] remove src/nir/ty.rs?
    • [ ] package format must include use decls because they are relevant to imports
    • [ ] pretty sure at this point that package format will resemble the def table
  • [ ] start to define std
  • [ ] update nir
  • [ ] update resolve
    • [ ] implement path lookup
    • [ ] importing a name that is both a type and a function will result in both the type and function being imported
  • [ ] can DefKind be completely removed? This would make DefStore just a counter that produces DefIds
  • [ ] Add print_$prim functions to std::$prim module and then re-export those modules from prelude
  • [ ] Add prelude import to the top of every module
  • [ ] #[link(name="Dbool")] extern bool { syntax for defining extern type name in define_prim!
  • [ ] #[link(name="bool__eq")] fn eq(self, other: bool) -> bool; syntax for defining extern method name in define_prim!
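
Two of the items above can be sketched together, under assumptions: a `DefStore` that is nothing more than a counter once `DefKind` is gone, and an import that brings a name in from every namespace where the source module defines it. The `Module` struct and the `fresh` and `import` methods are hypothetical, and `DefId` is reduced to a plain index.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct DefId(pub usize);

/// With no per-definition data to store, the store only hands out fresh IDs.
#[derive(Debug, Default)]
pub struct DefStore {
    next: usize,
}

impl DefStore {
    /// Returns an ID that has not been returned before by this store.
    pub fn fresh(&mut self) -> DefId {
        let id = DefId(self.next);
        self.next += 1;
        id
    }
}

/// Hypothetical flattened module: one table per namespace.
#[derive(Debug, Default)]
pub struct Module {
    pub types: HashMap<String, DefId>,
    pub functions: HashMap<String, DefId>,
}

impl Module {
    /// Imports `name` from `source` into every namespace where `source` defines it.
    /// Returns how many namespaces matched; 0 means the import is unresolved.
    pub fn import(&mut self, source: &Module, name: &str) -> usize {
        let mut found = 0;
        if let Some(&id) = source.types.get(name) {
            self.types.insert(name.to_string(), id);
            found += 1;
        }
        if let Some(&id) = source.functions.get(name) {
            self.functions.insert(name.to_string(), id);
            found += 1;
        }
        found
    }
}
```

Keeping the type and function tables separate is what lets a single import statement resolve to two definitions without ambiguity.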

sunjay, Apr 11 '20 03:04