marker Describe the difference of this project with `dylint`

I'd suggest leaving a word about comparing this project with dylint. I was considering dylint, tried it out and found it too complex.

This approach:

#![feature(rustc_private)]

extern crate rustc_ast;
extern crate rustc_span;

feels really odd. Gigen that these things are not public, and they don't have friendly documentation, even in-code documentation, and they have not enough similarity with syn so that a person who worked with proc macros could not easily integrate with it.

I would prefer if the ecosystem had a simple linting library (not even a CLI with dlopen-ing stuff), where lints could be statically linked, the API they use have well-documented public interface that is not dependent on rustc internals (at least on the public surface).

Basically a linting library like this with inversion of control:

struct MyLint1;

impl my_imaginary_awesome_linting_library::Lint for MyLint1 {
    // ...
}

fn main() {
    let lints: [Box<dyn my_imaginary_awesome_linting_library::Lint>; _] = [
        Box::new(MyLint1)
    ];

    my_imaginary_awesome_linting_library::builder()
        .lints(lints)
        .run_cli();
}

I am almost sure you guys have considered this approach, or maybe you think dynamic linking and the approaches you have are much better. But, for now, I'd like to understand how this project solves the pain points described above of dylint or if it solves them at all. If not, then what would be the difference with dylint here, and why not merge the efforts with dylint if these projects follow the same architecture?

Best would be to document this in a README.md

Jul 15 '23 15:07 Veetaha

Hey :wave:, I've just updated the main readme. It doesn't contain a comparison with dylint, but it specifies the features and goals of Marker. I think the main difference is what the projects are trying to archive and what requirements they have.

First, as a baseline, it's worth mentioning, that both (dylint and marker) use rustc as a driver (The part that does lexing, parsing, type checking and lint emission etc). Rustc's official stance is that the internals are unstable and there is no plan to stabilize them.

My understanding is that dylint's goal is to allow dynamic crates to link to rustc internals. This can then be used for linting. This linking and loading process should be simple.

Marker tries to build a stable interface designed for linting or other kinds of static analysis. The API should also be driver-independent, to make it future-proof (and cause headaches during development ^^). And the API should be simple to use, meaning that everything public should be well documented and types should be well-designed. This last definition is hard to measure, it basically means that lint crate development should feel natural, types should make it clear what everything is and everything should be easy to find.

Dylint has the advantage that it binds directly to rustc and only passes information along. Development wise, its ready for action. It also has the advantage that you can access everything within rustc, while crates using Marker will be limited by Marker's interface. Marker is also still in the early stages of development, things like async and await expressions are currently not supported. It's also not yet possible to check if a given type implements a trait.

If you want to create a lint right now and use it in production, I would suggest using dylint and wait for Marker. From my perspective, it would be really cool if you could run marker on your project and document any problems that come up. My plan it to release version v0.1.0 soon, to collect feedback. The API is still lacking several parts and adjustments. So, it's a long shot from being done or stable. But any kind of feedback helps. I can ping you, once the first test version has been released if you like.

I'm also working on a lint crate template, that I can link to. It should be available tomorrow. :)

I would prefer if the ecosystem had a simple linting library

In short, this is precisely what I'm aiming for with Marker :)

(not even a CLI with dlopen-ing stuff), where lints could be statically linked. [...]

I am almost sure you guys have considered this approach, or maybe you think dynamic linking and the approaches you have are much better.

Dynamic linking has its problems, like requiring everything to be #[repr(C)]. Currently, it's just the best I could come up with. WebAssembly sadly doesn't work too well, since ASTs heavily rely on pointers, which are not transferable. Generating a crate that statically links to all lint crates was a thing I considered. However, I wrote it off, as the code generation would probably be a pain and it would mean that any change in your lint crate selection would trigger a total recompilation of the driver etc.

This is not to say it can't be done. The current API should allow us to later switch to static linting if we wanted to. This might be especially interesting for IDEs, as speed matters more there.

Would you prefer a static library approach? And could you maybe explain why? :)

Jul 15 '23 16:07 xFrednet

Thanks for the reply 👍

I subscribed to releases in this repo, so if you plan to publish 0.1, I'll see it in my notifications.

Regarding the static linking approach I think it's just a simpler way for people to use that. This is basically reminiscent of the approach that rust's test and benchmark frameworks use. The proc macro #[test] or #[bench] that collects the code that you want to run, and then a test/benchmarking harness they you need to manually invoke in your main() that provides you with the CLI over the "tests/benches" you wrote.

I think it would be cool to have a similar API like this that would make writing lints almost as writing tests but with the caveat that tests test your runtime behavior, while lints test your compile-time behavior:


#[lint(/* some configs here maybe */)]
fn code_is_tidy_from_todos(ctx: LintContext) {
    let todos = ctx
         .file
         .comments()
         .filter_map(AstNode::into_comment)
         .filter(|comment| lazy_regex::is_match!("// (TODO|TOOD):",  comment.text));
         
    for todo in todos {
         ctx.error(
             todo.span,
             "All todo's must be resolved before committing code to master! \
              Replace them with FIXME if you don't want to resolve them before the merge"
         );
    }
}

Then with the usage of the inventory crate the main function could run all lints in the linked crates (basically here is our linting harness):

fn main() {
    // inventory crate will magically collect all `#[lint]`s that were defined in this compilation
    marker::builder().run_cli()
}

Maybe instead of run_cli() the builder could have ways to invoke lints directly making it possible to integrate marker basically everywhere as a library. For example, to invoke linting as part of #[test] for a proc-macro crate that generates code, and wants its generated code to pass linting as well.

These are all just my high-level ideas. Unfortunatelly I don't have enough time to implement everything in my head, so I am just dumping it all here such that maybe someone else finds them useful.

Btw. I just tried to install cargo-marker according to the docs and got this problem:

$ cargo install cargo_marker
    Updating crates.io index
error: could not find `cargo_marker` in registry `crates-io` with version `*`

If you want to ping me to test cargo-marker on my private repo, you may ping me elsewhere in Github, I keep track of github notifications on working days daily. My private repo consists of more than a hundred crates in a single cargo workspace with the heavy use of async (it's a cloud-native distributed system), custom proc macros, declarative macros and any other rust feature there is. I can't guarantee I'll reply, but will try to find time for helping with real-world-codebase testing.

Jul 15 '23 17:07 Veetaha

Regarding the static linking approach I think it's just a simpler way for people to use that. This is basically reminiscent of the approach that rust's test and benchmark frameworks use. The proc macro #[test] or #[bench] that collects the code that you want to run, and then a test/benchmarking harness they you need to manually invoke in your main() that provides you with the CLI over the "tests/benches" you wrote.

This idea hasn't been considered yet. For now, Marker will probably stay with dynamically loaded libraries, as it works and completing the API has the higher prio. But this should at least be discussed before v1.0.0. I've added part of your comment and a lint to it to https://github.com/rust-marker/design/issues/26

Thank you for sharing!

Jul 18 '23 13:07 xFrednet