Add `strict` mechanism for opting into stricter subsets of the language

Open Keno opened this issue 1 year ago • 22 comments

We've had a few discussions the past few weeks about a feature tentatively dubbed pragma strict after similar constructs in other languages. However, there wasn't really a cohesive writeup of the intent, so triage asked me to write one up to serve as the basis for discussion and fleshing out. I intend to edit this issue as the idea evolves.

Basic idea

The basic idea of the pragma strict feature is to have an opt-in mechanism for turning Julia programs that are semantically valid, but undesirable for other reasons (e.g. using ambiguous syntax that arguably should have been disallowed, but can't be for backwards compatibility reasons), into errors. This would be an opt-in feature for developers who have personal, organizational or regulatory requirements for stricter coding standards. An additional motivation is to provide an additional vehicle for low-friction language evolution. For example, if a specific opt-in turns out to be popular across the majority of packages, a potential Julia 2.0 that made the opt-in automatic, while technically breaking, would be largely non-breaking in practice.

We are not imagining a single strict mode opt-in here, but rather a finer-grained set of options, plus versioned collections of options for particular use cases. See the last section for an initial list of such options.

It is worth emphasizing again that this feature is only intended to disallow undesirable programs that are otherwise semantically valid. It is not intended to cause meaningful semantic differences in programs that are valid both in standard semantics and under the opt-in restrictions (i.e. turning on the restrictions may cause things to error, but if they don't the program should behave the same).

How does the opt-in work?

One of the primary questions in this proposal is how the user expresses the opt-in. There are a few separate semantic options, each with a number of potential syntax options.

  1. Per module opt-in like our existing Experimental.@compiler_options
  2. Per file opt-in (e.g. using a magic comment on the first line) - popular in some other languages
  3. Per project opt-in in Project.toml

After some discussion on triage, a Project.toml-level opt-in seems like the best option. The primary motivation here is to allow opt-ins that need to be done in the parser (e.g. whitespace requirements). We don't currently define the execution ordering of parsing and execution for packages, so a module-toplevel opt-in may be semantically too late (relatedly, it may be ambiguous what happens when the opt-in is placed in the middle of a disallowed parse). An additional concern is that ideally IDE tooling would be able to understand the active set of restrictions without having to look at the code.

Concrete Project.toml syntax options

One convenient option would be reusing Preferences.jl. One might imagine a julia-level preference like:

name = "MyPackage"

[preferences.julia]
strict = ["nomultiassign", "uniqueidentifiers"]

This doesn't fully mesh with the usual preferences semantics, since preferences are ordinarily uniqued per-UUID while they would be private for a particular package, but this might be ok. Alternatively, we could reserve the strict key in each individual package's preference table:

name = "MyPackage"

[preferences.MyPackage]
strict = ["nomultiassign", "nolocalshadow", "noglobalshadow"]

Alternatively, we could have a new top-level strict section:

name = "MyPackage"
[strict]
julia = ["nomultiassign", "nolocalshadow", "noglobalshadow"]
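Whichever spelling is chosen, tooling could discover the active options from the project file alone. As a rough sketch only, assuming the top-level [strict] table with a julia key from the last example (nothing here is implemented, and the helper name strict_options is made up for illustration), the TOML standard library is enough to read it:

```julia
using TOML

# Hypothetical helper: return the strict options declared in a Project.toml
# string, assuming the proposed (not implemented) `[strict]` table layout.
function strict_options(project_toml::AbstractString)
    project = TOML.parse(project_toml)
    table = get(project, "strict", Dict{String,Any}())
    # A missing table or key means no restrictions are active.
    return get(table, "julia", String[])
end

example = """
name = "MyPackage"

[strict]
julia = ["nomultiassign", "nolocalshadow", "noglobalshadow"]
"""

options = strict_options(example)
```

A project without the table yields an empty list, which matches the opt-in nature of the proposal.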

Initial idea list for opt-in options

In this section, I'm collecting a list of potential options that might be implemented. However, I am not at this point asking people to brainstorm all the possibilities that could be implemented. I'm also not asking for detailed discussion on what should or should not be included in a particular option. Rather, I wanted to have a place to list all the ideas that have already come up and a place to link any issues that could be addressed by this feature. Full design discussions for individual flags can be had on the PRs to implement them once the overall mechanism is in place.

Individual options

  • nomultiassign

Disallows multiple assignments in the same expression without parentheses. I.e. disallows a = b, c = d, e, = f = (1, 2)

  • nolocalshadow

Disallows shadowing of local variables, e.g. in the following:

function foo()
	for i = 1:10
		for i = 1:10 # Error shadowing local `i`
		end
		all(1:10) do i # Error shadowing local `i`
			iszero(i)
		end
	end
end

  • noglobalshadow

Disallows shadowing of global variables, e.g. in the following:

function foo()
	missing = false # Error local `missing` shadows imported global `missing`
end
  • Some variant of unique assignment

Stefan had proposed introducing a unique assignment operator, e.g. :=, for which there would then be a corresponding opt-in to enforce that all assignments use it.

  • Enforce export versioning

If we implement some variant of export versioning, there could be an opt-in forbidding unversioned exports.

  • Enforcing import for types: #25744

  • Requiring explicit undef markers for undefined variables in new.

  • https://github.com/JuliaLang/julia/issues/12069

  • https://github.com/JuliaLang/julia/issues/14952
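To give a feel for what a check like nolocalshadow involves, here is a toy sketch that walks a parsed expression and reports nested for loops reusing an outer loop variable. It is an illustration under simplifying assumptions, not a proposed implementation: it only handles the plain `for x = ...` form, ignores do-blocks, let, and every other scope construct, and the function name shadowed_loop_vars is hypothetical. A real check would presumably live in the parser or lowering.

```julia
# Toy check: collect loop variables of nested `for` loops that reuse the
# name of an enclosing loop variable. Only the simple `for x = ...` form
# is recognized; other scope constructs are ignored.
function shadowed_loop_vars(ex, bound = Symbol[])
    found = Symbol[]
    ex isa Expr || return found
    inner = bound
    if ex.head === :for && ex.args[1] isa Expr &&
       ex.args[1].head === :(=) && ex.args[1].args[1] isa Symbol
        var = ex.args[1].args[1]
        var in bound && push!(found, var)
        inner = vcat(bound, var)   # names visible inside this loop
    end
    for arg in ex.args
        append!(found, shadowed_loop_vars(arg, inner))
    end
    return found
end

code = Meta.parse("""
function foo()
    for i = 1:10
        for i = 1:10
        end
    end
end
""")

violations = shadowed_loop_vars(code)
```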

Collections

The idea of collections is that users in general don't want to decide individually which opt-ins matter to them, but will likely follow a standard set by their organization or prescribed by a style guide. To this end, there could be meta opt-ins like "basestyle", which would activate a standard collection of opt-ins. These collections should be versioned and activated based on the min-compat version of Julia. In this way, new opt-ins can be added to a collection without automatically activating them on a Julia version upgrade.
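As a sketch of how the versioning could work, the snippet below expands a collection name into its options based on a package's minimum compatible Julia version. The version numbers, the membership of "basestyle", and the names COLLECTIONS and expand_collection are all invented for illustration; nothing about the real contents has been decided.

```julia
# Hypothetical collection table: each entry records the Julia version in
# which an option joined the collection. Contents are made up.
const COLLECTIONS = Dict(
    "basestyle" => [
        (v"1.12", ["nomultiassign"]),
        (v"1.13", ["nolocalshadow"]),
    ],
)

# Return the options a collection activates for a given min-compat version.
function expand_collection(name::AbstractString, min_compat::VersionNumber)
    options = String[]
    for (since, added) in COLLECTIONS[name]
        since <= min_compat && append!(options, added)
    end
    return options
end

old_package = expand_collection("basestyle", v"1.12")
new_package = expand_collection("basestyle", v"1.13.2")
```

A package whose compat floor stays at 1.12 keeps the smaller set even when run on a newer Julia, which is the property described above.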

Keno avatar Jun 23 '24 12:06 Keno

I would suggest that one of the individual options should be disallowing control flow in non-"statement" position, i.e. https://github.com/JuliaLang/julia/issues/50415

adienes avatar Jun 23 '24 14:06 adienes

https://github.com/JuliaLang/julia/issues/51223 is my proposal for := reassignment.

jariji avatar Jun 23 '24 20:06 jariji

I like (Stefan's?) idea that whitespace must match operator precedence so you can't write 2 * 3+1, you have to write 2*3+1 or 2*3 + 1 or 2 * 3 + 1.
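A small illustration of why the first spelling is misleading: the spacing suggests one grouping while precedence produces the other.

```julia
# Whitespace suggests 2 * (3+1), but precedence gives (2 * 3) + 1.
misleading = 2 * 3+1
grouped = 2 * (3+1)
```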

jariji avatar Jun 23 '24 21:06 jariji

Another thing we had talked about was having a set of defaults based on a version, which you could add to or subtract from, and which might look like this:

strict = ["1.12 defaults", "no local shadow", "-explicit imports"]

Would there be strictures besides the ones provided by Julia itself? Not clear on why there is a strict section and a "julia" key in your examples. Wouldn't a single strict entry with a list of values suffice?
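A minimal sketch of how such a list might be resolved, assuming entries are processed left to right, "X defaults" expands to a version's default set, and a leading "-" removes an option. The DEFAULTS table and the resolve_strict name are hypothetical, and the contents of the 1.12 defaults are made up.

```julia
# Hypothetical per-version defaults; the real contents are undecided.
const DEFAULTS = Dict(
    "1.12" => ["no multi assign", "explicit imports"],
)

# Resolve a strict list: expand "<version> defaults", drop "-name" entries,
# and add everything else, preserving order and avoiding duplicates.
function resolve_strict(entries)
    active = String[]
    for entry in entries
        if endswith(entry, " defaults")
            version = replace(entry, " defaults" => "")
            for option in DEFAULTS[version]
                option in active || push!(active, option)
            end
        elseif startswith(entry, "-")
            removed = entry[2:end]
            filter!(option -> option != removed, active)
        else
            entry in active || push!(active, entry)
        end
    end
    return active
end

resolved = resolve_strict(["1.12 defaults", "no local shadow", "-explicit imports"])
```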

StefanKarpinski avatar Jun 23 '24 21:06 StefanKarpinski

Some of the configurations should probably disallow accessing non-public names, including:

  1. Names of instance properties (ref propertynames). Wouldn't affect types defined in the same package.
  2. Names in a module (ref names). Wouldn't affect modules in the same package.

IMO this should be opt-out for all packages, but shouldn't affect the REPL.
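For module names, one conceivable building block is `names`, which lists a module's exported names (and, on recent Julia versions, names marked public). This is only a sketch of the lookup such a rule might rely on; the Demo module and the is_public_name helper are hypothetical.

```julia
module Demo
export visible
visible() = 1
hidden() = 2   # neither exported nor marked public
end

# Hypothetical helper: is `name` part of the module's public surface?
is_public_name(mod::Module, name::Symbol) = name in names(mod)

public_ok = is_public_name(Demo, :visible)
private_hit = is_public_name(Demo, :hidden)
```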

nsajko avatar Jun 23 '24 21:06 nsajko

What do you think about having these subsets be installable packages so users can contribute their own rules, rather than having an official set of rules?

jariji avatar Jun 23 '24 21:06 jariji

One question is whether this needs to be implemented in Julia itself, or whether this belongs more in a linter-like tool. At some level it strikes me that if this is a thing, we most definitely would want to implement support for this in things like the language server. And then the question is: what is gained by having two implementations?

davidanthoff avatar Jun 25 '24 12:06 davidanthoff

And another idea for potential use-cases: relative to more statically typed languages, it is really difficult to provide the kind of robust IDE experience from a language server that languages like TypeScript, Rust or C# have. But maybe there is a scenario where one could actually provide the same kind of robust IDE support if one was willing to avoid some of the more dynamic language features that Julia has. Obviously, that would be a terrible default, but I certainly have packages where I don't need many of the dynamic features of Julia and would much like to have an experience that is more statically typed. Not sure whether that is really feasible, but maybe worth exploring, and this strict type feature might be a good way to opt into a mode that gives one a statically typed IDE experience.

davidanthoff avatar Jun 25 '24 13:06 davidanthoff

@davidanthoff maybe I'm wrong, but I have a feeling you may have this backwards. My feeling is that the only way for the language server to become a clear win for users (currently it's quite annoying with the false positive warnings) is to plug into the Julia implementation quite directly, maybe similarly to Cthulhu.jl. So maybe the language server for Julia should be just a thin wrapper around Julia.

nsajko avatar Jun 25 '24 13:06 nsajko

@nsajko probably best to stick with Keno's suggestion to collect ideas here but not discuss or evaluate them in detail; that would presumably just distract from the topic of this issue. Having said that, if you have ideas and thoughts about the LS, please open an issue over at its repo and we can discuss there.

davidanthoff avatar Jun 25 '24 16:06 davidanthoff

I agree that per-project is the best approach.

I agree that Project.toml is the place to put this opt-in and configuration.

Concrete Project.toml syntax feedback

I think a toplevel strict is the simplest approach to avoid confusion over subtle inconsistencies with Preferences.jl's preference resolution.

name = "MyPackage"
strict = ["noreassignment", "nomisleadingwhitespace"]

Additionally, this strictness is a property tied to the package about as closely as its name and version: it's likely that a project declared without strict rules will fail to parse with them. The closest analog to this feature I know of is Rust editions. In Rust, that field is stored in the [package] table, which serves an analogous role to the toplevel table in our Project.toml files.

Reserving the strict key in each individual package's preferences is a bit breaking. It's also unclear what it means to set the strict preference of any package other than the one named by the Project.toml file. Syntax that enables this seems problematic:

name = "MyPackage"

[preferences.OtherPackage]
strict = ["nomultiassign", "nolocalshadow", "noglobalshadow"]

A toplevel [strict] section seems unnecessarily verbose compared to a toplevel strict key.

LilithHafner avatar Jun 28 '24 22:06 LilithHafner

provide an additional vehicle for low-friction language evolution

Yes! We need this for syntax evolution, which is often technically breaking but not actually very breaking at all in practice. There are so many examples of this, some being #36547 and #54915. In https://github.com/JuliaLang/julia/issues/36547#issuecomment-1449143117 I show that several bugs would be fixed by this syntax change. But the change itself is nevertheless technically breaking, and it's a really tough call to decide whether to do it.

Lilith has already mentioned Rust editions. A core part of Rust editions is that they don't bifurcate the ecosystem, because crates using different editions can work together. We should definitely do that to avoid a Python 2/3 style debacle.

For example, if a specific opt-in turns out to be popular across the majority of packages, a potential Julia 2.0 that made the opt-in automatic, while technically breaking, would be largely non-breaking in practice

For a lot of minor syntax improvements, I think they'd best be expressed as "use the latest syntax as of Julia version 1.x" rather than as opt-in flags. If we're trying to improve the syntax to change/remove ambiguous or confusing syntactic constructs, we want an incentive for the whole ecosystem to drop the old syntax. For example, if we make the change @JeffBezanson mentioned here https://github.com/JuliaLang/julia/issues/54915#issuecomment-2235198794 in a Julia edition, we do want packages to drop the old syntax as soon as possible.

"Use Julia 1.x syntax edition" is great from this point of view:

  • Users have an incentive to opt in because they get the latest syntax niceties (for example multidimensional array literals)
  • They pay the minor price of updating old syntax (can be automated!)
  • Users of the ecosystem benefit from most packages being on the latest syntax version

So I think "syntax evolution" should not be fine grained, where at all possible - it's a bit different from the other "strict mode" things which are mentioned above, where users might want fine-grained control over opting out of certain language constructs.

c42f avatar Jul 18 '24 22:07 c42f

xref #43654 #46411 #52014 #55304

nsajko avatar Jul 29 '24 18:07 nsajko

xref #50040

nsajko avatar Aug 01 '24 12:08 nsajko

https://github.com/JuliaLang/julia/issues/48434 seems like another candidate for Julia Editions - not so much "strict mode", but a change to lowering semantics. It's a reasonable candidate because we can detect most semantic changes introduced by changing the try scoping rules and provide help for people porting their code either as a diagnostic or automated refactoring pass. (The tricky part would be changes to semantics in top-level code - we can probably still deal with this using static analysis for most packages but there will always be hard cases.)

c42f avatar Aug 12 '24 00:08 c42f

The rustc compiler handles things this way; see https://rustc-dev-guide.rust-lang.org/implementing_new_features.html and https://rustc-dev-guide.rust-lang.org/feature-gates.html for the process. Interop with the AST is handled too (https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast_passes/feature_gate/fn.check_crate.html). There is a macro for this (with Rust macro_rules, e.g. a composable, matchable macro).

o314 avatar Aug 13 '24 12:08 o314

Regarding reassignment:

As I think some others also mentioned in https://github.com/JuliaLang/julia/issues/51223 , I think it makes sense to have x := 1 or local x = 1 for initial variable declaration + assignment / variable definition and x = 1 for any reassignment rather than the other way around (:= for reassignment).

sadish-d avatar Dec 19 '24 05:12 sadish-d

What about using a scoped approach like @stable/@unstable in DispatchDoctor.jl? Like:

Base.@strict :nomultiassign :nolocalshadow begin
	#= package code =#
end

It recurses through include and submodules too.

This way you could improve parts of your code at once, rather than needing to update the entire codebase in a single patch.

And perhaps sometimes you just need to disable it just for one single function, so you can turn it off with

Base.@nostrict begin
	my_hacky_function() = #= ... =#
end

which toggles the outermost @strict for the scope.

You can use Preferences.jl to set defaults for the @strict scopes so you can configure options from the top level. This is the same way DispatchDoctor.jl uses Preferences.jl to provide defaults for codegen level and how many Unions to consider types unstable, which has worked really well I think.

This is also the same way Clippy works in Rust. You can set overall rules for a single project from Cargo.toml:

[lints.clippy]
enum_glob_use = "deny"

but then ALSO set them locally:

#![warn(clippy::all, clippy::pedantic)]
{
    // Block of code
}

MilesCranmer avatar Jan 05 '25 12:01 MilesCranmer

xref #57311

nsajko avatar Feb 09 '25 19:02 nsajko

My two cents on what should probably go inside a strict mode:

I think one should avoid characterizing strict mode by putting together many small admonishments (e.g. no argument should have type Any). That's what a linter does, and it would make it unclear what strict mode promises.

Instead, I think it's better to structure strict mode around some "cohesive" promise (e.g. making code more "inferable" by requiring type annotations, not allowing dynamic dispatch except on explicitly labelled abstract types, etc). I hope working out what to add to strict mode in this way will give a more concrete set of rules to enforce and relatively few "strict mode flags" to turn on, instead of a long list of admonishments.

LEXUGE avatar Apr 02 '25 23:04 LEXUGE