julia icon indicating copy to clipboard operation
julia copied to clipboard

Enable JuliaSyntax.jl as an alternative Julia parser

Open c42f opened this issue 1 year ago • 3 comments

This enables the use of JuliaSyntax.jl as the Julia parser for the runtime, which greatly improves parser error messages in various cases. As an experimental feature, this is only enabled if the JULIA_USE_NEW_PARSER=true environment variable is set.

20220817_17h14m06s_grim

For now I've installed it after bootstrapping because this is the simplest (flisp parser still used for bootstrap). In the future we can solve bootstrapping but this isn't necessary right away.

API Versioning: Stdlib woes vs vendoring

The simplest way to get this working immediately was as a stdlib. However in retrospect this has several downsides:

  • As only version JuliaSyntax-0.1, the JuliaSyntax API is not at all stable. The only stable API for using this from Base should be Meta.parse() and Expr. Therefore, I think users should not really be able to do using JuliaSyntax and get the version bundled with Base.
  • For use in tooling like the language server, it's necessary for the JuliaSyntax API (custom syntax tree types etc) to be able to evolve with its own versioning scheme independent from whatever version of Base someone is using. Users should be able to install a "normal" version of JuliaSyntax via Pkg.

So I think having this as a stdlib doesn't really make sense and we are left with some kind of vendoring approach. I think what I'll try next is to include the parser at the end of building Base.

Having said all that, it should be easy to try this out already :-)

Fixes #31449

c42f avatar Aug 17 '22 07:08 c42f

Can we drop the implied Mmap usage and dependency? Doing mmap leads to segfaults and bus errors, where a read would have just gotten a permission error or file truncation error, and can lead to other problems like that, with cleanup and continued edits by the user on the original file.

vtjnash avatar Sep 15 '22 19:09 vtjnash

It is really cool to see how small this is! And exciting!

vtjnash avatar Sep 15 '22 19:09 vtjnash

Can we drop the implied Mmap usage and dependency

Yes :-) Already done on a WIP branch of mine, will be merged soon.

BTW, for anyone wondering what's happening here - all the development is happening over at JuliaSyntax.jl which is why this PR is just sitting around at the moment. I'll get back to this PR once I've sorted out a few remaining large usability problems (Particularly generating Expr(:incomplete) for REPL completions. Which isn't hard in itself, but has proven to be a bit of a rabbit hole leading to other things.)

c42f avatar Sep 16 '22 04:09 c42f

Now that version 0.2 of JuliaSyntax has been released, can this PR be updated to use it? We have 2 weeks left if we want to merge this in the 1.9 timeframe.

oscardssmith avatar Sep 23 '22 15:09 oscardssmith

only enabled if the JULIA_USE_NEW_PARSER=true environment variable is set.

Can it be merged, and with it on by default? Would it be good for PkgEval, or should it alternatively be run with:

$ JULIA_USE_NEW_PARSER=true julia

I'm not necessarily thinking it should be in Julia 1.9-rc1, it just think it might be good to have it from now or the feature freeze (up to then) to check, and not have to explain the ENV to users (and you will likely not get feedback from master users with a non-default option), though this should also be a possibility:

$ JULIA_USE_NEW_PARSER=false julia

PallHaraldsson avatar Sep 28 '22 11:09 PallHaraldsson

We have 2 weeks left if we want to merge this in the 1.9 timeframe.

Yep I've got julia-1.9 in mind for this. Will probably do a JuliaSyntax-0.3 release prior to that.

Can it be merged, and with it on by default

I don't think we're ready to turn this on by default. PkgEval would be good though...

c42f avatar Oct 02 '22 10:10 c42f

PkgEval would be good though...

PkgEval currently doesn't configuring environment flags, so to test this either add a commit that enables the new parser by default, or add a Make flag that results in the new parser being used by default (which can then be set using the configuration keyword to Nanosoldier's runtests, resulting in a Make.user entry).

maleadt avatar Oct 03 '22 08:10 maleadt

Tried running this under PkgEval with the new parser enabled by default (using the latest JuliaSyntax.jl, not the one from this PR), but Julia fails to bootstrap:

error during bootstrap:
LoadError("sysimg.jl", 19, LoadError("/source/usr/share/julia/stdlib/v1.9/JuliaSyntax/src/JuliaSyntax.jl", 1, LoadError("/source/usr/share/julia/stdlib/v1.9/JuliaSyntax/src/parser_api.jl", 137, LoadError("/source/usr/share/julia/stdlib/v1.9/JuliaSyntax/src/parser_api.jl", 137, UndefRefError()))))
getproperty at ./Base.jl:37 [inlined]
getindex at ./refvalue.jl:56 [inlined]
docm at ./docs/Docs.jl:521
unknown function (ip: 0x7f90895bcc8d)
_jl_invoke at /source/src/gf.c:2447 [inlined]
ijl_apply_generic at /source/src/gf.c:2629
jl_apply at /source/src/julia.h:1866 [inlined]
do_apply at /source/src/builtins.c:730
@doc at ./boot.jl:534

I'll probably add an environment option to PkgEval then.

maleadt avatar Oct 05 '22 05:10 maleadt

@nanosoldier runtests(ALL, vs = "%self", configuration = (buildflags=["DEPS_GIT=JuliaSyntax"], environment=["JULIA_USE_NEW_PARSER=true"]))

maleadt avatar Oct 05 '22 07:10 maleadt

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

nanosoldier avatar Oct 05 '22 12:10 nanosoldier

Packages that seem to occur a lot in failure logs: Macrotools, Unitful, and CpuId.

maleadt avatar Oct 05 '22 13:10 maleadt

Unitful, AbstractAlgebra, and OrdinaryDiffeq all have failures that I have filed issues about.

oscardssmith avatar Oct 05 '22 13:10 oscardssmith

In total, 8352 packages were tested, out of which 3596 succeeded, 3615 failed and 1141 were skipped.

Testing took 5 hours, 3 minutes, 31 seconds (or, sequentially, 20 days, 6 hours, 50 minutes, 52 seconds to execute 16704 package tests suites).

This seems rather bad, but comparing to some random other PkgEval, about half failing isn't unusual (for all kinds of reasons including syntax):

https://github.com/JuliaLang/julia/pull/46372#issuecomment-1268029710

I'm amazed by how long it takes (and not scalable, with package ecosystem increasing), but is the timing maybe (only slightly?) better here?

This one has timing: https://github.com/JuliaLang/julia/pull/45646#issuecomment-1264607582

PallHaraldsson avatar Oct 05 '22 13:10 PallHaraldsson

I'm amazed by how long it takes (and not scalable, with package ecosystem increasing), but is the timing maybe (only slightly?) better here?

The timings cannot be compared, as quite some packages now fail to precompile and thus never execute any test.

maleadt avatar Oct 05 '22 13:10 maleadt

It is honestly kind of silly to use PkgEval to test this, since we could just test that everything parses the same between the 2 versions (which would allow us to skip loading the code and running the tests).

oscardssmith avatar Oct 05 '22 13:10 oscardssmith

The CpuId failure is due to something like commas after new lines:

julia> [1
       , 2]
ERROR: ParseError:
Error: Expected `]`
@ REPL[33]:2:1
[1
, 2]

However, that parses fine if you use the current main branch of JuliaSyntax.jl

Krastanov avatar Oct 05 '22 15:10 Krastanov

yeah. this pr is still using 0.1 but should be using 0.2

oscardssmith avatar Oct 05 '22 16:10 oscardssmith

It is honestly kind of silly to use PkgEval to test this, since we could just test that everything parses the same between the 2 versions (which would allow us to skip loading the code and running the tests)

Using existing infrastructure to quickly get a result instead of having to write something tailor made that wouldn't have gotten any better results in the end seems to me like an efficient use of resources, not something silly.

KristofferC avatar Oct 05 '22 16:10 KristofferC

fair :). I'm coming from the perspective that we'll probably have to run this between 10 and 100 times, at which point the balance might shift.

oscardssmith avatar Oct 05 '22 16:10 oscardssmith

especially since the custom version could give more information per run since pkgeval doesn't give us errors for packages that have a failing dependency and does give us spurious errors for the thousand or so packages with bad CI

oscardssmith avatar Oct 05 '22 16:10 oscardssmith

It is honestly kind of silly to use PkgEval to test this, since we could just test that everything parses the same between the 2 versions

We actually have some scripts for this in JuliaSyntax/tools. I need to revisit those and do another run them against all of General. Would also be good to hook them up to the syntax test case reduction tools I've got over there as well. I think those scripts will be used for the bulk of the remaining compat work to come.

Regardless of those scripts, PkgEval will still be a useful tool because it tests the entire integration.

c42f avatar Oct 06 '22 03:10 c42f

Status

I've updated this PR now to vendor JuliaSyntax into Base via the deps mechanism. This seemed cleanest because it's got to be downloaded before Base is built, so it's natural to have it unpacked as a dependency of the julia-base makefile target.

This resolves the major problems I saw with exposing this as a stdlib and I think we could now merge this PR right away so that it can get into Julia 1.9.

Obviously being experimental and still having known bugs, it will remain disabled by default for now. A separate PR which removes the ENV check can be used to track progress toward full compatibility.

c42f avatar Oct 06 '22 03:10 c42f

@oscardssmith Some packages also do run-time parsing of code, so its useful to actually run PkgEval for those. But I'm with @KristofferC here, this doesn't seem worth the development time. If we were short on compute I'd add a mode to PkgEval to only Pkg.precompile (and not actually run tests), but the machine is mostly idle (a single PkgEval run without rr only takes 5 hours), so even that doesn't seem worth it.

@c42f I'll let you trigger Nanosoldier when you know there's been fixes to the JuliaSyntax.jl repo 🙂 As noted above, that doesn't need to the new parser to be the default one anymore.

maleadt avatar Oct 06 '22 05:10 maleadt

As noted above, that doesn't need to the new parser to be the default one anymore.

Perfect, thanks :-)

I'll let you trigger Nanosoldier when you know there's been fixes to the JuliaSyntax.jl repo

Well the new parser is disabled by default so I don't think we need to block this PR on bugfixes or performance when the non-default JuliaSyntax config is enabled. But perhaps I'm missing something?

For this PR, I'd like someone to review the build system and vendoring I've set up here, then I think it's harmless enough to merge this. I'm not sure who to ping for this though! Our makefiles don't seem to have a definitive owner :sweat_smile:

c42f avatar Oct 08 '22 01:10 c42f

Oh btw, the test failures here are confusing but I guess unrelated. There's a profile stdlib test failure (i686-linux-gnu), some clipboard-related (windows) and filesystem-related failures (powerpc, aarch64).

c42f avatar Oct 08 '22 01:10 c42f

Well the new parser is disabled by default so I don't think we need to block this PR on bugfixes or performance when the non-default JuliaSyntax config is enabled. But perhaps I'm missing something?

Oh for sure, I just meant as a means to improve JuliaSyntax.jl

maleadt avatar Oct 08 '22 07:10 maleadt

I'd like to know what the system image size impact of this is. If it's significant, we might need an option to exclude the code from the build.

JeffBezanson avatar Oct 12 '22 02:10 JeffBezanson

@nanosoldier runtests(ALL, vs = "%self", configuration = (buildflags=["DEPS_GIT=JuliaSyntax"], environment=["JULIA_USE_NEW_PARSER=true"]))

c42f avatar Oct 17 '22 10:10 c42f

Your job failed. cc @maleadt

nanosoldier avatar Oct 17 '22 11:10 nanosoldier

make install failed:

Oct 17 07:44:13 amdci8 bash[69705]:       From worker 2:        cp: cannot create regular file '/install/share/julia/base/JuliaSyntax/.git/objects/pack/pack-3676d1cb1e890803c88a8566efb0d7ae60442968.pack': Permission denied
Oct 17 07:44:13 amdci8 bash[69705]:       From worker 2:        cp: cannot create regular file '/install/share/julia/base/JuliaSyntax/.git/objects/pack/pack-3676d1cb1e890803c88a8566efb0d7ae60442968.idx': Permission denied
Oct 17 07:44:13 amdci8 bash[69705]:       From worker 2:        make: *** [Makefile:258: install] Error 1

maleadt avatar Oct 17 '22 12:10 maleadt