Import tree
- Contributes to #2750
New commands:
- `dev import-tree scan FILE`. Scans a single file and lists all the imports in it.
- `dev import-tree print`. Scans all files in the package and its dependencies, builds an import dependency tree, and prints it to stdout. If the `--stats` flag is given, it also reports the number of scanned modules, the number of unique imports, and the length of the longest import chain.
Example: this is the truncated output of `juvix dev import-tree print --stats` in the `juvix-stdlib` directory.
[...]
Stdlib/Trait/Partial.juvix imports Stdlib/Data/String/Base.juvix
Stdlib/Trait/Partial.juvix imports Stdlib/Debug/Fail.juvix
Stdlib/Trait/Show.juvix imports Stdlib/Data/String/Base.juvix
index.juvix imports Stdlib/Cairo/Poseidon.juvix
index.juvix imports Stdlib/Data/Int/Ord.juvix
index.juvix imports Stdlib/Data/Nat/Ord.juvix
index.juvix imports Stdlib/Data/String/Ord.juvix
index.juvix imports Stdlib/Prelude.juvix
Import Tree Statistics:
=======================
• Total number of modules: 56
• Total number of edges: 193
• Height (longest chain of imports): 15
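As a rough illustration of what the three statistics mean, here is a minimal sketch that recomputes them from `A imports B` lines like the ones above. It is not the code from this PR: the names `Edge`, `parseEdge`, `modules`, `numEdges` and `height` are hypothetical, the height is assumed to be counted in modules, and the import graph is assumed to be acyclic. Modules without any import edge do not appear in the printed lines, so this sketch cannot count them.

```haskell
import qualified Data.Map as Map -- lazy map, needed for the memoised depths below
import qualified Data.Set as Set

-- | An edge (a, b) means "module a imports module b".
type Edge = (FilePath, FilePath)

-- | Parse one output line of the form "A.juvix imports B.juvix".
parseEdge :: String -> Maybe Edge
parseEdge line = case words line of
  [a, "imports", b] -> Just (a, b)
  _ -> Nothing

-- | Every module that appears on either side of an edge.
modules :: [Edge] -> Set.Set FilePath
modules es = Set.fromList (concat [[a, b] | (a, b) <- es])

-- | Number of distinct import edges.
numEdges :: [Edge] -> Int
numEdges = Set.size . Set.fromList

-- | Length of the longest import chain, counted in modules (an assumption).
-- Only terminates if the import graph has no cycles.
height :: [Edge] -> Int
height es = maximum (0 : Map.elems depths)
  where
    adj = Map.fromListWith (++) [(a, [b]) | (a, b) <- es]
    -- Each entry is a lazy thunk, so every depth is computed at most once.
    depths = Map.fromSet depth (modules es)
    depth m = 1 + maximum (0 : [depths Map.! d | d <- Map.findWithDefault [] m adj])
```

The lazy `Data.Map` matters here: with `Data.Map.Strict` the self-referential `depths` table would be forced while it is still being built.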
Both commands support the `--scan-strategy` flag, which determines which parser is used to scan the imports. The possible values are:
- `flatparse`. Uses the low-level FlatParse parsing library. This parser is written specifically to parse imports and ignores everything else, so we expect it to perform much better. It does not produce error messages.
- `megaparsec`. Uses the normal Juvix parser and simply collects the imports from its result.
- `flatparse-megaparsec` (default). Uses the flatparse backend and falls back to megaparsec if it fails.
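The default strategy can be pictured with a small combinator. This is only a sketch under assumptions: `FastScanner`, `FullParser` and `scanWithFallback` are hypothetical names, and the two scanners are modelled as plain functions rather than the actual FlatParse and Megaparsec parsers.

```haskell
-- | Hypothetical stand-in for the fast scanner: it either succeeds or
-- fails without any error message.
type FastScanner a = String -> Maybe a

-- | Hypothetical stand-in for the full parser: it reports an error on failure.
type FullParser a = String -> Either String a

-- | Try the fast scanner first. The full parser only runs if the fast
-- scanner fails, so its error message is the one the user gets to see.
scanWithFallback :: FastScanner a -> FullParser a -> String -> Either String a
scanWithFallback fast full src = maybe (full src) Right (fast src)
```

With this shape the common case pays only for the fast scanner, while malformed input still gets a proper error message from the full parser.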
Internal changes
Megaparsec Parser (`Concrete.FromSource`)
In order to be able to run the parser during the scanning phase, I've adjusted some of the effects used in the parser:
- I've removed the `NameIdGen` and `Files` constraints, which were unused.
- I've removed `Reader EntryPoint`. It was used to get the `ModuleId`. Now the `ModuleId` is generated during scoping.
- I've replaced `PathResolver` by the `TopModuleNameChecker` effect. This new effect, as the name suggests, only checks the name of the module (same rules as we had in the `PathResolver` before). It is also possible to ignore the effect, which is needed if we want to use this parser without an entrypoint.
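To make the "check or ignore" idea concrete, here is a minimal sketch that models the effect as a record of functions with two interpreters. Everything in it is an assumption for illustration: the real code uses an effect system, the names `checkTopModuleName`, `checkingChecker` and `ignoringChecker` are hypothetical, and the rule shown (the dotted module name must match the end of the file path) is a simplification that ignores `.juvix.md` files.

```haskell
import Data.List (intercalate, isSuffixOf)

-- | Hypothetical stand-in for the effect: given the file path and the
-- components of the top module name, either accept or report an error.
newtype TopModuleNameChecker = TopModuleNameChecker
  { checkTopModuleName :: FilePath -> [String] -> Either String ()
  }

-- | Checking interpreter (assumed rule): module Stdlib.Data.Nat must live
-- in a file whose path ends with Stdlib/Data/Nat.juvix.
checkingChecker :: TopModuleNameChecker
checkingChecker = TopModuleNameChecker $ \file nameParts ->
  let expected = intercalate "/" nameParts ++ ".juvix"
   in -- The leading "/" ensures we match whole path components only.
      if ("/" ++ expected) `isSuffixOf` ("/" ++ file)
        then Right ()
        else Left ("module name does not match file path, expected " ++ expected)

-- | Ignoring interpreter: accepts everything, for running the parser
-- without an entrypoint.
ignoringChecker :: TopModuleNameChecker
ignoringChecker = TopModuleNameChecker (\_ _ -> Right ())
```

The parser would only depend on `checkTopModuleName`, so the caller decides whether names are verified by choosing the interpreter.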
`PathResolver` effect refactor
- The `WithPath` command has been removed.
- New command `ResolvePath :: ImportScan -> PathResolver m (PackageInfo, FileExt)`. Useful for resolving imports during the scanning phase.
- New command `WithResolverRoot :: Path Abs Dir -> m a -> PathResolver m a`. Useful for switching package context.
- New command `GetPackageInfos :: PathResolver m (HashMap (Path Abs Dir) PackageInfo)`, which returns a table with all packages. Useful for scanning all dependencies.
The `Package.PathResolver` has been refactored to be more like the normal `PathResolver`. We've discussed with @paulcadman the possibility of unifying both implementations in the near future.
Misc
- `Package.juvix` no longer ends up in `PackageInfo.packageRelativeFiles`.
- I've introduced string definitions for `--`, `{-` and `-}`.
- I've fixed a bug where `.juvix.md` was detected as an invalid extension.
- I've added `LazyHashMap` to the prelude. I've also added `ordSet` to create ordered Sets, `ordMap` for ordered maps, etc.
Benchmarks
I've profiled `juvix dev import-tree --scan-strategy [megaparsec | flatparse] --stats` with optimization enabled.
In the images below we see that in the megaparsec case, the scanning takes 54.8% of the total time, whereas in the flatparse case it only takes 9.6% of the total time.
- Megaparsec: [profiling image]
- Flatparse: [profiling image]
Hyperfine
hyperfine --warmup 1 'juvix dev import-tree print --scan-strategy flatparse --stats' 'juvix dev import-tree print --scan-strategy megaparsec --stats' --min-runs 20
Benchmark 1: juvix dev import-tree print --scan-strategy flatparse --stats
Time (mean ± σ): 82.0 ms ± 4.5 ms [User: 64.8 ms, System: 17.3 ms]
Range (min … max): 77.0 ms … 102.4 ms 37 runs
Benchmark 2: juvix dev import-tree print --scan-strategy megaparsec --stats
Time (mean ± σ): 174.1 ms ± 2.7 ms [User: 157.5 ms, System: 16.8 ms]
Range (min … max): 169.7 ms … 181.5 ms 20 runs
Summary
juvix dev import-tree print --scan-strategy flatparse --stats ran
2.12 ± 0.12 times faster than juvix dev import-tree print --scan-strategy megaparsec --stats
In order to compare (almost) only the parsing, I've forced the scanning of each file to be performed 50 times, so that the cost of the other parts becomes negligible. Here are the results:
hyperfine --warmup 1 'juvix dev import-tree print --scan-strategy flatparse --stats' 'juvix dev import-tree print --scan-strategy megaparsec --stats' --min-runs 10
Benchmark 1: juvix dev import-tree print --scan-strategy flatparse --stats
Time (mean ± σ): 189.5 ms ± 3.6 ms [User: 161.7 ms, System: 27.6 ms]
Range (min … max): 185.1 ms … 197.1 ms 15 runs
Benchmark 2: juvix dev import-tree print --scan-strategy megaparsec --stats
Time (mean ± σ): 5.113 s ± 0.023 s [User: 5.084 s, System: 0.035 s]
Range (min … max): 5.085 s … 5.148 s 10 runs
Summary
juvix dev import-tree print --scan-strategy flatparse --stats ran
26.99 ± 0.52 times faster than juvix dev import-tree print --scan-strategy megaparsec --stats