tools icon indicating copy to clipboard operation
tools copied to clipboard

feat(rome_fs): Add support for symbolic links (symlinks)

Open realtimetodie opened this issue 1 year ago • 7 comments

Adds support for symbolic links.

Features

  • Protects users against symbolic link cycles or symbolic link infinite expansions.
  • Rome will not stop when a symbolic link cycle or symbolic link infinite expansion is encountered, but emit a diagnostic error and continue with the next file entry during directory traversal.
  • Emits helpful diagnostic errors when a symbolic link cycle or a symbolic link infinite expansion is encountered during directory traversal.
  • Emits helpful diagnostic errors when a dereferenced (broken) symbolic link is encountered during directory traversal.

Implementation

The implementation is minimal, optimized for speed and stores the least amount of data possible. It replaces the AtomicInterner with an updated version of the IndexedInterner and stores the path in the cache, to protect Rome users when a symbolic link cycle or a symbolic link infinite expansion is encountered during directory traversal. Verbose diagnostic errors are emitted, to better understand the underlying issue. The implementation also ensures, that only the translated paths and not the paths to symbolic links are cached.

Due to the minimal implementation, the error diagnostic errors are not perfect, meaning that they do not report the exact location of the symbolic link, but the location of the duplicated path in the IndexedInterner storage cache. For perfect diagnostic errors, we would need to store much more information, preferably the entire path resolution tree. Storing the entire path resolution tree would increase the memory usage and decrease speed.

In general, symbolic link cycles or symbolic link infinite expansions should be seen as rare edge-cases that are caused by errors in the user configuration. The primary aim of this feature is to add support for symbolic links.

Motivation

I encountered the missing support for symbolic links while working on the Bazel build tool rules for Rome

https://github.com/diceride/rules_rome

In order to use Rome with Bazel, support for symbolic links is required. Symbolic links are a core foundation of how the Bazel build system works. Bazel works with symbolic links and executes actions in a sandboxed environment and Bazel has built-in protection against symbolic link cycles and symbolic link infinite expansions. A Bazel execution will stop, when a symbolic link cycle or symbolic link infinite expansion is encountered.

Diagnostic errors

  • DereferencedSymlink error

    This diagnostic error will be emitted when a dereferenced (broken) symbolic link is encountered during directory traversal.

    Example - Error message when traversing a test directory /tmp/testdir that contains a dereferenced (broken) symbolic link (broken_symlink)

$ rome format /tmp/testdir
/tmp/testdir/broken_symlink internalError/fs ━━━━━━━━━━

⚠ Dereferenced symlink
  
  ℹ Rome encountered a file system entry that is a broken symbolic link: broken_symlink
  • InfiniteSymlinkExpansion error

    This diagnostic error will be emitted when a symbolic link cycle or a symbolic link infinite expansion is encountered during directory traversal.

    Example - Error message when traversing a test directory /tmp/testdir that contains a symbolic link to itself (self_symlink)

$ rome format /tmp/testdir
/tmp/testdir/self_symlink internalError/fs ━━━━━━━━━━

  ⚠ Infinite symlink expansion
  
  ✖ Rome encountered a file system entry that leads to an infinite symbolic link expansion, causing an infinite cycle: /tmp/testdir/self_symlink
  • ELOOP error (Unix-only)

    fs::read_link will emit the ELOOP diagnostic error on Unix, when too many symbolic links are encountered while translating the pathname ("Too many levels of symbolic links").

References

Related

  • https://github.com/rome/tools/issues/2383

@leops

realtimetodie avatar Nov 13 '22 18:11 realtimetodie

Deploy Preview for docs-rometools canceled.

Built without sensitive environment variables

Name Link
Latest commit e94b8cbbfb2d68bd6573af76e41cc57b240a3aa2
Latest deploy log https://app.netlify.com/sites/docs-rometools/deploys/63755320b41e4500092c7b23

netlify[bot] avatar Nov 13 '22 18:11 netlify[bot]

Oh wow. This is awesome. Could you run the benchmarks in the benchmark directory once with and without your changes and post them here to understand the impact of the change

MichaReiser avatar Nov 13 '22 19:11 MichaReiser

Oh wow, Bazel rules for Rome, really cool

jeysal avatar Nov 13 '22 20:11 jeysal

I'll leave the review to someone like @leops familiar with this code, but I'm surprised how simple this seems — I never got personally much involved with the symlinks topic at Jest but know it took years to land even partial support, and then there's legendary issues like https://github.com/facebook/metro/issues/1 :smile:

jeysal avatar Nov 13 '22 20:11 jeysal

Benchmarks results before and after, taken on an AMD 8 core x86 CPU.

Before (prettier)

Summary
'Rome' ran
    2.20 ± 0.05 times faster than 'dprint'
    4.02 ± 0.09 times faster than 'Rome (1 thread)'
   26.89 ± 0.80 times faster than 'Parallel-Prettier'
   33.44 ± 0.97 times faster than 'Prettier'

After (prettier)

Summary
  'Rome' ran
    2.18 ± 0.06 times faster than 'dprint'
    3.99 ± 0.10 times faster than 'Rome (1 thread)'
   26.73 ± 0.87 times faster than 'Parallel-Prettier'
   33.31 ± 1.03 times faster than 'Prettier'

Before (eslint)

Summary
  'Rome' ran
    3.83 ± 0.11 times faster than 'Rome (1 thread)'
   21.27 ± 0.68 times faster than 'ESLint'

After (eslint)

Summary
  'Rome' ran
    3.81 ± 0.12 times faster than 'Rome (1 thread)'
   21.15 ± 0.70 times faster than 'ESLint'

As described above, the only difference is that the IndexedInterner is not atomic anymore and stores the hashed path using std::hash in an IndexSet cache now. The IndexSet itself is accessed using a std::sync::RwLock.

realtimetodie avatar Nov 13 '22 21:11 realtimetodie

I almost forgot to mention: Some tests are now running outside of the emulated in-memory file system, as we need access a real file system in order to test the symbolic link translation using the std::fs::read_link method.

realtimetodie avatar Nov 13 '22 21:11 realtimetodie

  • Updated the comments.
  • Added tests for Windows.
  • The FileSystemDiagnostic error enum now takes simple String instead of a PathBuf, to follow the general coding style.

realtimetodie avatar Nov 15 '22 21:11 realtimetodie

I resolved the merge conflicts.

realtimetodie avatar Nov 16 '22 21:11 realtimetodie