feat: add workspace diagnostic mode support (Issue #397)
Summary
Implements workspace diagnostic mode for Pyrefly, allowing users to see type errors from all files in the workspace, not just open files (similar to Pyright's diagnosticMode setting).
Closes #397
Changes
Backend
- Wired up the existing `DiagnosticMode` enum (was marked as dead code)
- Modified `get_diag_if_shown()` to respect the diagnostic mode setting
- Updated the publish closure to use `get_all_errors()` in workspace mode
- Uses incremental transaction errors (no full rechecks)
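In short, the per-error filter boils down to the following (a simplified sketch, not the exact server code: the real `get_diag_if_shown()` also consults the config's `project_includes`/`project_excludes` globs and `type_error_display_status`, which are collapsed into `is_in_project` here):

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

// Simplified sketch of the per-error filter; Workspace and OpenFilesOnly are
// the two modes wired up in this PR.
enum DiagnosticMode {
    OpenFilesOnly,
    Workspace,
}

// `is_in_project` stands in for "project_includes covers the path and
// project_excludes does not".
fn should_show(
    mode: &DiagnosticMode,
    path: &Path,
    open_files: &HashSet<PathBuf>,
    is_in_project: bool,
) -> bool {
    let is_open = open_files.contains(path);
    match mode {
        // Workspace mode: show errors for any project file, open or not.
        DiagnosticMode::Workspace => is_in_project,
        // Default mode: only files the editor currently has open.
        DiagnosticMode::OpenFilesOnly => is_open && is_in_project,
    }
}

fn main() {
    let open_files: HashSet<PathBuf> = HashSet::new();
    // With no files open, workspace mode still shows project errors,
    // while the default mode shows nothing.
    assert!(should_show(&DiagnosticMode::Workspace, Path::new("a.py"), &open_files, true));
    assert!(!should_show(&DiagnosticMode::OpenFilesOnly, Path::new("a.py"), &open_files, true));
}
```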
Frontend & Tests
- Added the `python.analysis.diagnosticMode` setting to the VS Code extension
- Updated `extension.ts` to monitor `python.analysis` configuration changes
- Added 3 LSP interaction tests:
  - `test_workspace_mode_uses_get_all_errors`
  - `test_open_files_only_mode_filters_correctly`
  - `test_default_mode_is_open_files_only`
- Added test files for workspace diagnostic mode scenarios
Configuration
Users can now choose between:
- `openFilesOnly` (default): Show type errors only in open files
- `workspace`: Show type errors in all files within the workspace
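For illustration, the two setting strings map naturally onto the enum's two variants; a minimal sketch of such a mapping with serde (a guess at the shape only, Pyrefly's actual derive attributes may differ):

```rust
use serde::Deserialize;

// Hedged guess at how the setting strings could map onto the existing
// DiagnosticMode enum; Pyrefly's actual attributes may differ.
#[derive(Debug, Default, Deserialize, PartialEq)]
#[serde(rename_all = "camelCase")]
enum DiagnosticMode {
    // "openFilesOnly": the default when the setting is absent.
    #[default]
    OpenFilesOnly,
    // "workspace": analyze every file the project config covers.
    Workspace,
}

fn main() {
    let mode: DiagnosticMode = serde_json::from_str("\"workspace\"").unwrap();
    assert_eq!(mode, DiagnosticMode::Workspace);
    assert_eq!(DiagnosticMode::default(), DiagnosticMode::OpenFilesOnly);
}
```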
(I have shared a demo video in the dev channel on Discord.)
@kinto0 has imported this pull request. If you are a Meta employee, you can view this in D85782844.
I've implemented the pull model (where unopened files are analyzed when textDocument/diagnostic is explicitly requested for them, which is what the tests do).
However, I noticed during manual testing that VS Code doesn't automatically show errors from unopened files. Looking at the LSP server output logs, I can see that unopened files ARE being analyzed (logs show "Prepare to check 2 files" and "Populated all files in the project path"), but their diagnostics aren't being published to VS Code's Problems panel because VS Code doesn't request diagnostics for them via `textDocument/diagnostic`; it only requests diagnostics for open files.
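For reference, the push model means the server sends `textDocument/publishDiagnostics` notifications on its own instead of waiting for `textDocument/diagnostic` pull requests. A minimal sketch of the payload involved (assuming an `lsp_types` version with `Url`-based URIs; Pyrefly's own `connection.publish_diagnostics` wrapper is not shown):

```rust
use lsp_types::{Diagnostic, PublishDiagnosticsParams, Url};

// Hedged sketch: build the payload for a push-model notification covering a
// file the client never opened. The pull model never reaches such files,
// because VS Code only issues textDocument/diagnostic for open editors.
fn push_payload_for_closed_file(
    uri: Url,
    diagnostics: Vec<Diagnostic>,
) -> PublishDiagnosticsParams {
    PublishDiagnosticsParams {
        uri,
        diagnostics,
        // No document version: the client never sent didOpen for this file.
        version: None,
    }
}
```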
I attempted to implement the push model (adding workspace files to handles in validate_in_memory_for_possibly_committable_transaction so their diagnostics get published via publishDiagnostics), but the LSP server hangs/crashes during initialization even with just 2 files in the workspace.
I've tried several approaches:
- Adding workspace files to handles and calling `transaction.run()` - crashes
- Adding workspace files without calling `run()` (relying on `populate_all_project_files_in_config`) - still crashes
- Using `get_all_errors()` - works, but it shows dependency errors (e.g., `collections.Counter.__init__`)
> I've implemented the pull model (where unopened files are analyzed when `textDocument/diagnostic` is explicitly requested for them, which is what the tests do).
I think it's necessary to keep our push model for a few reasons:
- Pyrefly is the source of truth for what has "changed": we only want to recheck when the actual contents change, and a naive file watcher would do extra work.
- Pyrefly will need to re-report errors in dependent files when a dependency updates, which is not something `textDocument/diagnostic` supports.
> However, I noticed during manual testing that VS Code doesn't automatically show errors from unopened files. Looking at the LSP server output logs, I can see that unopened files ARE being analyzed (logs show "Prepare to check 2 files" and "Populated all files in the project path"), but their diagnostics aren't being published to VS Code's Problems panel because VS Code doesn't request diagnostics for them via `textDocument/diagnostic`; it only requests diagnostics for open files.
This is the case where we need to `publishDiagnostics`, which your code in `validate_in_memory_for_possibly_committable_transaction` should handle.
> I attempted to implement the push model (adding workspace files to handles in `validate_in_memory_for_possibly_committable_transaction` so their diagnostics get published via `publishDiagnostics`), but the LSP server hangs/crashes during initialization even with just 2 files in the workspace. I've tried several approaches:
>
> - Adding workspace files to handles and calling `transaction.run()` - crashes
> - Adding workspace files without calling `run()` (relying on `populate_all_project_files_in_config`) - still crashes
> - Using `get_all_errors()` - works, but it shows dependency errors (e.g., `collections.Counter.__init__`)
let's discuss over discord
thanks for all the hard work! this is a fun one
All of the tests seem to be based on `document_diagnostic`, but that doesn't actually test the behavior, since the language server operates on `publishDiagnostics` in the IDE. You should remove the `document_diagnostic` changes and make the tests check for `publishDiagnostics`.
The test now looks good, but you still have the logic change for `document_diagnostics` (the function and the code in `as_request::<DocumentDiagnosticRequest>`).
Removed the mode filtering logic from the DocumentDiagnosticRequest handler.
Thanks for the review and suggestions! Would you like me to handle the follow-up items (documentation + renaming to "project" mode) in separate PRs?
sure, if you want, that'd be great!
On a related note, I'm noticing a few more things as I take a closer look and test, and I'm wondering if we can simplify the code a lot:
- we have a few redundant checks for whether workspaces include this config or not and which handles to run the transaction on
- every time we call `validate_in_memory_for_transaction` we likely need to check these additional handles from the config
I'm wondering if we can simplify it by moving all the logic about which handles to run into this function:
```rust
/// Run the transaction with the in-memory content of open files. Returns the handles of open files when the transaction is done.
fn validate_in_memory_for_transaction(
    &self,
    state: &State,
    open_files: &RwLock<HashMap<PathBuf, Arc<String>>>,
    transaction: &mut Transaction<'_>,
) -> Vec<Handle> {
    let handles = open_files
        .read()
        .keys()
        .flat_map(|x| {
            self.workspaces.get_with(x, |(_, w)| {
                let handle = make_open_handle(state, x);
                if w.lsp_analysis_config.is_some_and(|c| {
                    matches!(c.diagnostic_mode, Some(DiagnosticMode::Workspace))
                }) {
                    let config = self
                        .state
                        .config_finder()
                        .python_file(handle.module(), handle.path());
                    // get all files in config and add them here
                    vec![handle]
                } else {
                    vec![handle]
                }
            })
        })
        .collect::<Vec<_>>();
    transaction.set_memory(
        open_files
            .read()
            .iter()
            .map(|x| (x.0.clone(), Some(x.1.dupe())))
            .collect::<Vec<_>>(),
    );
    transaction.run(&handles, Require::Everything);
    handles
}
```
Then `validate_in_memory_for_possibly_committable_transaction` can stay very similar to how it originally was.
But I'm dealing with lifetime issues. I think it's worth another look, as it might help with clarity and hard-to-discover bugs. I will have more time next week to look into it.
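If the borrow trouble is the usual guard-versus-`self` conflict (just a guess at the cause, not a verified diagnosis), one way to sidestep it is to copy the open-file paths out of the lock before building handles; a toy sketch with stand-in types, not Pyrefly's:

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::{Arc, RwLock};

// Toy reproduction of the shape of the code: `String` stands in for Handle,
// and `handles_for` for make_open_handle plus the workspace/config lookup.
// std's RwLock needs .unwrap(); the server's lock does not.
struct Server {
    open_files: RwLock<HashMap<PathBuf, Arc<String>>>,
}

impl Server {
    fn handles_for(&self, path: &PathBuf) -> Vec<String> {
        vec![path.display().to_string()]
    }

    fn collect_handles(&self) -> Vec<String> {
        // Copy the keys into an owned Vec so the read guard is released here,
        // before any `self`-borrowing work happens inside the closure below.
        let paths: Vec<PathBuf> =
            self.open_files.read().unwrap().keys().cloned().collect();
        paths.iter().flat_map(|p| self.handles_for(p)).collect()
    }
}

fn main() {
    let server = Server {
        open_files: RwLock::new(HashMap::new()),
    };
    assert!(server.collect_handles().is_empty());
}
```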
Thanks! Hmm, I see what you mean about centralizing the logic. Happy to help with the refactor if needed, let me know!
I'm sorry about this falling through this week. There are a few things that have to happen:
- merge conflicts
- centralize logic in one place
I attempted this but ran into the lifetime issue described above. I also made a few changes while I was messing around (shown at bottom). I'm happy to set up time next week if you want to work together synchronously, or I can take another look / work with you next week on discord.
```diff
From c3f705ac3bdcc7aa9aedb41910479a7ab693662a Mon Sep 17 00:00:00 2001
From: Kyle Into <[email protected]>
Date: Fri, 7 Nov 2025 10:06:29 -0800
Subject: [PATCH] suggestions
Differential Revision: D86536062
fbshipit-source-id: c3f705ac3bdcc7aa9aedb41910479a7ab693662a
---
diff --git a/pyrefly/lib/lsp/non_wasm/server.rs b/pyrefly/lib/lsp/non_wasm/server.rs
--- a/pyrefly/lib/lsp/non_wasm/server.rs
+++ b/pyrefly/lib/lsp/non_wasm/server.rs
@@ -924,7 +924,6 @@
self.validate_in_memory_and_commit_if_possible(ide_transaction_manager);
let transaction =
ide_transaction_manager.non_committable_transaction(&self.state);
-
self.send_response(new_response(
x.id,
Ok(self.document_diagnostics(&transaction, params)),
@@ -1081,6 +1080,7 @@
/// Run the transaction with the in-memory content of open files. Returns the handles of open files when the transaction is done.
fn validate_in_memory_for_transaction(
+ &self,
state: &State,
open_files: &RwLock<HashMap<PathBuf, Arc<String>>>,
transaction: &mut Transaction<'_>,
@@ -1088,7 +1088,23 @@
let handles = open_files
.read()
.keys()
- .map(|x| make_open_handle(state, x))
+ .flat_map(|x| {
+ self.workspaces.get_with(x.to_path_buf(), |(_, w)| {
+ let handle = make_open_handle(state, x);
+ if (w.lsp_analysis_config.is_some_and(|c| {
+ matches!(c.diagnostic_mode, Some(DiagnosticMode::Workspace))
+ })) {
+ let config = self
+ .state
+ .config_finder()
+ .python_file(handle.module(), handle.path());
+ // get all files in config and add them here
+ vec![handle]
+ } else {
+ vec![handle]
+ }
+ })
+ })
.collect::<Vec<_>>();
transaction.set_memory(
open_files
@@ -1124,27 +1140,20 @@
.config_finder()
.python_file(ModuleName::unknown(), e.path());
- // Get diagnostic mode for this file's workspace
let diagnostic_mode = self.workspaces.get_diagnostic_mode(&path);
-
- // File must be in project (not excluded) to show diagnostics
let is_in_project =
config.project_includes.covers(&path) && !config.project_excludes.covers(&path);
- // Then check based on diagnostic mode
let is_open = open_files.contains_key(&path);
let should_show = match diagnostic_mode {
- // Workspace mode: show if in project (open or closed files)
DiagnosticMode::Workspace => is_in_project,
- // OpenFilesOnly mode: show if open AND in project
+ // OpenilesOnly mode: show if open AND in project
DiagnosticMode::OpenFilesOnly => is_open && is_in_project,
- };
+ } && self
+ .type_error_display_status(e.path().as_path())
+ .is_enabled();
- if should_show
- && self
- .type_error_display_status(e.path().as_path())
- .is_enabled()
- {
+ if should_show {
return Some((path.to_path_buf(), e.to_diagnostic()));
}
}
@@ -1230,86 +1239,21 @@
Err(transaction) => transaction,
};
let handles =
- Self::validate_in_memory_for_transaction(&self.state, &self.open_files, transaction);
-
- // Check if any workspace is in workspace diagnostic mode
- let has_workspace_mode = self.workspaces.roots().iter().any(|root| {
- matches!(
- self.workspaces.get_diagnostic_mode(root),
- DiagnosticMode::Workspace
- )
- });
-
- // In workspace mode, analyze all project files so get_all_errors() includes unopened files
- if has_workspace_mode {
- let open_file_paths: std::collections::HashSet<_> =
- self.open_files.read().keys().cloned().collect();
- if let Some(first_open_file) = open_file_paths.iter().next() {
- let module_path = ModulePath::filesystem(first_open_file.clone());
- let config = self
- .state
- .config_finder()
- .python_file(ModuleName::unknown(), &module_path);
- let project_path_blobs = config.get_filtered_globs(None);
- if let Ok(paths) = project_path_blobs.files() {
- let project_handles: Vec<_> = paths
- .into_iter()
- .filter_map(|path| {
- // Skip files that are already open (already in handles)
- if open_file_paths.contains(&path) {
- return None;
- }
- let module_path = ModulePath::filesystem(path.clone());
- let path_config = self
- .state
- .config_finder()
- .python_file(ModuleName::unknown(), &module_path);
- if config == path_config {
- Some(handle_from_module_path(&self.state, module_path))
- } else {
- None
- }
- })
- .collect();
- // Analyze only for errors, not full indexing
- transaction.run(&project_handles, Require::Errors);
- }
- }
- }
+ self.validate_in_memory_for_transaction(&self.state, &self.open_files, transaction);
let publish = |transaction: &Transaction| {
let mut diags: SmallMap<PathBuf, Vec<Diagnostic>> = SmallMap::new();
let open_files = self.open_files.read();
-
- // Pre-populate with empty arrays for all open files to ensure we send
- // publishDiagnostics notifications even when errors are cleared
for x in open_files.keys() {
diags.insert(x.as_path().to_owned(), Vec::new());
}
-
- // In workspace mode, use get_all_errors() to get errors from all project files.
- // In open-files-only mode, use get_errors(&handles) to only get errors from open files.
- // The filtering by diagnostic mode and project includes/excludes is handled in get_diag_if_shown.
- let errors = if has_workspace_mode {
- transaction.get_all_errors()
- } else {
- transaction.get_errors(&handles)
- };
-
- for e in errors.collect_errors().shown {
+ for e in transaction.get_errors(&handles).collect_errors().shown {
if let Some((path, diag)) = self.get_diag_if_shown(&e, &open_files) {
diags.entry(path.to_owned()).or_default().push(diag);
}
}
-
for (path, diagnostics) in diags.iter_mut() {
- // Use appropriate handle type: memory handle for open files, filesystem for others
- let is_open = open_files.contains_key(path);
- let handle = if is_open {
- make_open_handle(&self.state, path)
- } else {
- handle_from_module_path(&self.state, ModulePath::filesystem(path.clone()))
- };
+ let handle = make_open_handle(&self.state, path);
Self::append_unreachable_diagnostics(transaction, &handle, diagnostics);
}
self.connection.publish_diagnostics(diags);
@@ -1354,7 +1298,6 @@
self.open_files.dupe(),
);
}
-
fn invalidate_find_for_configs(&self, invalidated_configs: SmallSet<ArcId<ConfigFile>>) {
self.invalidate(|t| t.invalidate_find_for_configs(invalidated_configs));
}
@@ -1443,7 +1386,7 @@
let mut transaction = state.new_committable_transaction(Require::indexing(), None);
f(transaction.as_mut());
- Self::validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
+ self.validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
// Commit will be blocked until there are no ongoing reads.
// If we have some long running read jobs that can be cancelled, we should cancel them
@@ -1689,7 +1632,7 @@
let mut transaction = state.new_committable_transaction(Require::indexing(), None);
transaction.as_mut().set_memory(vec![(uri, None)]);
let _ =
- Self::validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
+ self.validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
state.commit_transaction(transaction);
queue_source_db_rebuild_and_recheck(
state.dupe(),
@@ -1991,7 +1934,7 @@
cancellation_handles
.lock()
.insert(request_id.clone(), transaction.get_cancellation_handle());
- Self::validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
+ self.validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
match transaction.find_global_references_from_definition(
handle.sys_info(),
metadata,
@@ -2498,7 +2441,7 @@
let mut transaction = state.new_committable_transaction(Require::indexing(), None);
transaction.as_mut().invalidate_config();
- Self::validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
+ self.validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
// Commit will be blocked until there are no ongoing reads.
// If we have some long running read jobs that can be cancelled, we should cancel them
--
1.7.9.5
```
Hmm, I see. Sure, I'll resolve the merge conflict, no worries. I'll deep dive into it and try to centralize the code this weekend, and yeah, we can work next week. I'll let you know on Discord if I get stuck on something!
This pull request has been automatically marked as stale because it has not had recent activity for more than 2 weeks.
If you are still working on this pull request, please add a comment or push new commits to keep it active. Otherwise, please unassign yourself and allow someone else to take over.
Thank you for your contributions!