
Types in Search Path Configuration Don't See Each Other

Open Mazyod opened this issue 1 month ago • 6 comments

Describe the Bug

Hello,

I have an issue with Pyrefly where it doesn't detect the correct type (it shows Unknown) in external modules if they import types from other modules in the search_path.

I built a repro, shared in full below, but let me first summarize the results with the screenshots below.

Pyrefly

Unable to detect the types of c_from_b and b_from_c:

Image

Pyright

Able to detect the types just fine:

Image

Full repro:

"""
Minimal hover-only repro for cross-package circular types between external
packages pkgb and pkgc added via search_path/extraPaths.

Layout (both pkgb and pkgc live outside the workspace root and are added via
search paths):
- pkgb/b.py: defines PkgB and holds a PkgC instance
- pkgc/c.py: defines PkgC and holds a PkgB instance

The active document instantiates PkgB and PkgC, binds a few variables, and logs
hover contents to check whether types flow across the boundary.

Run with:
    uv run python examples/pyrefly_circular_imports.py           # Pyrefly (default)
    uv run python examples/pyrefly_circular_imports.py pyright   # Pyright
"""

from __future__ import annotations

import asyncio
from pathlib import Path
from tempfile import TemporaryDirectory
from textwrap import dedent
import sys

import lsp_types
from rich.console import Console
from rich.markdown import Markdown
from lsp_types.pyrefly.backend import PyreflyBackend
from lsp_types.pyright.backend import PyrightBackend

steps = []
console = Console()

# Simple structured logging helpers
def log_step(title: str) -> None:
    steps.append(title)
    console.print(f"\n=== {title} ===")


def log_result(label: str, value) -> None:
    if label.endswith(".md"):
        console.print(f"{label}:\n")
        console.print(Markdown(value))
    else:
        console.print(f"{label}: {value}")

PKGB_B = dedent(
    """\
    from __future__ import annotations

    from pkgc.c import PkgC

    class PkgB:
        def __init__(self) -> None:
            self.c: PkgC = PkgC(self)

    """
)

PKGC_C = dedent(
    """\
    from __future__ import annotations

    from typing import TYPE_CHECKING

    if TYPE_CHECKING:
        from pkgb.b import PkgB


    class PkgC:
        def __init__(self, b: PkgB) -> None:
            self.b: PkgB = b

    """
)

ACTIVE_CODE = dedent(
    """\
    from pkgb.b import PkgB
    from pkgc.c import PkgC

    b = PkgB()
    c = PkgC(b)

    c_from_b = b.c
    b_from_c = c.b
    """
)


def prepare_workspace(pkgb_dir: Path, pkgc_dir: Path) -> None:
    """Create pkgb and pkgc packages in external paths used via search_path/extraPaths."""
    pkgb = pkgb_dir / "pkgb"
    pkgc = pkgc_dir / "pkgc"

    pkgb.mkdir(parents=True, exist_ok=True)
    pkgc.mkdir(parents=True, exist_ok=True)

    pkgb.joinpath("__init__.py").write_text("")
    pkgc.joinpath("__init__.py").write_text("")
    pkgb.joinpath("b.py").write_text(PKGB_B)
    pkgc.joinpath("c.py").write_text(PKGC_C)


async def main() -> None:
    backend_name = sys.argv[1] if len(sys.argv) > 1 else "pyrefly"
    if backend_name == "pyright":
        backend = PyrightBackend()
        options_key = "extraPaths"
    else:
        backend = PyreflyBackend()
        options_key = "search_path"

    with TemporaryDirectory(prefix="pyrefly-circular-root-") as tmp_root, TemporaryDirectory(
        prefix="pyrefly-circular-pkgb-"
    ) as tmp_pkgb, TemporaryDirectory(prefix="pyrefly-circular-pkgc-") as tmp_pkgc:
        root = Path(tmp_root)
        external_pkgb = Path(tmp_pkgb)
        external_pkgc = Path(tmp_pkgc)
        prepare_workspace(external_pkgb, external_pkgc)

        session = await lsp_types.Session.create(
            backend,
            base_path=root,
            initial_code=ACTIVE_CODE,
            options={options_key: [str(external_pkgb), str(external_pkgc)]},
        )

        try:
            log_step("Diagnostics for active document")
            diagnostics = await session.get_diagnostics()
            log_result("Diagnostics count", len(diagnostics))
            if diagnostics:
                log_result("Diagnostics", diagnostics)

            hover_targets = {
                "b (PkgB)": lsp_types.Position(line=3, character=0),
                "c (PkgC)": lsp_types.Position(line=4, character=0),
                "c_from_b (PkgC)": lsp_types.Position(line=6, character=0),
                "b_from_c (PkgB)": lsp_types.Position(line=7, character=0),
            }

            for label, position in hover_targets.items():
                log_step(f"Hover: {label}")
                hover = await session.get_hover_info(position)
                if hover:
                    match hover["contents"]:
                        case {"kind": "markdown", "value": value}:
                            log_result("Hover contents.md", value)
                        case contents:
                            log_result("Hover contents", contents)
        finally:
            await session.shutdown()


if __name__ == "__main__":
    asyncio.run(main())

Mazyod avatar Nov 26 '25 08:11 Mazyod

Need to refine my issue report.

Mazyod avatar Nov 26 '25 09:11 Mazyod

Refined the issue to be minimal and as clear as possible.

Mazyod avatar Nov 26 '25 09:11 Mazyod

Thanks for the bug report! The minimization is very helpful.

For whoever picks this up: there's another open issue on package / subpackage behavior: https://github.com/facebook/pyrefly/issues/1663

I don't know whether the two bugs are related per se, but they likely at least involve the same import resolution logic. And I've confirmed that this bug does not repro without packages - see this sandbox demo with just a.py and b.py, which type-checks just fine.

stroxler avatar Nov 26 '25 15:11 stroxler

Okay, I think I see what the problem here is - the setup in the script creates three temp directories, one each for the root and the two packages. The root points at the packages.

I think the problem here is that Pyrefly generally analyzes each file using the configuration associated with that file. It is designed this way to handle editor sessions that span multiple projects (and in particular to handle massive monorepos in a single session).

When the packages are in unrelated directories, this means the root can find both b and c (because they are on its search path), but b and c (which are type checked with a default configuration) cannot see one another.

In most projects, this works out okay because code is typically part of the project, and so is "covered" (per our heuristics) by a pyrefly.toml at the project root, so that all the dependencies are analyzed with the same config as the actual project. But the layout your script creates doesn't have this property.
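The visibility rule described above can be sketched as a toy model. This is not Pyrefly's actual resolution logic: `search_roots_for`, the paths, and the "nearest covering directory" lookup are all hypothetical, and an uncovered file is simplified to having no extra search roots at all.

```python
from pathlib import PurePosixPath

# Hypothetical locations standing in for the three temp directories in the repro.
ROOT = PurePosixPath("/tmp/root")
PKGB_ROOT = PurePosixPath("/tmp/pkgb")
PKGC_ROOT = PurePosixPath("/tmp/pkgc")


def search_roots_for(
    file: PurePosixPath,
    configs: dict[PurePosixPath, list[PurePosixPath]],
) -> list[PurePosixPath]:
    """Return the extra import roots visible to `file` under a per-file-config model."""
    # Walk upwards and use the first directory that has a configuration.
    for directory in file.parents:
        if directory in configs:
            return configs[directory]
    # Simplification: a file covered by no config gets no extra search roots.
    return []


# Only the workspace root carries a configuration, as in the repro.
configs = {ROOT: [PKGB_ROOT, PKGC_ROOT]}

active = ROOT / "main.py"
b_file = PKGB_ROOT / "pkgb" / "b.py"

print(PKGC_ROOT in search_roots_for(active, configs))  # True: the root sees pkgc
print(PKGC_ROOT in search_roots_for(b_file, configs))  # False: b.py does not
```

Under this model the active document resolves both packages, while `pkgb/b.py` has no route to `pkgc`, which matches the Unknown hover results in the report.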

Is your actual project set up this way? It might be worth understanding what the layout is and what your options are for configuring Pyrefly correctly

stroxler avatar Nov 27 '25 16:11 stroxler

Thanks for the reply.

My project is indeed set up this way. I have a "codegen" system that writes generated code in some temp location, and the main library code uses types from the generated code. Finally, my actual editor utilizes the library, and the generated types are "Unknown".

I ended up with a workaround where I copy the library code next to the codegen output, so it's all in one search path, and that works. However, I wanted to share this case and the discrepancy between Pyright and Pyrefly behavior for the record, if nothing else, or in case there is indeed a case for supporting this.

Mazyod avatar Nov 28 '25 13:11 Mazyod

Another option, given our existing limitations, might be to generate a pyrefly.toml that points back to the original project in its search path. That might be easier than copying the code: the effect is similar, but some IDE features, for example, will likely work better if there's no copying involved.
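A minimal sketch of that suggestion, assuming the codegen step can run a bit of Python after writing its output. The helper name `write_pyrefly_config` is hypothetical, and the `search-path` key is an assumption about the pyrefly.toml schema (the repro above passes the option as `search_path`), so check it against the Pyrefly configuration docs before relying on it.

```python
import json
from pathlib import Path


def write_pyrefly_config(package_root: Path, extra_roots: list[Path]) -> Path:
    """Write a pyrefly.toml in `package_root` whose search path lists `extra_roots`.

    Hypothetical helper; the `search-path` key name is an assumption.
    """
    # json.dumps yields double-quoted, escaped strings, which are also
    # valid TOML basic strings (this matters for backslashes on Windows).
    entries = ", ".join(json.dumps(str(root)) for root in extra_roots)
    config_path = package_root / "pyrefly.toml"
    config_path.write_text(f"search-path = [{entries}]\n")
    return config_path
```

For the repro layout, this would be called once per external directory, for example `write_pyrefly_config(external_pkgb, [external_pkgc])` and `write_pyrefly_config(external_pkgc, [external_pkgb])`, so that each package is analyzed with a config that can see the other.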

cc @connernilsen it might be worth thinking some more about this

stroxler avatar Dec 01 '25 17:12 stroxler