Issue with GitHub Copilot on large Java files (~5000 lines)

Open yuriipelekh opened this issue 8 months ago • 8 comments

Does this issue occur when all extensions are disabled?: Yes

VS Code Version: 1.99.1 (Universal)

OS Version: macOS 15.4 (24E248)

Steps to Reproduce:

Open a large Java file (~5000 lines of legacy code).

Attempt to use GitHub Copilot to refactor the file with a comprehensive prompt aiming to reduce code duplication, follow the Single Responsibility Principle, and group related functionalities.

Expected Behavior:

GitHub Copilot completes the refactoring or provides a meaningful suggestion.

Actual Behavior:

GitHub Copilot initiates processing but repeatedly restarts without successfully completing the refactoring action.

Additional Information:

I suspect this issue might be due to Copilot’s context window limitations or output token constraints, especially given the file size. Could you please clarify the limitation and recommend approaches for effectively using GitHub Copilot on large legacy files?
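A rough way to sanity-check that suspicion before sending a file is to estimate its token count. The sketch below assumes about 4 characters per token, which is only a rule of thumb (real tokenizers differ by model), and the two budget parameters are hypothetical placeholders, not documented Copilot limits. It checks both directions because a full-file rewrite has to fit in the input and be emitted again as output.

```python
CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate (ceiling of characters / chars_per_token)."""
    return -(-len(text) // chars_per_token)


def fits_budget(text: str, input_budget: int, output_budget: int) -> bool:
    """Check whether a whole-file rewrite plausibly fits both budgets.

    input_budget and output_budget are hypothetical limits supplied by the
    caller; the file must be read once and written once, so both apply.
    """
    tokens = estimate_tokens(text)
    return tokens <= input_budget and tokens <= output_budget
```

If `fits_budget` returns `False` for the output side only, the file can probably be read but not rewritten in one pass, which would be consistent with a refactoring that starts and never finishes.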

Thank you!

yuriipelekh avatar Apr 10 '25 16:04 yuriipelekh

Thanks for creating this issue! It looks like you may be using an old version of VS Code, the latest stable release is 1.99.2. Please try upgrading to the latest version and checking whether this issue remains.

Happy Coding!

I updated VS Code (and the GitHub Copilot extension) to the latest version, but it didn't help.

yuriipelekh avatar Apr 10 '25 16:04 yuriipelekh

/gifPlease

connor4312 avatar Apr 10 '25 17:04 connor4312

Thanks for reporting this issue! Unfortunately, it's hard for us to understand what issue you're seeing. Please help us out by providing a screen recording showing exactly what isn't working as expected. While we can work with most standard formats, .gif files are preferred as they are displayed inline on GitHub. You may find https://gifcap.dev helpful as a browser-based gif recording tool.

If the issue depends on keyboard input, you can help us by enabling screencast mode for the recording (Developer: Toggle Screencast Mode in the command palette). Lastly, please attach this file via the GitHub web interface as emailed responses will strip files out from the issue.

Happy coding!

Does it have anything to do with the context length?🤔

iwangbowen avatar Apr 11 '25 01:04 iwangbowen

@iwangbowen What are the general recommendations for context length (for example, 500 lines of code)? What are the best practices for working with large files? In other words, do I, as a software engineer, still need to split this large file into smaller classes (files) myself and then apply refactoring with GH Copilot to those smaller files?

yuriipelekh avatar Apr 11 '25 06:04 yuriipelekh

Dividing code into multiple files is a recommended practice for better organization and maintainability. It is particularly beneficial when working with GitHub Copilot: when only a minor change is needed in one component, Copilot has far less code to stream than it would for a single large file.
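As a stopgap while a file is still too large, the same idea can be approximated by feeding it in bounded pieces. This is a minimal sketch, not a Java-aware splitter: `chunk_lines` is a hypothetical helper that cuts at blank lines as a heuristic for member boundaries, so a long method with no blank lines can still be split in the middle.

```python
def chunk_lines(lines: list[str], max_lines: int) -> list[list[str]]:
    """Split lines into chunks of at most max_lines, preferring to cut after a blank line."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    chunks = []
    start = 0
    while start < len(lines):
        end = min(start + max_lines, len(lines))
        if end < len(lines):
            # Walk back to the nearest blank line so a chunk is less likely
            # to end in the middle of a method (heuristic only).
            cut = end
            while cut > start + 1 and lines[cut - 1].strip():
                cut -= 1
            if cut > start + 1:
                end = cut
        chunks.append(lines[start:end])
        start = end
    return chunks
```

Concatenating the chunks in order reproduces the original lines, so nothing is lost by splitting. Real class extraction still needs a parser or the IDE's own refactoring support.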

iwangbowen avatar Apr 12 '25 12:04 iwangbowen

Hi team,

Some time ago, I reported an issue where GitHub Copilot would get stuck during large-file refactoring. I want to follow up with an update.

I no longer face the issue of Copilot getting "stuck" per se - it technically works now - but the quality of the refactoring results is extremely poor. The generated code doesn't compile, includes logical errors, or makes naive changes that break the original logic. This makes Copilot practically unusable for serious refactoring of large or legacy code files.

Let me give you a bit more context. I work with many outdated projects burdened with legacy code and significant technical debt. We’re actively refactoring and re-engineering these codebases, which include:

  • Large Java files (often >6500 LOC).
  • Legacy React (JS) codebases being migrated to modern React + TypeScript.
  • Old .NET Framework systems transitioning to modern platforms like .NET Core or even Java.

Based on this experience, I believe Copilot (and similar AI assistants) need better integration with IDE internals and refactoring tools. For example:

  • Let Copilot invoke IDE-native refactoring tools (e.g., symbol renaming or class extraction that rely on AST).
  • Create a copy of the original file, allow Copilot to break the large file into smaller, context-aware chunks, and only remove the original file once the refactoring is complete and verified.
  • Integrate with MCP servers (like mcp-ts-morph or mcp-jetbrains) for AST-based, deterministic edits.
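To make the copy-then-verify idea concrete, here is a minimal sketch of that workflow. Everything in it is hypothetical: `refactor_with_backup`, the `.orig` suffix, and the two callbacks are illustrative names, not an existing Copilot or VS Code API. The `verify` callback stands in for whatever check is trusted, for example compiling the file or running its tests.

```python
from pathlib import Path
from typing import Callable
import shutil


def refactor_with_backup(
    source: Path,
    refactor: Callable[[Path], None],
    verify: Callable[[Path], bool],
) -> bool:
    """Apply refactor to source; restore the original if it raises or verify fails."""
    backup = source.with_name(source.name + ".orig")  # hypothetical naming scheme
    shutil.copy2(source, backup)
    try:
        refactor(source)
        ok = verify(source)
    except Exception:
        ok = False
    if ok:
        backup.unlink()  # drop the copy only after verification succeeds
    else:
        backup.replace(source)  # roll back to the untouched original
    return ok
```

The point of the design is that the original is never deleted on the strength of the generated code alone; it disappears only after an independent check passes.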

I’d love to know:

  • Are there any ongoing efforts in the Copilot/VS Code ecosystem that tackle these large-scale refactoring issues more intelligently?
  • Can you point me to resources, samples, or companies that have successfully used Copilot to handle large-scale or legacy codebase transformations?
  • Are there plans to allow Copilot to delegate tasks to IDE-native capabilities (e.g., expose rename, extract method, etc. via MCP)?

I’ve done some of my own research, but I’d really appreciate the team’s insights on how to best approach these challenges - or if there are collaborative initiatives I could contribute to.

Thanks again for your great work and looking forward to your thoughts!

yuriipelekh avatar May 22 '25 14:05 yuriipelekh