ruff icon indicating copy to clipboard operation
ruff copied to clipboard

Add support for caret repositioning with the formatter

Open KotlinIsland opened this issue 11 months ago • 20 comments

Probably would apply to the --fixer as well.

[
    
]
> ruff format test.py --caret-position 2:5
new caret position: 1:2
1 file reformatted

This is desired for IDE integration:

  • PyCharm: https://github.com/koxudaxi/ruff-pycharm-plugin/issues/388

KotlinIsland avatar Mar 01 '24 06:03 KotlinIsland

Thank you for reporting this issue. I agree, the caret moving is very annoying. We should fix this. However, I first want to understand the JetBrains plugin better because, for example, VS Code handles this automatically and I would expect the same to be true for JetBrains.

What I understand is that the formatting happens in RuffFormatProcessor. It inherits from RuffPostFormatProcessor that itself inherits from PostFormatProcessor which, unfortunately, is undocumented.

I'm not sure if that's the right infrastructure for formatting, considering that JetBrains' external formatter example and their official rust plugin when using rustfmt use AsyncDocumentFormattingService.

The official Rust plugin does use FormatProcessor implementations for their custom formatter that is based on JetBrains formatter framework. But the ruff plugin doesn't reimplement the formatting logic based on their formatting infrastructure, what it does is call into ruff directly. That's why I think it isn't the right infrastructure.

@koxudaxi sorry for pinging you directly. Could you share some background on the plugin and help us understand if using AsyncDocumentFormattingService for formatting would be an option?

Unrelated to the issue: This might also be interesting. I believe it is how the Rust plugin integrates the external linter (cargo/clippy) https://github.com/intellij-rust/intellij-rust/blob/c6657c02bb62075bf7b7ceb84d000f93dda34dc1/src/main/kotlin/org/rust/ide/inspections/RsExternalLinterInspection.kt#L114

MichaReiser avatar Mar 01 '24 07:03 MichaReiser

@MichaReiser

Thank you for thinking about the problem.

What I understand is that the formatting happens in RuffFormatProcessor. It inherits from RuffPostFormatProcessor that itself inherits from PostFormatProcessor which, unfortunately, is undocumented.

I'm not sure if that's the right infrastructure for formatting, considering that JetBrains' external formatter example and their official rust plugin when using rustfmt use AsyncDocumentFormattingService.

I selected the PostFormatProcessor because, intellij-blackconnect uses the PostFormatProcessor

This was a small start way of trying to create a ruff in the same way as the black formatter we were familiar with. Also, this black plugin was also contributed by the PyCharm team leader at the time, so I thought it had been reviewed to some extent.

I will reimplement it AsyncDocumentFormattingService asap, as it appears to be appropriate for using external commands as you suggest. Thank you for your investigation. I just don't know if this change will solve the caret position problem. I will report here as soon as possible.

koxudaxi avatar Mar 01 '24 08:03 koxudaxi

@koxudaxi oh that's interesting. The blackconnect also seems to have the CodeReformatter. Do you know what it's used for?

Black doesn't support carret positioning either. I wonder if their plugin has the same issue.

I will reimplement it AsyncDocumentFormattingService asap, as it appears to be appropriate for using external commands as you suggest. Thank you for your investigation.

Thank you. I hope I'm not wasting your time. I have never written a JetBrains plugin, what i shared is only based on existing code that I found.

MichaReiser avatar Mar 01 '24 09:03 MichaReiser

@MichaReiser

The blackconnect also seems to have the CodeReformatter. Do you know what it's used for?

This is a service to operate the black. When called from each entry point in IDEA, this service will format the black through it.

Thank you. I hope I'm not wasting your time. I have never written a JetBrains plugin, what i shared is only based on existing code that I found.

I am developing several OSS projects in my personal time, but obviously, I don't have enough time. I'll figure out a time to generate some time ASAP. 😅 Thank you for your help.

koxudaxi avatar Mar 01 '24 09:03 koxudaxi

@koxudaxi I can also try to PR if that helps and you're open to it. Or maybe @snowsignal can help. She's focusing on our editor integration.

MichaReiser avatar Mar 01 '24 09:03 MichaReiser

It's probably worth mentioning that I'm not against adding a --carret or --cursor option if needed. I know Prettier supports it as well. The main reason why I hope that editors solve this problem for us is that I think they can do it more accurately than ruff, at least with the infrastructure it has today.

What ruff could support is to track the cursor position and then map it to the start or end of the next node. For example:

self.my_field_name_that_is_long.more
               ^

where ^ marks the cursor position. The best Ruff could do is to remap the position to the end of my_field_name_that_is_long, which is better than what is happening in the summary of this issue, but I think it is still not ideal. A better solution would probably have to do some syntax aware-diffing to find the updated cursor position. And that's what I hope editors are doing for us

MichaReiser avatar Mar 01 '24 11:03 MichaReiser

@MichaReiser @KotlinIsland I have created the PR to check AsyncDocumentFormattingService. I think the service doesn't chase the caret :( https://github.com/koxudaxi/ruff-pycharm-plugin/pull/391 Before

[
    <CARET>
]
print()

After

[]<CARET>
print()

I will check again in a few hours, as I don't have time right now.

koxudaxi avatar Mar 01 '24 12:03 koxudaxi

@koxudaxi I played a bit with the plugin and noticed that it works as expected when:

  • I use the replaceAndCommit from blackconnect`
    
      private fun PsiFile.replaceAndCommit(range: TextRange, s: String) {
          val documentManager = PsiDocumentManager.getInstance(project)
          val document = documentManager.getDocument(this) ?: return
          documentManager.doPostponedOperationsAndUnblockDocument(document)
          try {
              document.replaceString(range.startOffset, range.endOffset, s)
          } finally {
              documentManager.commitDocument(document)
          }
      }
    
  • When I replace the entire source
            if (text == formatted) return rangeToReformat
          val sourceDiffRange = diffRange(text, formatted) ?: return rangeToReformat
    
          source.replaceAndCommit(TextRange.create(0, source.textLength), formatted);
    
          return rangeToReformat;
    

This makes me wonder if there's something off with the range calculation that drips Kotlin

MichaReiser avatar Mar 01 '24 12:03 MichaReiser

@MichaReiser I was thinking the same things when I took a shower Last time, I changed the .replaceString to another one. I will recheck it.

https://github.com/koxudaxi/ruff-pycharm-plugin/commit/95baf5d352404d5743b4f46d1552b7311d36c45b#diff-3946523917f9e1e16732de91e7e38c62106c7e1f0daa025f82652e6d72ed48c8R33-R34

koxudaxi avatar Mar 01 '24 13:03 koxudaxi

@koxudaxi I see you're on top of it, and I am sorry for misleading you in the wrong direction. Feel free to let me know if there's anything I can help you with.

MichaReiser avatar Mar 01 '24 14:03 MichaReiser

@MichaReiser I haven't solved the problem, but I'm close to the cause and will share it with you. As you pointed out, my change direction was wrong, so I'm reverting the change back to using document.replaceString(). However, this will not solve the original caret position misalignment problem that I was trying to fix with this PR. https://github.com/koxudaxi/ruff-pycharm-plugin/issues/312

Upon further investigation. If RuffFixProcessor and RuffFormatProcessor perform ruff fix and ruff format, respectively, their results are different then the caret position is off. We are currently doing this with a different class, but a single class may solve the problem.

I will try again tomorrow.

koxudaxi avatar Mar 01 '24 18:03 koxudaxi

@MichaReiser @KotlinIsland

I continued our investigation of this matter over the weekend. As it turns out, it is difficult to easily move the caret location to the proper place with the IDEA functionality.

This is because the latest PyCharm has Black integration and we checked the behavior. It behaved similarly to what @KotlinIsland pointed out in this issue, so we think that PyCharm's standard functionality does not take into account such an exact movement of the caret.

https://blog.jetbrains.com/pycharm/2023/07/2023-2-eap-5/

https://github.com/JetBrains/intellij-community/blob/4733d0c010b0a8e1a53379b9862d7e25425d5d04/python/src/com/jetbrains/python/black/BlackFormattingService.kt#L107-L161

However, since we are using the AsyncDocumentFormattingService, we should proceed to replace the postformatter class with this one in ruff plugin.

It is technically possible to move the caret on IDEA, but it would be quite difficult to locate it properly.

For example, here is an example given in this issue. Displaying diffs when ruff format is performed

https://github.com/koxudaxi/ruff-pycharm-plugin/issues/312

         last_pos = link.find('>; rel="last')
-        page_pos = link.rfind(page_str := 'page=', 0, last_pos)
-        return int(link[page_pos + len(page_str): last_pos]), res
+        page_pos = link.rfind(page_str := "page=", 0, last_pos)
+        return int(link[page_pos + len(page_str) : last_pos]), res
<CARET>   return 1, res


 async def get_pages(
-        coro,
-        *,
-        n_concur,
-        per_page: int = 100,
-        page_num: int = 0,
-        **load,
+    coro,
+    *,
+    n_concur,
+    per_page: int = 100,
+    page_num: int = 0,
+    **load,
 ):

If the caret is at the beginning of a retrun line, it will be shifted forward according to the offset of the caret because of the additional space on the line before it. To prevent this, it seems necessary to check the contents of the diff and parse the python to determine the position before and after the move.

If you move the caret to a perfect position it seems easier to return the caret position with a ruff as you have presented. Of course it depends on the balance between demand and cost, but what do you think?

koxudaxi avatar Mar 04 '24 10:03 koxudaxi

Thanks for the excellent write up and your examples. I'm able to reproduce the same issue with Black when using your async example and positioning the cursor at the start of a parameter.

async def get_pages(
        coro,
        *,
        n_concur,
        per_page: int = 100,
        page_num: int = 0,
        **load,
 ):
  pass

I first assumed that this might be a Python specific problem but it is not... You can reproduce the same problem with RustRover when using rust

fn test(
        a: &str,
        b: &str, 
        c: &str // Position the cursor at the start of c
) {
    
}

/ / Formatted
fn test(a: &str, b: &str, c: &str) {}
// Cursor is positioned after the function after saving

I tried the same examples with VS Code and VS Code tracks the cursor position perfectly.

We can add support for tracking cursor positions in ruff. Ideally, it would use the source maps generated during range formatting but that's not possible because the parser generates block ranges that are slightly off (https://github.com/astral-sh/ruff/pull/9751). That's why we need to track the cursor (possibly multiple cursor positions) separately.

I don't aim for a perfect mapping. E.g. I think it's okay if the cursor moves inside a binary expression, this to keep the extra complexity lower. What I have in mind is that we store an extra cursor_positions: Vec<TextSize> in the FormatContext. We iterate over the positions in FormatNodeRule and emit a new CursorPosition IR element at the start or end of the node (we need to figure out the exact rule). I don't think the CursorPosition IR must store any data (storing the offset from the node's start could lead to the same problems as outlined above). The Printer pushes the current offset into the new cursor_positions vector and forwards it to the Formatted output.

Now, what's unclear to me is how we make the cursor position accessible to the CLI caller. We can't just print the offsets to stdout, or it messes up the formatted output.

MichaReiser avatar Mar 04 '24 15:03 MichaReiser

@MichaReiser

I tried the same examples with VS Code and VS Code tracks the cursor position perfectly.

Yes, I tried same one :sweat_smile: VS Code is great :smile_cat:

Thank you for giving us a realistic idea. 

Now, what's unclear to me is how we make the cursor position accessible to the CLI caller. We can't just print the offsets to stdout, or it messes up the formatted output.

I imaged ruff will return a json structure as stdout when we add the ---json option. Or, If stdout is not available, I guess I could show the position in stderr, but would that be too tricky?

koxudaxi avatar Mar 04 '24 16:03 koxudaxi

I imaged ruff will return a json structure as stdout when we add the ---json option. The format command doesn't support an --output-format today and I don't think I want to add it "just" for formatting.

Or, If stdout is not available, I guess I could show the position in stderr, but would that be too tricky?

That's what prettier does. It writes one line with the cursor's position to stderr for every file it formats. But that doesn't scale well once you have multiple cursors but I don't want to do anything fancy, because it would than be better to use JSON.

@snowsignal is working on a Rust-based LSP rewrite. The advantage is that we can call into Rust APIs without needing to expose information like cursor positions over the LSP. That's why I'm inclined to push this work out until we have a better understanding on how to integrate this best into editor plugins (could the PyCharm plugin call a custom LSP method)?

MichaReiser avatar Mar 04 '24 17:03 MichaReiser

I took a quick look at Prettier. What they do is:

  • It gets the position of the closest node that fully encloses the cursor offset (we support that by using SourceMap)
  • it formats the entire code
  • It gets the position of said node in the formatted output
  • It slices the old and new text of said node
  • It inserts a special CURSOR symbol into the source text (we can pick any sequence that isn't valid Python) into the original text. It then diffs the original (with cursor symbol inserted) and the formatted code and search for the diff entry that removes the CURSOR symbol. That's the offset where the cursor should be positioned now.

https://github.com/prettier/prettier/blob/1ea2fd059a2ef8762f062e3540b9d48596c43b58/src/main/core.js#L101-L125

MichaReiser avatar Mar 04 '24 17:03 MichaReiser

@MichaReiser

@snowsignal is working on a Rust-based LSP rewrite. The advantage is that we can call into Rust APIs without needing to expose information like cursor positions over the LSP.

Very interesting. Which one of you is working on this one?

That's why I'm inclined to push this work out until we have a better understanding on how to integrate this best into editor plugins (could the PyCharm plugin call a custom LSP method)?

I agree with you.

There are some issues and challenges with the pycharm plugin, so I'll take some time to sort them out a bit.

koxudaxi avatar Mar 16 '24 04:03 koxudaxi

Very interesting. Which one of you is working on this one?

@snowsignal is working on the LSP rewrite

There are some issues and challenges with the pycharm plugin, so I'll take some time to sort them out a bit.

Interesting. Do you have a summary of the issues you're running into, or would you recommend reading through the open issues in the plugin's GitHub repository?

MichaReiser avatar Mar 18 '24 07:03 MichaReiser

It is very likely that some issues have already been resolved and some bugs are listed in the same issue, so I will organize the issues. I will then share the situation with you. (I hope to have it done this week.)

koxudaxi avatar Mar 18 '24 08:03 koxudaxi

@MichaReiser Sorry for the late reply. I haven't been able to figure out all the ticket details yet, but I have a few bugs and tickets sorted out and will share them with you.

My understanding is that there are currently no critical bugs that affect all users. (at least I haven't been able to reproduce it in my environment)

I will not write the details in this issue as it is not a direct topic.

I have tagged the issue here, so please check it when you have time. https://github.com/koxudaxi/ruff-pycharm-plugin/issues

There are PRs about making the external formatter and Ruff executable directory settings variable, and if I have time, I would like to proceed with this one. https://github.com/koxudaxi/ruff-pycharm-plugin/pulls I would also like to address other issues tagged with enhancement and bug as soon as I have time.

koxudaxi avatar Apr 02 '24 17:04 koxudaxi