Nuklear icon indicating copy to clipboard operation
Nuklear copied to clipboard

[CRITICAL] Fix clipboard text operations and UTF-8 cursor handling

Open PavelSharp opened this issue 2 months ago • 16 comments

Edit: Improve structure This pull request effectively repairs the entire clipboard handling system in Nuklear. To achieve this, it addresses three separate bugs:

Note: see #840 for more details and a minimal reproducible example.

PavelSharp avatar Oct 18 '25 00:10 PavelSharp

EDIT: This is fixed by later commits. Hiding this comment...

@PavelSharp Not sure, if the code is right or not, but I just wanted to note few things:

This is how pasting żółć looks like when done on master branch (it puts 8 characters in total which seems broken) image and here is the same text, but on PR branch (puts 4 characters as expected so we have an improvement) image

So this one bug seems fixed, however, I have some problems with your PR. Whenever I try to paste the text into "edit box", it always resets the cursor (only visually). It also fails to copy and re-paste the very same text. See the video:

https://github.com/user-attachments/assets/fb46d1e9-3122-40d5-a4d4-b7daa22fdee3

Note that those issues do not appear on master, only on PR branch. So... there is either another bug, or your PR fixes and breaks something at the same time.

I'm running Nuklear/demo/sdl_renderer on Linux with Xorg

sleeptightAnsiC avatar Oct 19 '25 15:10 sleeptightAnsiC

Hi, @sleeptightAnsiC!
Thank you for your comment - I really appreciate the feedback. I carefully reviewed your notes and started digging into the issue. While doing so, I actually discovered two fantastic bugs in Nuklear that (hopefully!) explain all the clipboard and UTF-8 handling problems. Let me show you what I found:

1 Cursor position issue The problem is caused by an incorrect cursor offset in line 27194 (nk_textedit_paste). Obviously, we want to move the cursor by the number of codepoints, not bytes. So instead of:

state->cursor += len;

it should be:

state->cursor += glyphs;

2 Clipboard copy bug - this one is mind-blowing Nuklear calls clip->copy only once (line 28172 in nk_do_edit), like this:

edit->clip.copy(edit->clip.userdata, text, end - begin);

The backend must provide its own implementation of this function. But guess what? It expects the length in bytes, not in codepoints (I double-checked nk_glfw3_clipboard_copy and nk_sdl_clipboard_copy). So that’s the root cause! Luckily, it’s easy to fix. Here’s the corrected version:

text = nk_str_at_const(&edit->string, begin, &unicode, &glyph_len);
text2 = nk_str_at_const(&edit->string, end, &unicode, &glyph_len);

if (edit->clip.copy)
    edit->clip.copy(edit->clip.userdata, text, text2 - text);

After these changes, everything starts working properly, though this is still just a draft fix. Would you like to give it a try? 😇 As an optional step towards full unicode support, it will be useful to take another look at #829.

PavelSharp avatar Oct 19 '25 18:10 PavelSharp

@PavelSharp Are you using LLM (like ChatGPT/Copilot) for writing your comments or/and patches ? I wasn't sure, if you did when I read some of your comments on other PR few days ago, but now I'm 99% sure you did use it on your last comment. https://github.com/Immediate-Mode-UI/Nuklear/issues/840 also feels LLM-generated. Please notice that LLMs make too many mistakes, and you should not use them without clear disclosure.

sleeptightAnsiC avatar Oct 19 '25 19:10 sleeptightAnsiC

The issues that I mentioned in https://github.com/Immediate-Mode-UI/Nuklear/pull/841#issuecomment-3419762564 does not occur after https://github.com/Immediate-Mode-UI/Nuklear/pull/841/commits/3342ffca65d136406337fee255e60b71bc47fe12

But a possibility that you used LLM to fix those worries me a lot...

sleeptightAnsiC avatar Oct 19 '25 19:10 sleeptightAnsiC

@sleeptightAnsiC I may use LLMs to help phrase comments more clearly, but all code and reasoning are my own. Obviously, LLMs are very good at the task of writing in beautiful English and code reviews. This technology saves time for low-responsibility tasks. Isn't that wonderful? Besides that, I always carefully review and verify any outputs before posting.

PavelSharp avatar Oct 19 '25 20:10 PavelSharp

The problem is that https://github.com/Immediate-Mode-UI/Nuklear/pull/841#issuecomment-3419854697 fells like LLM read, found solution and responded, all on its own without human involved. I would rather read whatever you put into LLM than its pseudo-correct output.

This technology saves time for low-responsibility tasks. Isn't that wonderful?

It may save your time when writing this, but it wastes everyone's time when reading and validating it.

@RobLoach Please, be very careful when merging any of this.

sleeptightAnsiC avatar Oct 19 '25 20:10 sleeptightAnsiC

@sleeptightAnsiC Concerns about LLM usage in this context appear to reflect a misunderstanding of how such tools are integrated into software development and debugging processes.

When working with complex code, some of us don't just ask an LLM a single question. They use it to help examine code, test ideas, and debug iteratively. Every choice is thoroughly validated.

Basically, LLMs are used as assistants for summarizing and presenting information. They do not replace human reasoning, verification, or the rigorous search for root causes.

The technical work behind these patches: analysis, testing, and verification is entirely done by me. Saying that an "LLM-generated style" implies a lack of human expertise seriously undervalues the time and professional skill dedicated to producing these issues and PRs.

Please, be very careful when merging any of this.

Such a statement is counterproductive and does not reflect the spirit of open source software development. Instead of such actions, it would be much better to focus on reviewing the proposed PR.

PavelSharp avatar Oct 19 '25 22:10 PavelSharp

Something is really odd here (and I'm not talking about LLM concerns anymore, although I think my concerns are valid)

Per your comment under https://github.com/Immediate-Mode-UI/Nuklear/pull/852#discussion_r2500587803, if utf8/clipboard handling is broken everywhere (in every demo), this really feels like we have a regression here. What we should do first, is not trying to fix it, but instead we should find out why the regression has happened, what caused it and so on... Without this we may break something else, cause another regression, or run into risk that this thing will break again.

~~@PavelSharp did you try to git-bisect this issue before submitting PR/Issue ?~~

EDIT: Okey, fine... I think that the next Pavel's comment explains everything. Although, I still think this whole situation is a bit strange.

sleeptightAnsiC avatar Nov 07 '25 09:11 sleeptightAnsiC

@sleeptightAnsiC So, let’s talk specifically about the clipboard copy bug. The root cause lies in these lines: https://github.com/Immediate-Mode-UI/Nuklear/blob/fcd64f85e54b9d35eee13347748d70fa4d2e134e/src/nuklear_edit.c#L284-L291

Here we clearly see that end - begin is passed as the len argument. Since b and e represent text selection positions in the edit buffer, end - begin gives us the number of Unicode codepoints selected.

However, as I said earlier, all backends treat len as the number of bytes, not the number of codepoints. And that mismatch is the core of this bug.

As for how old this issue is. It’s actually nearly 10 years old. I traced it back to the commit where version 1.0 was released. Before that point, Nuklear was still named zahnrad.

PavelSharp avatar Nov 07 '25 14:11 PavelSharp

@RobLoach I think this PR is finally ready to be merged! All the requested changes have been addressed, and all merge conflicts have been fully resolved. Also, as mentioned in our Reviewers Guide (“Have at least one other person review the changes before merging”), an additional review and testing has already been completed by @sleeptightAnsiC in his comment. Thanks. With this, the clipboard behavior is now properly fixed across all existing backends

PavelSharp avatar Nov 14 '25 16:11 PavelSharp

an additional review and testing has already been completed by sleeptightAnsiC in his https://github.com/Immediate-Mode-UI/Nuklear/pull/841#issuecomment-3419909743.

Wut? :flushed:

I didn't test the latest changes yet. The old comment doesn't really apply here, it's outdated.

I'll try to test this soon.

sleeptightAnsiC avatar Nov 14 '25 17:11 sleeptightAnsiC

What in the ChatGPT :skull: ... I'll be happy to merge once we get the :+1:

Any nit-picks can be fixed as follow ups. Thanks again, all, for all the hard work here.

RobLoach avatar Nov 14 '25 17:11 RobLoach

Given that this PR basically fixes something that was broken since forever, and I'm afraid a lot of people had to make some workarounds related to the bug, it might be safe to mark the fix as a "breaking change" and/or increase the major version (?)

I'm definitely NOT someone who should be reviewing this. A long-time contributor should take an action here and decide.

@rswinkle Would you mind taking a look? or maybe tag someone who you think is appropriate?

sleeptightAnsiC avatar Nov 14 '25 17:11 sleeptightAnsiC

@sleeptightAnsiC

I didn't test the latest changes yet. The old comment doesn't really apply here, it's outdated.

I'm sorry, but the only real code change since that comment is just this small tweak: return nk_str_insert_at_rune(str, pos, text, len); to return nk_str_insert_at_rune(str, pos, text, len) ? len : 0;

Everything else was just rebasing and updating the branch. Honestly, it felt like a tiny detail 🙂

PavelSharp avatar Nov 14 '25 17:11 PavelSharp

I'll try to test this soon.

the only real code change since that https://github.com/Immediate-Mode-UI/Nuklear/pull/841#issuecomment-3419909743 is just this small tweak

Retested and everything seems fine.

Any nit-picks can be fixed as follow ups.

My only nitpick here is that it might be nice to have this change as one single commit (?) Those two commits don't really work correctly without each other, and the third commit is rebuilding the amalgamation, so it may confuse people. This way, if said change ever causes an issue for someone, it will be easier to find it with git-log or git-bisect. But this is just a minor inconvenience, and @RobLoach can easily solve it with "squash+merge", so no worries.

And my previous comment still applies https://github.com/Immediate-Mode-UI/Nuklear/pull/841#issuecomment-3533818168

sleeptightAnsiC avatar Nov 16 '25 16:11 sleeptightAnsiC

Just to note, there have been a couple of previous attempts to address this:

  • #764, as @sleeptightAnsiC pointed in his comment
  • #543, where the author describes the same problem but only fixes it for the GLFW backend.

This PR resolves the root cause. Personally, I believe it's obviously that the copy callback should take the number of bytes instead of glyphs.

PavelSharp avatar Nov 16 '25 17:11 PavelSharp