Peculiar Dos-Unix Unicode bug

Open sergeevabc opened this issue 3 years ago • 13 comments

Background

Windows 7 (chcp 65001), micro 2.0.8 (cfcb2e45).

Aggrrhh

  1. Run micro create.cmd, add a few lines (mkdir "Hello dear" & mkdir "Привет дорогой"), then Save and Quit.

  2. Run create.cmd. We see that 2 dirs are successfully created. Delete them for now.

  3. Run micro create.cmd again. This time it’s not dos, suddenly it’s unix. Add a new line mkdir 123, Save, Quit.

  4. This time create.cmd won’t be able to create one dir (with a Russian name) as follows

    '�вет' is not recognized as an internal or external command, operable program or batch file.
    

Workaround

If we run micro create.cmd once again, then CTRL+E, then set fileformat dos, then Save, it will work.
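
For anyone who wants to apply the same workaround outside the editor, here is a minimal sketch that converts a script's line endings to CRLF. The function name `to_dos` and the sample script contents are illustrative assumptions, not part of micro. It works on raw bytes so the UTF-8 encoded Cyrillic name is left untouched.

```python
def to_dos(data: bytes) -> bytes:
    """Return data with every LF or CRLF line ending written as CRLF."""
    # Collapse existing CRLF pairs to LF first so they are not doubled.
    unix = data.replace(b"\r\n", b"\n")
    return unix.replace(b"\n", b"\r\n")


# Illustrative stand-in for create.cmd saved with Unix line endings.
script = 'mkdir "Hello dear"\nmkdir "Привет дорогой"\nmkdir 123\n'.encode("utf-8")
converted = to_dos(script)
```

To use it on a real file, read it with `open(path, "rb")` and write the result back with `open(path, "wb")`, so Python does not translate newlines on its own. Lone CR characters (old Mac style) are not handled by this sketch.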

Conclusion

Something is not right here, because create.cmd with the same contents created by other editors works as expected.

sergeevabc avatar Mar 20 '21 12:03 sergeevabc

@zyedidia, are you there? Err… alive?

sergeevabc avatar Mar 29 '21 21:03 sergeevabc

Yep, mostly still alive! I'm not sure what the issue here might be. Do you ever open files that shouldn't be in the dos format? Is the fileformat option specified in ~/.config/micro/settings.json? You can also specify the fileformat as a flag: micro -fileformat dos create.cmd. Maybe this will be helpful in the short term for managing the issue.

zyedidia avatar Mar 29 '21 21:03 zyedidia

$ rg fileformat %UserProfile%/.config/micro/settings.json
6:    "fileformat": "dos", 

However, I found an eol-new-line (or something like that) setting in the same file, which was set to off. After setting it to on and running Micro again, things got better in the aforementioned scenario: no unix anymore, dos only. Perhaps it's a clue: micro might identify a file as unix if it has no final EOL. Bonus oddity: the eol-new-line preference then disappeared from settings.json.

sergeevabc avatar Mar 31 '21 03:03 sergeevabc

Unix line endings were detected because of a bug: when a file had no newline at its end, it was detected as having Unix line endings even if it actually used DOS line endings. The bug was fixed in c55fb33, in 2.0.13. The line endings specified in fileformat are used only when opening an empty file, so a non-empty file may still be opened with Unix line endings even if fileformat is dos.
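
As a rough illustration of detection that does not depend on a trailing newline, here is a minimal sketch. It is not micro's actual code (micro is written in Go, and commit c55fb33 is not reproduced here); the function name `detect_fileformat` and the choice to look at the first line ending are assumptions. Falling back to the default for any content without a line ending, not only for an empty file, is also an assumption of this sketch.

```python
def detect_fileformat(data: bytes, default: str = "unix") -> str:
    """Guess 'dos' or 'unix' from the first line ending found in data."""
    pos = data.find(b"\n")
    if pos == -1:
        # No line ending at all (for example an empty file):
        # fall back to the configured default.
        return default
    # An LF preceded by CR is a DOS line ending.
    if pos > 0 and data[pos - 1:pos] == b"\r":
        return "dos"
    return "unix"
```

With this approach, content such as `b"a\r\nb"` (DOS line endings, no final newline) is reported as dos, which is the case the bug described above got wrong.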

eofnewline was not in settings.json after it was set to true because options are not written to the file when they are set to their default value.
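
The "defaults are not written" behavior can be pictured with a small sketch. The names `DEFAULTS` and `settings_to_write` are hypothetical, and the two default values are only the ones mentioned in this thread (eofnewline true, fileformat unix); this is not how micro itself stores settings.

```python
import json

# Defaults as described in this thread; assumed for illustration only.
DEFAULTS = {"eofnewline": True, "fileformat": "unix"}


def settings_to_write(current: dict) -> dict:
    """Keep only options whose value differs from the default."""
    return {
        key: value
        for key, value in current.items()
        if key not in DEFAULTS or DEFAULTS[key] != value
    }


saved = json.dumps(settings_to_write({"eofnewline": True, "fileformat": "dos"}))
```

Here `saved` contains only the fileformat entry, which matches the observation that the option vanished from settings.json once it was set back to its default.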

niten94 avatar Jan 29 '24 14:01 niten94

@niten94, tried once again with 2.0.13 under Windows 7 x64 (chcp 65001) and no configuration file. Micro saved create.cmd with Unix line endings and the second command did not create a folder. It worked only after changing line endings to Windows format, so frustrating.

sergeevabc avatar Jan 29 '24 14:01 sergeevabc

I tested a bit on Windows 10, following the steps from your original comment with default settings. I think the file was created with Unix line endings because fileformat is unix by default on Windows too. To have newly created files use DOS line endings, press Ctrl+E and enter set fileformat dos. To change the line endings of a file that is already open, enter setlocal fileformat dos.

I am sorry that there are parts I have not explained well.

niten94 avatar Jan 29 '24 14:01 niten94

Isn't it more appropriate to set line endings specific to OS automatically when creating a new file? I believe this is what cross-platform text editors like Sublime Text do.

sergeevabc avatar Jan 29 '24 16:01 sergeevabc

Err… Hello?

sergeevabc avatar Feb 11 '24 12:02 sergeevabc

I think it is more appropriate for files to be created with OS-specific line endings by default, but I have not been able to figure out how that could be changed, so I have not said anything about the issue. I will still try to write about it sometime.

niten94 avatar Feb 11 '24 13:02 niten94

I've uploaded PR #3141 which changes the default line ending on Windows from unix to dos. Note that I haven't tested it on Windows, since I'm not using Windows. Feel free to test it.
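
The idea of an OS-dependent default can be sketched as follows. This is not the code from the PR (micro is written in Go); the function name `default_fileformat` is hypothetical, and using `sys.platform` to detect Windows is simply the usual way to do it in Python.

```python
import sys


def default_fileformat(platform: str = sys.platform) -> str:
    """Pick the default line-ending format for newly created files."""
    # sys.platform is "win32" on Windows, including 64-bit builds.
    return "dos" if platform.startswith("win") else "unix"
```

Only the default changes in such a scheme: a user who explicitly sets fileformat would still get whatever they configured.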

dmaluka avatar Feb 14 '24 00:02 dmaluka

@dmaluka, I'll be happy to test your PR in the form of a binary file for Windows 7 x64 attached to this thread. Being an ordinary user and not a developer, I am afraid to touch a compiler that is unfamiliar to me.

sergeevabc avatar Feb 14 '24 00:02 sergeevabc

Attaching micro.zip with the Windows binary inside. I've just built it via GOOS=windows make (from the latest master branch + my PR). I'm not sure if I did it correctly, so I don't know if it even starts. So please test it.

dmaluka avatar Feb 14 '24 00:02 dmaluka

$ ver
Microsoft Windows [Version 6.1.7601]

$ micro message.txt

$ file message.txt
message.txt: Unicode text, UTF-8 text, with CRLF line terminators

Since a new file is no longer created with LF terminators (Unix style) on Windows, but with CRLF, we can conclude that it works as expected. Thank you, @dmaluka.

When can we expect a regular Micro version with this fix?

sergeevabc avatar Feb 14 '24 04:02 sergeevabc