svn-scm
Try using a different encoding package
I propose we test the use of chardet for detecting character encoding. The new package will sit behind a settings option, svn.experimental.detect_encoding.
@Yanpas @Blashaq @nihlet @michelegalante
Can you all update to 2.10.0, set svn.experimental.detect_encoding to true, and see if you have any encoding issues, please?
Can you also remove "files.encoding" and "svn.default.encoding" as well, to make sure?
Let me know if you have any issues.
@jacobweber
Thanks for thinking of me. :) I'm still fairly convinced that my issue isn't related to character encoding, though. In fact, I've recently been seeing it with git as well, which makes me think it may actually be a vscode issue.
This extension and git use the exact same encoding detection.
Well, I'll give it a try.
I have a file in cp1251 encoding. Whether I set or unset detect_encoding, the diff lines appear if I set cp1251 encoding in the editor (with UTF8 there is no diff, but a lot of � symbols).
Diff in item history view does not respect encodings
Can you send me said file?
@Yanpas can you try 2.10.1?
Here it is in base64:
IyEvYmluL2Jhc2gKCkRBVEFfRElSPSIkMSIKV0VCX0RJUj0iJDIiClJPT1RfRElSPSIkMyIKClNF
TEY9IiQoIHdoaWNoICIkMCIgKSIKQklOX0RJUj0iJCggZGlybmFtZSAiJFNFTEYiICkiCgpta2Rp
ciAtcCAkV0VCX0RJUgpjcCAtcmYgJEJJTl9ESVIvaW1hZ2VzICRXRUJfRElSLy4uCiMg5ODt7fvl
LCDo5yDq7vLu8Pv1IOPl7eXw6PDz5fLx/yDu8vfl8iwg7O7j8/Ig7/Do4+7k6PL88f8g5Ov/IO7y
6+Dk6ugg6CDv8Ogg8e7n5ODt6Ogg5O7v7uvt6PLl6/zt+/UKIyDu8vfl8u7iIOTr/yDx8uDw+/Ug
8eHu8O7qCmNwICREQVRBX0RJUi8qLmNzdiAkV0VCX0RJUgoKJEJJTl9ESVIvZ2VuX3BhZ2VfYnVp
bGQuc2ggJERBVEFfRElSICRXRUJfRElSCiMg8ePl7eXw6PDz5ewganVuaXQueG1sIOIg6u7w7eUs
IOIg6u7y7vDu7CDh8+Tl8iDo7fTu8Ozg9uj/IO4g8uXx8uD1LCD38u7h+yBqZW5raW5zIO/u5PXi
4PLo6yD98vMg6O307vDs4Pbo/gokQklOX0RJUi9nZW5feG1sX3Rlc3RzLnNoICREQVRBX0RJUiAk
Uk9PVF9ESVIK
With 2.10.1 I see blue lines with both 1251 and utf8 encodings.
BTW I disabled vscode's encoding autodetection since it's very buggy, often treats utf8 files as cp1252
https://github.com/microsoft/vscode/issues/85480 https://github.com/microsoft/vscode/issues/33720
Thanks for that, @Yanpas. I will see if I can fix it today.
jschardet guesses the correct encoding for your file. It is the combination of the default encoding being utf8 and auto-guess being disabled that stops it from running.
https://github.com/JohnstonCode/svn-scm/blob/master/src/svnRepository.ts#L313 https://github.com/JohnstonCode/svn-scm/blob/master/src/svnRepository.ts#L336
Maybe an encoding detection priority list? https://github.com/microsoft/vscode/issues/85480#issuecomment-579880763
Can you give 2.10.2 a try? This adds a new config option, svn.experimental.encoding_priority, which takes an array of encoding types and prioritises them based on order. For example, ["UTF-8", "GB18030", "windows-1251"] will prioritise UTF-8, then GB18030, etc. If there are no matches it will just return null, and the extension will use either svn.default.encoding or UTF-8.
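The priority lookup described above can be sketched roughly as follows. This is only an illustration of the idea, not the extension's actual code: the EncodingGuess shape and the names pickByPriority and resolveEncoding are hypothetical, and it assumes the detector hands back a list of candidate encodings.

```typescript
// Hypothetical shape of one detector candidate (not taken from svn-scm).
interface EncodingGuess {
  encoding: string;
  confidence: number;
}

// Compare encoding names case-insensitively, ignoring surrounding whitespace.
function normalize(name: string): string {
  return name.trim().toLowerCase();
}

// Walk the priority list in order and return the first encoding that the
// detector also proposed. Returns null when nothing matches.
function pickByPriority(
  guesses: EncodingGuess[],
  priority: string[]
): string | null {
  for (let i = 0; i < priority.length; i++) {
    for (let j = 0; j < guesses.length; j++) {
      if (normalize(guesses[j].encoding) === normalize(priority[i])) {
        return guesses[j].encoding;
      }
    }
  }
  return null;
}

// Fall back to a configured default (standing in for svn.default.encoding),
// and finally to UTF-8, when the priority lookup finds no match.
function resolveEncoding(
  guesses: EncodingGuess[],
  priority: string[],
  defaultEncoding?: string
): string {
  const picked = pickByPriority(guesses, priority);
  if (picked !== null) {
    return picked;
  }
  return defaultEncoding ? defaultEncoding : "UTF-8";
}
```

With candidates windows-1251 and GB18030 and the priority list from the example, the lookup picks GB18030, because it comes earlier in the list than windows-1251 and UTF-8 was not among the candidates.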
This helps
Things would be much easier if vscode exposed the chosen encoding. Maybe request an API extension?
Glad that is working for you. As per your linked issues, I think they need to decide on how they will handle it internally before they do any API work.
There still seems to be a problem: the function createTempSvnRevisionFile in temp_svn_fs.ts seems to break characters, because it converts the encoded buffer to a JS string before saving the file. A JS string holds decoded text, not the original bytes, so writing it back out changes the file's bytes whenever the content was not valid UTF-8. Working on buffers in this case solves the problem.
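To illustrate why going through a string loses data, here is a small self-contained sketch. It does not reproduce createTempSvnRevisionFile; the function names are hypothetical, and it assumes the string conversion decodes the bytes as UTF-8. The sample bytes spell a short Cyrillic word in windows-1251.

```typescript
// Four bytes of windows-1251 text; as UTF-8 this sequence is invalid.
const cp1251Bytes = new Uint8Array([0xf2, 0xe5, 0xf1, 0xf2]);

// Lossy path: decode the bytes as UTF-8 into a string, then encode again.
// Invalid sequences are replaced with U+FFFD during decoding, so the
// original byte values are gone before anything is written.
function roundTripThroughUtf8String(bytes: Uint8Array): Uint8Array {
  const text = new TextDecoder("utf-8").decode(bytes);
  return new TextEncoder().encode(text);
}

// Lossless path: pass the raw bytes along without decoding them.
function passBytesThrough(bytes: Uint8Array): Uint8Array {
  return bytes.slice();
}

// Byte-wise comparison helper.
function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) {
    return false;
  }
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) {
      return false;
    }
  }
  return true;
}

const viaString = roundTripThroughUtf8String(cp1251Bytes);
const viaBuffer = passBytesThrough(cp1251Bytes);

console.log("string round trip preserved bytes:", sameBytes(viaString, cp1251Bytes));
console.log("buffer path preserved bytes:", sameBytes(viaBuffer, cp1251Bytes));
```

The first comparison reports false and the second true, which matches the symptom of � symbols appearing once the file has passed through a string.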
I will create a pull request with this change for you to review.
thanks @Blashaq
I'm still seeing the diffs, using the latest version of the plugin.
> Can you give 2.10.2 a try? This adds a new config option, svn.experimental.encoding_priority, which takes an array of encoding types and prioritises them based on order. For example, ["UTF-8", "GB18030", "windows-1251"] will prioritise UTF-8, then GB18030, etc. If there are no matches it will just return null, and the extension will use either svn.default.encoding or UTF-8.
Have you given this a try?
No, I was just using svn.experimental.detect_encoding: true. I just added "svn.experimental.encoding_priority": ["UTF-8"] and reloaded the window. I'll keep an eye on it and see if it appears again.
Still seeing it -- same symptoms (log shows a "svn info" with no "svn cat").