charset=utf-8 triggers double-encoding with fileencodings=latin1

Open mowgli opened this issue 4 years ago • 7 comments

Recently I added an .editorconfig with the following content to my ansible directory:

[*]
charset = utf-8

Usually I have the setting set fileencodings=ucs-bom,latin1 in my vim config as most of my files are in latin1.

Now, when I open a YAML file inside the ansible tree that already contains UTF-8 German umlauts, it is opened as latin1 and then converted to utf-8 every time I open it. That leads to ever-growing sequences like Ã~CÂ~CÃ~B¼ (in this example for ü).

I get the same content when I (outside of that tree) open a file and put a German umlaut into it, then repeat the following several times: save, quit, reopen, :set fenc=utf-8, :w. So it seems to me that this addon just sets fenc when opening a file inside a tree. The correct way would be to set fileencodings before opening the buffer, or to reopen it with :e ++enc=utf8
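
The growth described above can be reproduced outside of vim. The following is a minimal Python sketch, not the plugin's code: the helper name `reopen_and_save` is hypothetical, and it assumes each open/save cycle amounts to "decode the bytes as latin1, write them back as UTF-8".

```python
# Sketch of the reported round trip (assumption: one cycle = read as latin1, write as UTF-8).
def reopen_and_save(raw: bytes) -> bytes:
    text = raw.decode("latin-1")   # wrong guess: every byte becomes one character
    return text.encode("utf-8")    # target encoding utf-8 is applied on write

data = "ü".encode("utf-8")         # b'\xc3\xbc', the correct on-disk form
sizes = [len(data)]
for _ in range(3):
    data = reopen_and_save(data)
    sizes.append(len(data))

print(sizes)  # [2, 4, 8, 16]
```

Every byte of a multi-byte UTF-8 sequence is 0x80 or above, so each cycle turns it into a two-byte sequence and the umlaut doubles in size.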

mowgli avatar Apr 04 '20 09:04 mowgli

Thanks for reporting! Would you please attach a ZIP of a minimal tree (including at least .editorconfig and a YAML file) that reproduces the problem? Also, what vim and version, OS and version, and plugin version or commit? Much appreciated!

cxw42 avatar Apr 04 '20 15:04 cxw42

Well, I will. But I am not sure whether that helps, as it interacts with the vim config. (Also, HOW can I upload a .tar.xz to GitHub!?)

However, as an example I can give you my git log -p from the time I started using .editorconfig to the point when I found this bug. You can see the bug in the YAML file roles/security/tasks/main.yaml.

<cut>
commit 5d288a9
Author: Klaus Ethgen <[email protected]>
Date:   Sun Mar 22 13:18:19 2020 +0100

    Firefox ist wirklich unglaublich renitent

diff --git a/roles/security/tasks/main.yaml b/roles/security/tasks/main.yaml
index 88c2516..6eaeb06 100644
--- a/roles/security/tasks/main.yaml
+++ b/roles/security/tasks/main.yaml
@@ -24,7 +24,7 @@
     state: started
   when: ansible_virtualization_role|d('none') == 'guest'
 
-- name: Installiere libjson-perl für capabilities
+- name: Installiere libjson-perl für capabilities
   tags: [install, capabilities]
   package:
     name: "{{ json_perl_pkg|default('libjson-perl') }}"
@@ -46,7 +46,7 @@
     state: present
   when: ansible_os_family == 'Debian' or ansible_lsb.id == 'Devuan'
 
-- name: Sysctl-Settings für links
+- name: Sysctl-Settings für links
   template:
     dest: /etc/sysctl.d/local_security.conf
     src: local_security.conf.j2
@@ -129,17 +129,19 @@
     - google-analytics.com
     - noscript.net
     - facebook.com
+    - firefox.settings.services.mozilla.com
+    - content-signature-2.cdn.mozilla.net
   when: "'workstations' in group_names"
 
 # TODO: Missing more /etc/hosts management
 
-# Fix für CVE-2017-6074
+# Fix für CVE-2017-6074
 - name: Workaround for CVE-2017-6074, DCCP vulnerability
   file:
     path: /etc/modprobe.d/disable-dccp.conf
     state: absent
 
-# Fix für CVE-2017-2636
+# Fix für CVE-2017-2636
 - name: Workaround for CVE-2017-2636, HDLC vulnerability
   file:
     path: /etc/modprobe.d/disable-hdlc.conf
@@ -230,13 +232,13 @@
     regexp: '^:0'
   when: wdm_config_stat.stat.exists
 
-# Fix für CVE-2019-11815
+# Fix für CVE-2019-11815
 - name: Workaround for CVE-2019-11815, RDS vulnerability
   file:
     path: /etc/modprobe.d/disable-rds.conf
     state: absent
 
-# Fix für CVE-2019-11477
+# Fix für CVE-2019-11477
 - name: Disable SACK
   sysctl:
     name: net.ipv4.tcp_sack
diff --git a/roles/security/templates/firefox-security.js.j2 b/roles/security/templates/firefox-security.js.j2
index 022d398..55a5eef 100644
--- a/roles/security/templates/firefox-security.js.j2
+++ b/roles/security/templates/firefox-security.js.j2
@@ -305,8 +305,13 @@ pref("app.normandy.api_url", "", locked);
 pref("app.normandy.enabled", false, locked);
 pref("app.shield.optoutstudies.enabled", false, locked);
 
-// Disable WebRTC
+// Disable WebRTC und OpenH264
 pref("media.peerconnection.enabled", false, locked);
+pref("media.peerconnection.turn.disable", true, locked);
+pref("media.peerconnection.use_document_iceservers", false, locked);
+pref("media.peerconnection.video.enabled", false, locked);
+pref("media.peerconnection.identity.timeout", 1, locked);
+pref("media.gmp-gmpopenh264.enabled", false, locked);
 
 // Disable Lurking to Clipboard
 pref("dom.event.clipboardevents.enabled", false, locked);
@@ -396,3 +401,7 @@ pref("services.settings.poll_interval", -1, locked);
 pref("services.sync.telemetry.submissionInterval", -1, locked);
 
 pref("browser.safebrowsing.reportPhishURL", "", locked);
+pref("browser.startup.homepage_override.mstone", "ignore", locked);
+
+// Send Media to device
+pref("browser.casting.enabled", false, locked);

<cut>

My vim version on that particular system is 8.2.0439; it is the Debian version in sid. And, yes, Debian/Devuan Linux amd64.

The plugin is the most current master branch version.

mowgli avatar Apr 04 '20 16:04 mowgli

What crap. They don't support .xz files on GitHub. tree.tar.gz

mowgli avatar Apr 04 '20 16:04 mowgli

OK - I see the umlaut (UTF-8 ü = 0xc3 0xbc) in the tree:

0000360: 6f 6e 2d 70 65 72 6c 20 66 c3 bc 72 20 63 61 70  on-perl f..r cap

I can reproduce by:

  • Untarring
  • cd ansible
  • vim
  • :set fileencodings=ucs-bom,latin1
  • :e roles/security/tasks/main.yaml

And :set fenc returns fileencoding=utf-8.

I get the same even if I :set fileencodings=ucs-bom,latin1,utf-8.

If I :set fileencodings=utf-8,ucs-bom,latin1, I do not experience the problem. Would that be an acceptable workaround with the files you typically open in this directory?

cxw42 avatar Apr 04 '20 16:04 cxw42

Well, that is exactly what is expected. fileencodings works that way. And that is the reason I cannot just change the order in that setting.

utf-8 at the end will never work, as latin1 is a full 8-bit charset (which utf-8 isn't), so every byte is a valid latin1 char. Putting utf-8 in front would treat every new file, and every file that is plain ASCII to begin with, as utf-8, which is wrong in most situations for me.
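
To illustrate the ordering argument, here is a simplified Python model of "try each encoding in order and take the first one that decodes without error". It is only a sketch of the idea, not vim's actual detection logic, and the function name `detect` is hypothetical.

```python
def detect(raw: bytes, candidates: list) -> str:
    """Return the first candidate encoding that decodes raw without error."""
    for name in candidates:
        try:
            raw.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate matched")

utf8_umlaut = "für".encode("utf-8")      # b'f\xc3\xbcr'
latin1_umlaut = "für".encode("latin-1")  # b'f\xfcr'
plain_ascii = b"name: test"

print(detect(utf8_umlaut, ["latin-1", "utf-8"]))    # latin-1 (utf-8 is never reached)
print(detect(utf8_umlaut, ["utf-8", "latin-1"]))    # utf-8
print(detect(latin1_umlaut, ["utf-8", "latin-1"]))  # latin-1 (0xfc is not valid UTF-8 here)
print(detect(plain_ascii, ["utf-8", "latin-1"]))    # utf-8 (plain ASCII is valid UTF-8)
```

The first line shows why a trailing utf-8 entry has no effect, and the last line shows the side effect of putting utf-8 first: plain-ASCII files are then treated as utf-8.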

And you will have the same trouble vice versa when setting charset to latin1 in .editorconfig: the file is opened as utf-8 and then converted to latin1. Although I have to say that this is less likely, as not every char in utf-8 can be converted to latin1. But that would end in read-only buffers in vim.
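
The point that not every character can be converted to latin1 can be checked with a short Python sketch. The helper name `can_save_as_latin1` is hypothetical, and the sketch says nothing about how vim itself reacts to such a failure.

```python
def can_save_as_latin1(text: str) -> bool:
    """True if every character in text has a latin1 code point (U+0000 to U+00FF)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

print(can_save_as_latin1("für"))  # True: ü is U+00FC, inside latin1
print(can_save_as_latin1("5 €"))  # False: € is U+20AC, outside latin1
```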

mowgli avatar Apr 04 '20 17:04 mowgli

By the way, the cause of this bug is also the reason why a fenc setting in a modeline would never work.

The only way to do it correctly would be to set fileencodings before the file is opened at all, or to reopen it with the ++enc setting.
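
The difference between the two approaches can be sketched in Python. It assumes that changing the encoding after loading only changes how the already-decoded text is written out, whereas reopening reads the original bytes again with the intended encoding. The variable names are illustrative only.

```python
raw = "für".encode("utf-8")            # bytes on disk

# Changing the target encoding after a wrong guess: the damage is written back.
buffer_text = raw.decode("latin-1")
relabelled = buffer_text.encode("utf-8")

# Reading the same bytes again with the intended encoding: the round trip is lossless.
reread_text = raw.decode("utf-8")
resaved = reread_text.encode("utf-8")

print(relabelled == raw)  # False
print(resaved == raw)     # True
```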

mowgli avatar Apr 04 '20 17:04 mowgli

OK - thanks for the additional detail! Marked as a bug; PRs welcome :) . Right now we are working through releasing the existing code under a stable version number (#143), after which I will look at this if someone else doesn't beat me to it.

cxw42 avatar Apr 05 '20 00:04 cxw42