`dvc list .` throws `ERROR: unexpected error - 'cp950' codec can't decode byte 0xe5 in position 5: illegal multibyte sequence` on Windows 10

Open allenyllee opened this issue 3 years ago • 2 comments

Bug Report

Description

Every time I run `dvc list .` on my Windows 10 machine (Traditional Chinese), it throws `ERROR: unexpected error - 'cp950' codec can't decode byte 0xe5 in position 5: illegal multibyte sequence`, so I cannot list anything.

Reproduce

  1. Make sure you are on a Traditional Chinese edition of Windows 10
  2. Install DVC through the Windows Installer release or with `pip install dvc`
  3. Prepare the data
  4. Run `dvc list .`

Expected

a list like:

.dvcignore
.gitignore
.vscode
README.md
data
notebook
requirements.txt
results
results.dvc
utils
var

Environment information

Output of dvc doctor:

$ dvc doctor

Additional Information (if any):

PS D:\Projects\application-summarization> dvc list . --verbose           
2022-07-26 19:39:47,370 ERROR: unexpected error - 'cp950' codec can't decode byte 0xe5 in position 5: illegal multibyte sequence
------------------------------------------------------------
Traceback (most recent call last):
  File "dvc\cli\__init__.py", line 185, in main
  File "dvc\cli\command.py", line 36, in do_run
  File "dvc\commands\ls\__init__.py", line 31, in run
  File "dvc\repo\ls.py", line 46, in ls
  File "dvc\repo\ls.py", line 63, in _ls
  File "dvc_objects\fs\base.py", line 346, in info
  File "dvc\fs\dvc.py", line 402, in info
  File "dvc_objects\fs\base.py", line 346, in info
  File "dvc\fs\data.py", line 125, in info
  File "funcy\objects.py", line 28, in __get__
  File "dvc\repo\index.py", line 192, in tree
  File "dvc\data_cloud.py", line 41, in get_remote_odb
  File "dvc\data_cloud.py", line 64, in _init_odb
  File "dvc_objects\fs\base.py", line 78, in __init__
  File "dvc_objects\fs\implementations\ssh.py", line 48, in _prepare_credentials
  File "sshfs\config.py", line 23, in parse_config
  File "asyncssh\config.py", line 387, in load
  File "asyncssh\config.py", line 307, in parse
UnicodeDecodeError: 'cp950' codec can't decode byte 0xe5 in position 5: illegal multibyte sequence    
------------------------------------------------------------
2022-07-26 19:39:48,432 DEBUG: link type reflink is not available ([Errno 129] no more link types left to try out)
2022-07-26 19:39:48,434 DEBUG: Removing 'D:\Projects\.4aL9LkHtucpqniZTgPreVg.tmp'
2022-07-26 19:39:48,436 DEBUG: Removing 'D:\Projects\.4aL9LkHtucpqniZTgPreVg.tmp'
2022-07-26 19:39:48,438 DEBUG: link type symlink is not available ([WinError 1314] 用戶端沒有這項特殊權限。: 'D:\\Projects\\application-summarization\\.dvc\\cache\\.3E7HmLXcoVQBCU4zrbCpio.tmp' -> 'D:\\Projects\\.4aL9LkHtucpqniZTgPreVg.tmp')
2022-07-26 19:39:48,440 DEBUG: Removing 'D:\Projects\.4aL9LkHtucpqniZTgPreVg.tmp'
2022-07-26 19:39:48,442 DEBUG: Removing 'D:\Projects\application-summarization\.dvc\cache\.3E7HmLXcoVQBCU4zrbCpio.tmp'
2022-07-26 19:39:48,447 DEBUG: Version info for developers:
DVC version: 2.15.0 (exe)
---------------------------------
Platform: Python 3.9.13 on Windows-10-10.0.19043-SP0
Supports:
        azure (adlfs = 2022.7.0, knack = 0.9.0, azure-identity = 1.10.0),
        gdrive (pydrive2 = 1.14.0),
        gs (gcsfs = 2022.5.0),
        hdfs (fsspec = 2022.5.0, pyarrow = 8.0.0),
        webhdfs (fsspec = 2022.5.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.5.2),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.5.2),
        s3 (s3fs = 2022.5.0, boto3 = 1.21.21),
        ssh (sshfs = 2022.6.0),
        oss (ossfs = 2021.8.0),
        webdav (webdav4 = 0.9.7),
        webdavs (webdav4 = 0.9.7)
Cache types: hardlink
Cache directory: NTFS on D:\
Caches: local
Remotes: ssh
Workspace directory: NTFS on D:\
Repo: dvc, git

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2022-07-26 19:39:48,457 DEBUG: Analytics is enabled.
2022-07-26 19:39:48,461 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\Users\\ALLENL~1\\AppData\\Local\\Temp\\tmp4_9bau78']'
2022-07-26 19:39:48,466 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\Users\\ALLENL~1\\AppData\\Local\\Temp\\tmp4_9bau78']'

allenyllee, Jul 26 '22 11:07

If I use a local folder as the default remote, it works fine, so the problem seems to happen only when I use an SSH remote. According to the traceback, the error is raised inside the asyncssh package, so it may be caused by the SSH library underlying DVC.

How can I work around this issue?

allenyllee, Jul 26 '22 12:07

It looks like the issue happens when asyncssh tries to parse your OpenSSH configuration file (`~/.ssh/config`). asyncssh loads and parses the file using your terminal's system encoding (cp950), but the SSH config file is probably in UTF-8. You may need to set whatever terminal/shell you are using to use UTF-8 as the default encoding.
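To illustrate the mismatch, here is a minimal sketch in plain Python. The config text and the host alias `公司` are made up for the example (the actual contents of the reporter's `~/.ssh/config` are not known), and `try_decode` is a hypothetical helper, not part of DVC or asyncssh. It only shows that the same UTF-8 bytes decode cleanly as UTF-8 but can fail under cp950.

```python
# Hypothetical SSH config content; the host alias is invented for illustration.
config_text = "Host 公司\n    HostName example.com\n"

# Bytes as they would sit on disk if the file was saved as UTF-8.
raw = config_text.encode("utf-8")


def try_decode(data: bytes, encoding: str):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Decoding with the encoding the file was written in succeeds.
text_utf8, err_utf8 = try_decode(raw, "utf-8")

# Decoding the same bytes as cp950 (Traditional Chinese Windows code page)
# fails, because the UTF-8 byte pairs are not valid cp950 sequences here.
text_cp950, err_cp950 = try_decode(raw, "cp950")

print("utf-8:", "ok" if err_utf8 is None else err_utf8)
print("cp950:", "ok" if err_cp950 is None else err_cp950)
```

To check a real file, the same helper could be applied to the bytes read with `open(path, "rb")`. If the file turns out to contain non-ASCII characters (for example in a comment or host alias), removing them or making sure the reading side uses UTF-8 should avoid the error. Python also has a UTF-8 mode (the `PYTHONUTF8=1` environment variable) that changes the default text encoding, though whether the packaged DVC executable honors it is an assumption that has not been verified here.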

pmrowla, Aug 02 '22 12:08