`dvc list .` throws `ERROR: unexpected error - 'cp950' codec can't decode byte 0xe5 in position 5: illegal multibyte sequence` on Windows 10
Bug Report
Description
Every time I run `dvc list .` on Windows 10 (Traditional Chinese), it throws `ERROR: unexpected error - 'cp950' codec can't decode byte 0xe5 in position 5: illegal multibyte sequence`, so I cannot list anything.
Reproduce
- Ensure you have the Traditional Chinese version of Windows 10
- Install DVC through the Windows Installer release or `pip install dvc`
- Prepare the data
- Execute `dvc list .`
Expected
a list like:
.dvcignore
.gitignore
.vscode
README.md
data
notebook
requirements.txt
results
results.dvc
utils
var
Environment information
Output of dvc doctor:
$ dvc doctor
Additional Information (if any):
PS D:\Projects\application-summarization> dvc list . --verbose
2022-07-26 19:39:47,370 ERROR: unexpected error - 'cp950' codec can't decode byte 0xe5 in position 5: illegal multibyte sequence
------------------------------------------------------------
Traceback (most recent call last):
File "dvc\cli\__init__.py", line 185, in main
File "dvc\cli\command.py", line 36, in do_run
File "dvc\commands\ls\__init__.py", line 31, in run
File "dvc\repo\ls.py", line 46, in ls
File "dvc\repo\ls.py", line 63, in _ls
File "dvc_objects\fs\base.py", line 346, in info
File "dvc\fs\dvc.py", line 402, in info
File "dvc_objects\fs\base.py", line 346, in info
File "dvc\fs\data.py", line 125, in info
File "funcy\objects.py", line 28, in __get__
File "dvc\repo\index.py", line 192, in tree
File "dvc\data_cloud.py", line 41, in get_remote_odb
File "dvc\data_cloud.py", line 64, in _init_odb
File "dvc_objects\fs\base.py", line 78, in __init__
File "dvc_objects\fs\implementations\ssh.py", line 48, in _prepare_credentials
File "sshfs\config.py", line 23, in parse_config
File "asyncssh\config.py", line 387, in load
File "asyncssh\config.py", line 307, in parse
UnicodeDecodeError: 'cp950' codec can't decode byte 0xe5 in position 5: illegal multibyte sequence
------------------------------------------------------------
2022-07-26 19:39:48,432 DEBUG: link type reflink is not available ([Errno 129] no more link types left to try out)
2022-07-26 19:39:48,434 DEBUG: Removing 'D:\Projects\.4aL9LkHtucpqniZTgPreVg.tmp'
2022-07-26 19:39:48,436 DEBUG: Removing 'D:\Projects\.4aL9LkHtucpqniZTgPreVg.tmp'
2022-07-26 19:39:48,438 DEBUG: link type symlink is not available ([WinError 1314] 用戶端沒有這項特殊權限。: 'D:\\Projects\\application-summarization\\.dvc\\cache\\.3E7HmLXcoVQBCU4zrbCpio.tmp' -> 'D:\\Projects\\.4aL9LkHtucpqniZTgPreVg.tmp')
2022-07-26 19:39:48,440 DEBUG: Removing 'D:\Projects\.4aL9LkHtucpqniZTgPreVg.tmp'
2022-07-26 19:39:48,442 DEBUG: Removing 'D:\Projects\application-summarization\.dvc\cache\.3E7HmLXcoVQBCU4zrbCpio.tmp'
2022-07-26 19:39:48,447 DEBUG: Version info for developers:
DVC version: 2.15.0 (exe)
---------------------------------
Platform: Python 3.9.13 on Windows-10-10.0.19043-SP0
Supports:
azure (adlfs = 2022.7.0, knack = 0.9.0, azure-identity = 1.10.0),
gdrive (pydrive2 = 1.14.0),
gs (gcsfs = 2022.5.0),
hdfs (fsspec = 2022.5.0, pyarrow = 8.0.0),
webhdfs (fsspec = 2022.5.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.5.2),
https (aiohttp = 3.8.1, aiohttp-retry = 2.5.2),
s3 (s3fs = 2022.5.0, boto3 = 1.21.21),
ssh (sshfs = 2022.6.0),
oss (ossfs = 2021.8.0),
webdav (webdav4 = 0.9.7),
webdavs (webdav4 = 0.9.7)
Cache types: hardlink
Cache directory: NTFS on D:\
Caches: local
Remotes: ssh
Workspace directory: NTFS on D:\
Repo: dvc, git
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2022-07-26 19:39:48,457 DEBUG: Analytics is enabled.
2022-07-26 19:39:48,461 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\Users\\ALLENL~1\\AppData\\Local\\Temp\\tmp4_9bau78']'
2022-07-26 19:39:48,466 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\Users\\ALLENL~1\\AppData\\Local\\Temp\\tmp4_9bau78']'
If I use a local folder as the default remote, it works fine, so the problem seems to happen only when I use an ssh remote.
According to the traceback, the error is raised in the asyncssh package, so it may be caused by the underlying ssh package that DVC uses.
How can I work around this issue?
It looks like the issue happens when asyncssh tries to parse your OpenSSH configuration file (`~/.ssh/config`). asyncssh loads and parses the file using your system's default encoding (cp950), but the SSH config file is probably encoded in UTF-8. You may need to configure whatever terminal/shell you are using to use UTF-8 as the default encoding.
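
To illustrate the mismatch, here is a minimal Python sketch that does not touch DVC or asyncssh at all. It assumes a hypothetical SSH config whose `Host` line contains Traditional Chinese characters saved as UTF-8; the contents and the helper names `try_decode` and `non_ascii_lines` are made up for illustration and are not taken from the report. It decodes the same bytes with both encodings and lists the lines that contain non-ASCII bytes, which are the ones worth checking in a real `~/.ssh/config`.

```python
import locale
from typing import List, Optional, Tuple

# Hypothetical config contents (an assumption, not the reporter's actual file),
# stored on disk as UTF-8 bytes.
raw = "Host 台北\n    HostName example.com\n".encode("utf-8")


def try_decode(data: bytes, encoding: str) -> Tuple[Optional[str], Optional[UnicodeDecodeError]]:
    """Return (text, None) on success or (None, error) if decoding fails."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


def non_ascii_lines(data: bytes) -> List[int]:
    """Return 1-based numbers of the lines that contain any non-ASCII byte."""
    return [
        number
        for number, line in enumerate(data.splitlines(), start=1)
        if any(byte >= 0x80 for byte in line)
    ]


text_utf8, err_utf8 = try_decode(raw, "utf-8")
text_cp950, err_cp950 = try_decode(raw, "cp950")

# The encoding Python typically falls back to when open() gets no encoding argument.
print("locale default encoding:", locale.getpreferredencoding(False))
print("utf-8 decode error:", err_utf8)
print("cp950 decode error:", err_cp950)
print("lines with non-ASCII bytes:", non_ascii_lines(raw))
```

If the config file turns out to contain such lines, removing the non-ASCII characters or making the default encoding UTF-8 are the two directions suggested by the explanation above.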