django-import-export
Suggestion: encoding conversion
Sir, I have been thinking about this problem in Django:
UnicodeDecodeError at /admin/core/book/import/
'utf8' codec can't decode byte 0xc4 in position 0: invalid continuation byte
Then I used vim's set fileencoding and found that CSV files exported by django-import-export are encoded as UTF-8. Searching the internet, I also found that importing a UTF-8 encoded file works fine.
Maybe the problem occurs because my OS uses the Chinese language. Would you please modify the project so that, if the import file or export file is not encoded as UTF-8, it is first converted to UTF-8 before the rest of the code runs?
I have written some code to do the encoding conversion:
import chardet


def convertEncoding(from_encode, to_encode, old_filepath, target_file):
    # Decode each line with the source encoding and write it back out
    # in the target encoding; newline='' keeps line endings unchanged.
    with open(old_filepath, 'r', encoding=from_encode, newline='') as f1, \
            open(target_file, 'w', encoding=to_encode, newline='') as f2:
        for line in f1:
            f2.write(line)


# Read the raw bytes so that chardet can guess the source encoding.
with open('1234.csv', 'rb') as convertFile:
    data = convertFile.read()

convertEncoding(chardet.detect(data)['encoding'], 'utf-8', '1234.csv', '1234_bak.csv')
I am a newbie, so my code is not concise. Would you please consider this and integrate such a routine into the project? I like this project very much; thank you for your work!
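A dependency-free variant of the conversion above could try a short list of candidate encodings in order instead of calling chardet. This is only a sketch: the helper name decode_with_fallback and the candidate list are assumptions made for illustration and are not part of django-import-export.

```python
def decode_with_fallback(data, encodings=("utf-8", "gb2312")):
    """Return (text, encoding) for the first candidate that decodes `data`.

    Hypothetical helper; the candidate list is an assumption and should be
    ordered from the strictest encoding to the most permissive one.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Bytes as they would appear in a GB2312-encoded CSV file.
raw = "书名,作者\n".encode("gb2312")
text, used = decode_with_fallback(raw)

# Normalised UTF-8 form, which is what the import reportedly accepts.
utf8_bytes = text.encode("utf-8")
```

Guessing by trial is less reliable than knowing the real encoding of the file, so an explicit setting is preferable whenever the source encoding is known.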
Have you checked the docs, specifically the settings for from_encoding and to_encoding?
Yes sir, I tested it in my code; in my OS environment it still causes the error.
Does from_encoding='utf-8' mean that the import file must be encoded as UTF-8?
I also changed the parameter to from_encoding='GB2312' to match my CSV file's encoding, and it still causes an error. When I convert the file to UTF-8, it runs well.
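The reported error can be reproduced outside Django. The sketch below assumes the file began with Chinese text such as "你好" saved as GB2312 (the actual file content is not shown in this thread) and decodes those bytes as UTF-8.

```python
# "你好" encoded as GB2312 starts with byte 0xc4, as in the reported error.
gb_bytes = "你好".encode("gb2312")

error_reason = None
error_start = None
try:
    gb_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0xc4 opens a two-byte UTF-8 sequence, but the byte that follows is
    # not a valid continuation byte, so decoding fails at position 0.
    error_reason = exc.reason
    error_start = exc.start

# Decoding with the encoding the bytes were written in succeeds.
decoded = gb_bytes.decode("gb2312")
```

This suggests the file itself is fine and that the bytes are being decoded as UTF-8 somewhere before from_encoding takes effect.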
Please see how data is encoded and decoded:
https://github.com/bmihelac/django-import-export/blob/master/import_export/admin.py
I also have the same problem under a Chinese OS, even though the file content is encoded as UTF-8.
Could it be a problem with the OS environment settings? Should the file be opened with open(..., encoding="...") rather than with just the filename and read mode, to solve that?
My traceback:
Traceback:
File "D:\My Documents\Workspaces\xbcWeb\xbcWeb\env\lib\site-packages\django\core\handlers\base.py" in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "D:\My Documents\Workspaces\xbcWeb\xbcWeb\env\lib\site-packages\django\utils\decorators.py" in _wrapped_view
    response = view_func(request, *args, **kwargs)
File "D:\My Documents\Workspaces\xbcWeb\xbcWeb\env\lib\site-packages\django\views\decorators\cache.py" in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
File "D:\My Documents\Workspaces\xbcWeb\xbcWeb\env\lib\site-packages\django\contrib\admin\sites.py" in inner
    return view(request, *args, **kwargs)
File "D:\My Documents\Workspaces\xbcWeb\xbcWeb\env\lib\site-packages\import_export\admin.py" in import_action
    data = uploaded_import_file.read()
I tried the following change and it fixed the exception. Please apply it as a patch if it is suitable; I am not familiar with GitHub.
In process_import():
    import_file = open(import_file_name, input_format.get_read_mode(), encoding=self.from_encoding)
In import_action():
    with open(uploaded_file.name, input_format.get_read_mode(), encoding=self.from_encoding) as uploaded_import_file:
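The effect of passing encoding= to open() can be shown in isolation. The sketch below uses a temporary Latin-1 file and a local from_encoding variable that stands in for self.from_encoding; both are assumptions made for illustration and this is not the library's code.

```python
import os
import tempfile

# Write a small Latin-1 encoded CSV to a temporary file.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write("nombre\nJosé\n".encode("latin-1"))
    path = tmp.name

from_encoding = "latin-1"  # stands in for self.from_encoding (assumption)

try:
    # Explicit encoding: the result does not depend on the OS locale.
    with open(path, "r", encoding=from_encoding) as f:
        rows = f.read().splitlines()

    # The same bytes read as UTF-8 fail with a UnicodeDecodeError.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        utf8_failed = False
    except UnicodeDecodeError:
        utf8_failed = True
finally:
    os.remove(path)
```

Without the encoding argument, open() in text mode falls back to a platform-dependent default, which would be consistent with the same code behaving differently on different systems.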
Still having this problem using v1.1.0 with Python 3 on Ubuntu 16.04 when trying to import a CSV file with Latin characters. The error occurs even if you specify the from_encoding attribute.
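from django.contrib import admin
from import_export.admin import ImportMixin
from import_export.formats.base_formats import CSV

# UsuariosTmpResource and Usuarios_temporales are defined elsewhere in the project.


class MyImportMixin(ImportMixin):
    formats = (CSV,)
    from_encoding = 'latin-1'


class UserTmpAdmin(MyImportMixin, admin.ModelAdmin):
    resource_class = UsuariosTmpResource


admin.site.register(Usuarios_temporales, UserTmpAdmin)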
I also tested tablib in the shell and it works OK...
Any ideas about what could be wrong? :-( The strangest thing is that it works like a charm on Windows, but the same code base fails when deployed on Ubuntu 16.04.
I've hit similar problems, and there are several distinct problems with the current code:
1: Encoding is happening in the wrong place
... should probably be happening with the standard Python open() and its encoding= argument, rather than fetching the data and using force_text() on it. (And while this is being fixed: it may be better to default to the utf-8-sig encoding rather than plain utf-8, but only for reading, as it will detect and skip the BOM if there is one.)
2: force_text() exceptions aren't being caught
And if the code is going to use force_text(), it should catch the correct exception (DjangoUnicodeDecodeError, not UnicodeDecodeError).
3: Doesn't handle universal newline types
The code already knows what open() modes to use for each format, thanks to base_formats. Text formats should have the U (universal newline) flag added. (Update: the U flag is now deprecated in open(). Use text mode or set newline=None.)
4: Should just raise the actual exception rather than doing an HttpResponse + <h1> simulation of an error
No wonder nothing was showing up in my error logs. (forehead-slap)
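Points 1 and 3 above can be illustrated with the standard library alone. The sketch below writes a UTF-8 file with a BOM and mixed line endings, then reads it back with utf-8-sig and with plain utf-8. The file content is invented for the example, and this is not the code used by django-import-export.

```python
import os
import tempfile

# A UTF-8 file with a BOM and mixed line endings (\r\n, \r and \n).
payload = b"\xef\xbb\xbfid,name\r\n1,caf\xc3\xa9\r2,na\xc3\xafve\n"

with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # utf-8-sig skips the BOM if one is present; newline=None (the default
    # in text mode) translates \r\n and \r to \n while reading.
    with open(path, "r", encoding="utf-8-sig", newline=None) as f:
        with_sig = f.read()

    # Plain utf-8 keeps the BOM as a leading U+FEFF character.
    with open(path, "r", encoding="utf-8", newline=None) as f:
        plain = f.read()
finally:
    os.remove(path)
```

With plain utf-8 the stray U+FEFF ends up attached to the first column header, which is one way an otherwise valid file can fail to match the expected field names.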
@yozlet great tip for utf-8-sig.
While such issues have almost always lacked a reproducible test case, I'm totally for making the library more robust when handling different encodings.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi guys, any updates on this? I am currently experiencing similar issues.
Me too
Imported file has a wrong encoding: 'ascii' codec can't decode byte 0xc3 in position 31: ordinal not in range(128)
I can't get it parsed as UTF-8 in the deploy environment, although it works fine in development.
Me too
Imported file has a wrong encoding: 'ascii' codec can't decode byte 0xc3 in position 2011: ordinal not in range(128)
This happens only in the production environment, with a JSON file. @bmihelac why does this happen if the file is encoded as UTF-8? Django 2.2.
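One possible explanation for the 'ascii' codec errors, offered here as an assumption rather than a confirmed diagnosis, is that the production process runs under a C/POSIX locale, so text is decoded as ASCII when no encoding is given. The sketch below shows how to inspect the default and reproduces the failure on byte 0xc3 without touching any files.

```python
import locale

# Encoding that open() falls back to when no encoding= argument is given.
# It is derived from the environment, so it can differ between a
# development machine and a production server.
default_encoding = locale.getpreferredencoding(False)

# UTF-8 bytes for "é" start with 0xc3, the byte named in the errors above.
utf8_bytes = '{"name": "José"}'.encode("utf-8")

try:
    utf8_bytes.decode("ascii")  # what an ASCII default would attempt
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the encoding stated explicitly works in any environment.
text = utf8_bytes.decode("utf-8")
```

Comparing default_encoding between the two environments is a quick way to check whether this assumption applies to a given deployment.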
4: Should just raise the actual exception rather than doing an HttpResponse + h1 simulation of an error
Implemented in PR #1281 (although the error is presented back in the UI as a form error)
I have created #1306 based on the suggestions made by @yozlet here. I cannot reproduce the import errors but it would be great if anyone who has commented previously in this thread can test the PR to see if it resolves their issues.
I want to know whether this issue has been resolved. The file I uploaded contains Chinese characters, and the problem is the same as yours...
@bowuL Please can you try this branch and let us know if the problem still exists?
I had the same issue and now it's fixed. Thx.
@jairodri Thanks - was that using the new branch?
yes, i'm using it :)
@matthewhegarty Sorry for the late reply. I didn't use the other branch; I just changed the source code and rewrote the FolderStorage class.
@bowuL thanks - if you could try the other branch that would be great, as it would help us understand whether the proposed fix is going to work.
Release 3.0 (beta) is now available, so anyone who is hitting this issue is encouraged to test with v3.0-0-beta.
I had a similar issue on django-import-export==2.8.0
'charmap' codec can't decode byte 0x8f in position 29
After upgrading to django-import-export==3.0.0b4, I did not get this anymore.
Thanks
Closing - this should be fixed after release v3 - please raise new issue if still occurring.