
UnicodeEncodeError: 'ascii' codec can't encode characters in position 8-10: ordinal not in range(128)

Open: ploopkazoo opened this issue 8 years ago • 1 comment

The following happens when running dumpgenerator.py:

XML dump saved at... wikipuella_maginet-20160211-history.xml
Retrieving image filenames
...................
........    Found 13453 images
13453 image names loaded
Image filenames and URLs saved at... wikipuella_maginet-20160211-images.txt
Retrieving images from "start"
Creating "./wikipuella_maginet-20160211-wikidump/images" directory
Traceback (most recent call last):
  File "/home/kyou/dumpgenerator.py", line 2077, in <module>
    main()
  File "/home/kyou/dumpgenerator.py", line 2069, in main
    createNewDump(config=config, other=other)
  File "/home/kyou/dumpgenerator.py", line 1656, in createNewDump
    session=other['session'])
  File "/home/kyou/dumpgenerator.py", line 1120, in generateImageDump
    text=u'The page "%s" was missing in the wiki (probably deleted)' % (title.decode('utf-8'))
  File "/usr/local/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 8-10: ordinal not in range(128)

ploopkazoo, Feb 11 '16 13:02

Analysing http://wiki.urbandead.com/api.php
Loading config file...
Resuming previous dump process...
Title list was completed in the previous session
XML dump was completed in the previous session
Image list was completed in the previous session
29548 images were found in the directory from a previous session
Retrieving images from "39229141.jpg"
Traceback (most recent call last):
  File "dumpgenerator.py", line 2569, in <module>
    main()
  File "dumpgenerator.py", line 2559, in main
    resumePreviousDump(config=config, other=other)
  File "dumpgenerator.py", line 2276, in resumePreviousDump
    session=other['session'])
  File "dumpgenerator.py", line 1528, in generateImageDump
    text=u'The page "%s" was missing in the wiki (probably deleted)' % (title.decode('utf-8'))
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u200e' in position 6: ordinal not in range(128)

This still happens to me (although I haven't updated to the latest few commits).

TheTechRobo, Feb 13 '22 15:02
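
Both tracebacks fail inside `title.decode('utf-8')` yet report an *encode* error. On Python 2 that is consistent with `title` already being a `unicode` object: calling `.decode()` on it makes Python first encode it implicitly with the ASCII codec, which fails on non-ASCII characters such as `u'\u200e'`. Below is a minimal sketch of a guard, assuming that diagnosis is right. `to_text` is a hypothetical helper and the sample title is made up; neither comes from dumpgenerator.py.

```python
# -*- coding: utf-8 -*-

def to_text(value, encoding='utf-8'):
    """Return value as text, decoding only when it is a byte string.

    Hypothetical helper, not part of dumpgenerator.py. On Python 2,
    `bytes` is an alias for `str`, so unicode objects pass through
    untouched instead of being implicitly ASCII-encoded first.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Made-up title containing U+200E (left-to-right mark), the character
# named in the second traceback.
title = u'File:\u200eExample.jpg'
message = u'The page "%s" was missing in the wiki (probably deleted)' % to_text(title)
```

With a guard like this, the message is built the same way whether `title` arrives as a UTF-8 byte string or as already-decoded text.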