launchpad
Unicode Error
It's not clear whether this is an error with the data for a particular item, or whether a certain query triggered it, but it warrants looking into. Initial checking found that:
- /item/12344364.json results in an "Internal Server Error" page
- /item/12344364 results in "item not found"
The query referenced in the error appears to be colección.
Internal Server Error: /item/12344364.json
Traceback (most recent call last):
File "/launchpad/current/launchpad/ENV/lib/python2.7/site-packages/django/core/handlers/base.py", line 132, in get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/launchpad/current/launchpad/ENV/lib/python2.7/site-packages/django/utils/decorators.py", line 110, in _wrapped_view
response = view_func(request, *args, **kwargs)
File "/launchpad/current/launchpad/lp/ui/views.py", line 135, in item_json
bib_data = voyager.get_bib_data(bibid)
File "/launchpad/current/launchpad/lp/ui/voyager.py", line 199, in get_bib_data
bib.get('TITLE', ''))
File "/launchpad/current/launchpad/lp/ui/voyager.py", line 344, in get_related_bibids
results = _make_dict(cursor)
File "/launchpad/current/launchpad/lp/ui/voyager.py", line 33, in _make_dict
for row in cursor.fetchall()
File "/launchpad/current/launchpad/ENV/lib/python2.7/site-packages/django/db/utils.py", line 105, in inner
return func(*args, **kwargs)
File "/launchpad/current/launchpad/ENV/lib/python2.7/site-packages/django/db/backends/oracle/base.py", line 517, in fetchall
return tuple(_rowfactory(r, self.cursor) for r in self.cursor.fetchall())
File "/launchpad/current/launchpad/ENV/lib/python2.7/site-packages/django/db/backends/oracle/base.py", line 517, in <genexpr>
return tuple(_rowfactory(r, self.cursor) for r in self.cursor.fetchall())
File "/launchpad/current/launchpad/ENV/lib/python2.7/site-packages/django/db/backends/oracle/base.py", line 599, in _rowfactory
value = to_unicode(value)
File "/launchpad/current/launchpad/ENV/lib/python2.7/site-packages/django/db/backends/oracle/base.py", line 610, in to_unicode
return force_text(s)
File "/launchpad/current/launchpad/ENV/lib/python2.7/site-packages/django/utils/encoding.py", line 102, in force_text
raise DjangoUnicodeDecodeError(s, *e.args)
DjangoUnicodeDecodeError: 'utf8' codec can't decode byte 0xf3 in position 19: unexpected end of data. You passed in '9562440710 (colecci\xf3n)' (<type 'str'>)
Request repr():
<WSGIRequest
path:/item/12344364.json,
GET:<QueryDict: {}>,
POST:<QueryDict: {}>,
COOKIES:{},
META:{'CONTEXT_DOCUMENT_ROOT': '/var/www',
'CONTEXT_PREFIX': '',
u'CSRF_COOKIE': u'f0G9gMwAp0tVlBuYpzlx7y1gy9JOY7Cs',
'DOCUMENT_ROOT': '/var/www',
'GATEWAY_INTERFACE': 'CGI/1.1',
'HTTP_ACCEPT': '*/*',
'HTTP_ACCEPT_ENCODING': 'gzip,deflate',
'HTTP_CONNECTION': 'Keep-Alive',
'HTTP_FROM': '[email protected]',
'HTTP_HOST': 'findit.library.gwu.edu',
'HTTP_USER_AGENT': 'Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)',
'PATH_INFO': u'/item/12344364.json',
'PATH_TRANSLATED': '/launchpad/current/launchpad/lp/lp/wsgi.py/item/12344364.json',
'QUERY_STRING': '',
'REMOTE_ADDR': '5.255.250.63',
'REMOTE_PORT': '36230',
'REQUEST_METHOD': 'GET',
'REQUEST_SCHEME': 'https',
'REQUEST_URI': '/item/12344364.json',
'SCRIPT_FILENAME': '/launchpad/current/launchpad/lp/lp/wsgi.py',
'SCRIPT_NAME': u'',
'SERVER_ADDR': '192.245.136.25',
'SERVER_ADMIN': '[email protected]',
'SERVER_NAME': 'findit.library.gwu.edu',
'SERVER_PORT': '443',
'SERVER_PROTOCOL': 'HTTP/1.1',
'SERVER_SIGNATURE': '<address>Apache/2.4.7 (Ubuntu) Server at findit.library.gwu.edu Port 443</address>\n',
'SERVER_SOFTWARE': 'Apache/2.4.7 (Ubuntu)',
'SSL_TLS_SNI': 'findit.library.gwu.edu',
'force-proxy-request-1.0': '1',
'mod_wsgi.application_group': 'findit.library.gwu.edu|',
'mod_wsgi.callable_object': 'application',
'mod_wsgi.enable_sendfile': '0',
'mod_wsgi.handler_script': '',
'mod_wsgi.input_chunked': '0',
'mod_wsgi.listener_host': '',
'mod_wsgi.listener_port': '443',
'mod_wsgi.process_group': 'findit.library.gwu.edu',
'mod_wsgi.queue_start': '1511504665230282',
'mod_wsgi.request_handler': 'wsgi-script',
'mod_wsgi.script_reloading': '1',
'mod_wsgi.version': (3, 4),
'proxy-nokeepalive': '1',
'wsgi.errors': <mod_wsgi.Log object at 0x7f0d131969b0>,
'wsgi.file_wrapper': <built-in method file_wrapper of mod_wsgi.Adapter object at 0x7f0d13280378>,
'wsgi.input': <mod_wsgi.Input object at 0x7f0d20047670>,
'wsgi.multiprocess': True,
'wsgi.multithread': True,
'wsgi.run_once': False,
'wsgi.url_scheme': 'https',
'wsgi.version': (1, 0)}>
Those errors are usually caused by Unicode errors in Georgetown records that were imported into WRLC Voyager. The problem is often visible in the WRLC Catalog as well.
It's possible to correct them manually (Mike would occasionally do this), but it's a longstanding issue. With the exports they're having to do from Sierra for Alma, they may have found some better options for avoiding this, but I'm not sure.
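For anyone wanting to reproduce the failure outside the app, here is a minimal sketch using the exact byte string from the traceback. The byte 0xf3 is "ó" in Latin-1 but is not a complete character in UTF-8, which is why the strict UTF-8 decode fails. The helper name `decode_with_fallback` and the choice of Latin-1 as the fallback are assumptions for illustration only, not anything in launchpad. Note also that in the traceback the exception is raised inside Django's Oracle backend during `fetchall()`, before `_make_dict` ever sees the row, so this is not a drop-in fix.

```python
# Raw value copied from the traceback: the 0xf3 byte is Latin-1 "ó",
# which is not valid on its own in UTF-8.
raw = b'9562440710 (colecci\xf3n)'


def decode_with_fallback(value, fallback='latin-1'):
    """Hypothetical helper: try UTF-8 first, then a legacy encoding.

    The Latin-1 default is an assumption about how the bad records
    were encoded; it is not confirmed by the source data.
    """
    try:
        return value.decode('utf-8')
    except UnicodeDecodeError:
        return value.decode(fallback)


# Strict UTF-8 decoding fails, as in the traceback.
try:
    raw.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

print(strict_ok)
print(decode_with_fallback(raw))
```

If the records really are Latin-1, the fallback yields the intended "colección". Correcting the record at the source, as described above, is still the cleaner option, since a fallback only guesses at the original encoding.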