bibstuff
sphinxext: global issue with UTF-8 support for files actually having non-ascii characters?
I originally thought it was a failure to support 'latex'-encoded (as opposed to UTF-8) .bib files, resulting in this crash:
File "/usr/lib/pymodules/python2.6/simpleparse/dispatchprocessor.py", line 120, in lines
return countlines (buffer[start or 0:end or len(buffer)])
File "/usr/lib/pymodules/python2.6/simpleparse/stt/TextTools/TextTools.py", line 467, in countlines
return len(tag(text, linecount_table)[1])
TypeError: Low-level command (41) argument in entry 2 couldn't be converted to a string object, is a unicode
Neither setting
:encoding: iso-8859-1
for biblisted nor
% Encoding: latex
in the header of the .bib file helped to resolve it. Actually converting the .bib file to UTF-8 (using kbibtex) and removing the above encoding settings led to the same failure :-/
Only using matthew_brett.bib, which has no UTF-8 per se, succeeded. Adding an insulting Unicode Russian е instead of the proper ASCII e in the name of the respected author of the first entry unfortunately did not result in the above crash, but at least it obscured the author's name to become "Matthew Br." when using jasss_style.
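To narrow down which characters in a .bib file are involved, a small diagnostic like the following may help. It is only a sketch and not part of bibstuff: `find_non_ascii` is a hypothetical helper name, the sample entry is made up, and the code assumes Python 3 with the file already decoded to text.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line_no, column, char, unicode_name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical entry: the "\u0435" in the author name is Cyrillic, not ASCII "e".
sample = "@article{brett2010,\n  author = {Matthew Br\u0435tt},\n}"
for hit in find_non_ascii(sample):
    print(hit)
```

Look-alike characters such as the Cyrillic е are invisible to the eye, so reporting the Unicode name next to the position makes them easy to spot.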
Yes, sadly, simpleparse does not, and probably never will, support Unicode. I've since written two Unicode-supporting bibtex parsers, and I've been talking to Andrey Golovizin, the author of pybtex, who's gone a long way towards bibtex compatibility using pure Python. So the fix here would probably be dumping bibtools and doing a rewrite. Is it something you have an urgent need of?
Hey Matthew,
excellent work! I just started playing with it. I noticed that the same thing:
Exception occurred:
File "/usr/lib/pymodules/python2.6/simpleparse/stt/TextTools/TextTools.py", line 467, in countlines
return len(tag(text, linecount_table)[1])
TypeError: Low-level command (41) argument in entry 2 couldn't be converted to a string object, is a unicode
happens when there is a comment in the BIB file. In my case this one:
@Comment{x-kbibtex-encoding=utf-8}
After its removal I get perfect results.
Best,
Michael
nice finding ;-) it seems that any kind of @comment ruins it
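Until the parser is replaced, one possible workaround is to strip @comment entries from the .bib text before handing it to the parser. The following is a minimal sketch under stated assumptions: `strip_bib_comments` is a hypothetical helper, not part of bibstuff; it only handles the brace-delimited form `@comment{...}`; and it has not been checked against the crash above.

```python
import re

# Matches the start of a brace-delimited comment entry, in any letter case.
_COMMENT_START = re.compile(r"@comment\s*\{", re.IGNORECASE)


def strip_bib_comments(text):
    """Remove every @comment{...} block, matching nested braces."""
    out = []
    pos = 0
    while True:
        match = _COMMENT_START.search(text, pos)
        if match is None:
            out.append(text[pos:])
            break
        out.append(text[pos:match.start()])
        depth = 1
        i = match.end()
        # Walk forward until the opening brace of the comment is closed.
        while i < len(text) and depth > 0:
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
            i += 1
        pos = i  # an unterminated block swallows the rest of the text
    return "".join(out)


sample = "@Comment{x-kbibtex-encoding=utf-8}\n@article{key,\n  title = {A {B} C},\n}\n"
cleaned = strip_bib_comments(sample)
print(cleaned)
```

Counting braces rather than using a single regular expression keeps nested braces inside a comment from ending the block too early.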
Guys,
I'm afraid bibtools has a very fast parser that is fragile and essentially impractical to fix. I've written slower parsers that are much more like bibtex in their behavior, but dropping a new parser in would take a few days of work. You're voting for the few days I guess?
What about using http://pybtex.sourceforge.net/ for all parsing -- it supports UTF-8 and a few other exotic reference formats (YAML, BibTeXML). It lacks any formatting output for reST at the moment, though, but seems to be quite a nice and somewhat active project.
bloody buttons -- how to reopen it? I clicked 'Comment & Close' by mistake ;)
I think 'Actions - Open' opens it again. I've been talking to the pybtex guy - Andrey Golovizin - result above. He tried one of my new parsers and then wrote his own in rapid order that is indeed reasonably fast and good at running through errors. The problem is that pybtex has two modes. One is 'bibtex mode' - and for that Andrey uses the bibtex .bst files and a parser for the bst language. That mode only outputs latex - because that's what the bst files output. Then there's python mode. Python mode outputs html and latex, but only has a single 'unsrt' style, which is still incomplete - for example, it doesn't deal with conference papers yet, as I remember - and is more fragile (it requires entries in the citation that bibtex will allow to be empty). So, it would be some (useful) work to make a fairly useful rst output from pybtex.
Hi @matthew-brett ,
I wondered if you had a chance to dig into this one again? I thought to make use of the bibstuff sphinx extension again but had forgotten about this little show stopper. Cheers!
Sorry - no - I hadn't - it seemed hopeless.
Have you tried sphinxcontrib-bibtex? I was thinking of switching to that (but it may still lack the functionality to output a given list of references, specified in the bibliography).
https://github.com/mcmtroffaes/sphinxcontrib-bibtex
Issue here: https://github.com/mcmtroffaes/sphinxcontrib-bibtex/issues/54
On 5/27/15, Matthew Brett [email protected] wrote:
Sorry - no - I hadn't - it seemed hopeless.
Have you tried sphinxcontrib-bibtex? I was thinking of switching to that (but it may still lack the functionality to output a given list of references, specified in the bibliography).
https://github.com/mcmtroffaes/sphinxcontrib-bibtex