python-bibtexparser

Review homogeneize_latex_encoding()

Open sciunto opened this issue 11 years ago • 7 comments

homogeneize_latex_encoding has several issues:

  • Accent protection should act only on certain fields
  • We should check carefully that accents are correctly encoded

sciunto avatar Oct 01 '13 19:10 sciunto

I have an issue with homogeneize_latex_encoding().

One of my articles has the ID something_2010, and homogeneize_latex_encoding escapes it as something\_2010. But such an ID does not work in LaTeX (it causes an error on the \cite line).

Did I miss something?

Phyks avatar Apr 26 '14 16:04 Phyks

No, you didn't. It's because homogeneize_latex_encoding() does not distinguish between data and metadata (point 1). I'm working on a partial fix.
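
To illustrate the data/metadata problem, here is a minimal, self-contained sketch. It does not use bibtexparser: `escape_underscores` and `escape_everything` are hypothetical stand-ins, and the `ID` / `ENTRYTYPE` key names are assumptions about how a record dict is laid out.

```python
def escape_underscores(value):
    # Stand-in for the real LaTeX conversion; it only handles "_".
    return value.replace("_", r"\_")


def escape_everything(record):
    # Naive approach: every key is treated the same way,
    # so the citation key (metadata) is escaped like a title (data).
    return {key: escape_underscores(value) for key, value in record.items()}


record = {"ID": "something_2010", "ENTRYTYPE": "article", "title": "A_B testing"}
escaped = escape_everything(record)
print(escaped["ID"])  # something\_2010
```

The escaped citation key no longer matches what `\cite{something_2010}` refers to, which is the failure described above.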

sciunto avatar Apr 26 '14 17:04 sciunto

Another question about this.

I'm using a custom BibTeX field called "file" to store the filenames of the PDF files of my articles. If I use homogeneize_latex_encoding(), the special characters (such as _) in my filenames get escaped and I can't use them in my Python scripts.

However, I'd rather have a homogeneous LaTeX encoding in the BibTeX file I write.

So maybe it would be a good idea to also implement an unhomogeneize_latex_encoding function or (easier, I think) to make it possible to specify the customization at writing time.

What do you think of this?
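
A rough sketch of what "customization at writing time" could mean: the escaping is applied to a copy while the entry is rendered, so the in-memory record keeps its raw values. This is not bibtexparser's writer API; `escape`, `render_entry` and its `customization` parameter are hypothetical names, and the `ID` / `ENTRYTYPE` keys are an assumption.

```python
def escape(value):
    # Stand-in for the real LaTeX conversion; it only handles "_".
    return value.replace("_", r"\_")


def render_entry(record, customization=None):
    # Work on a copy of the data fields so the caller's record is unchanged.
    fields = {k: v for k, v in record.items() if k not in ("ID", "ENTRYTYPE")}
    if customization is not None:
        fields = {k: customization(v) for k, v in fields.items()}
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items())
    # The citation key and entry type are written as-is, never escaped.
    return f"@{record['ENTRYTYPE']}{{{record['ID']},\n{body}\n}}"


record = {"ID": "something_2010", "ENTRYTYPE": "article", "file": "my_paper.pdf"}
text = render_entry(record, customization=escape)
print(text)
```

After the call, `record["file"]` is still `my_paper.pdf` for use in scripts, while the rendered text contains the escaped form.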

Phyks avatar May 11 '14 01:05 Phyks

I do not think writing an unhomogeneize_latex_encoding function is a good way to go. It would not be natural from the user's point of view.

However, homogeneize_latex_encoding() should have an extra optional argument, a dict, to specify either:

  • the fields to be treated by string_to_latex (e.g. title, author, abstract)
  • or the fields NOT to be treated.

For now, I'm in favor of solution 1: it is probably shorter, and a default list might work for a broad range of use cases.
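
Solution 1 could be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: `to_latex` is a simplified stand-in for string_to_latex, `homogenize_selected` and `DEFAULT_FIELDS` are hypothetical names, and a tuple of field names is used instead of a dict for brevity.

```python
DEFAULT_FIELDS = ("title", "author", "abstract")


def to_latex(value):
    # Simplified stand-in for string_to_latex; it escapes only "_" and "&".
    for char in ("_", "&"):
        value = value.replace(char, "\\" + char)
    return value


def homogenize_selected(record, fields=DEFAULT_FIELDS):
    """Return a copy of record in which only the listed fields are converted."""
    return {
        key: to_latex(value) if key in fields else value
        for key, value in record.items()
    }


record = {
    "ID": "something_2010",
    "title": "Cats & dogs",
    "file": "my_paper.pdf",
}
result = homogenize_selected(record)
print(result["title"])  # Cats \& dogs
```

With the default list, the citation key and the custom "file" field pass through untouched; a caller who wants other fields converted can pass their own `fields`.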

sciunto avatar May 11 '14 07:05 sciunto

I agree with you about the unhomogeneize_latex_encoding function.

But maybe homogeneize_latex_encoding could be called when writing with bwriter? This is already possible by explicitly calling it before writing, but I think the writing functions could take a customization parameter, just as the reader does.

And I'm also in favor of an extra optional argument with a default list of fields to treat.

Phyks avatar May 11 '14 13:05 Phyks

OK for the extra argument in homogeneize_latex_encoding.

I do not understand why you need a customization callback in the bwriter functions. What prevents you from passing your customization functions to the parser itself?

sciunto avatar May 11 '14 16:05 sciunto

Actually, yes, I can pass it myself. I was just thinking that symmetric behaviour between the reader and the writer with regard to the customization option would be nice. But it may be a stupid idea…

Phyks avatar May 11 '14 16:05 Phyks

This issue is 10 years old and much has changed since. Should similar problems still come up, it would probably be best to open a new issue.

MiWeiss avatar Aug 17 '22 19:08 MiWeiss