python-bibtexparser: `align_values` has minimum `_max_field_width` due to `ENTRYTYPE` pseudo field

Open michaelfruth opened this issue 3 years ago • 3 comments

When using `align_values`, `_max_field_width` is never smaller than 9, because `ENTRYTYPE` (9 characters) is included among the keys whose lengths are measured: https://github.com/sciunto-org/python-bibtexparser/blob/006f463a8601dc8065f02f3199210c70f24ac4f9/bibtexparser/bwriter.py#L108
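To make the effect concrete, here is a minimal stand-alone sketch. It is not the library's code; the name `PSEUDO_FIELDS` and both width variables are made up for illustration. It only assumes that the width is taken as the maximum key length over the whole entry dict, which is what the behavior described above suggests.

```python
# Hypothetical illustration, not python-bibtexparser's actual implementation.
entry = {'ID': 'abc123', 'ENTRYTYPE': 'book', 'abc': 'test'}

# Measuring every key of the entry dict also measures the pseudo fields.
width_all_keys = max(len(key) for key in entry)

# Assumed set of keys that are not written out as "key = {value}" lines.
PSEUDO_FIELDS = {'ID', 'ENTRYTYPE'}

# Measuring only the real fields gives the width one would expect.
width_real_fields = max(len(key) for key in entry if key not in PSEUDO_FIELDS)

print(width_all_keys)     # 9
print(width_real_fields)  # 3
```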

Therefore, this test passes:

    def test_align2(self):
        bib_database = BibDatabase()
        bib_database.entries = [{'ID': 'abc123',
                                 'ENTRYTYPE': 'book',
                                 'abc': 'test'}]
        writer = BibTexWriter()
        writer.align_values = True
        result = bibtexparser.dumps(bib_database, writer)
        expected = \
    """@book{abc123,
     abc       = {test}
    }
    """
        self.assertEqual(result, expected)

Shouldn't the result be:

    def test_align2(self):
        bib_database = BibDatabase()
        bib_database.entries = [{'ID': 'abc123',
                                 'ENTRYTYPE': 'book',
                                 'abc': 'test'}]
        writer = BibTexWriter()
        writer.align_values = True
        result = bibtexparser.dumps(bib_database, writer)
        expected = \
    """@book{abc123,
     abc = {test}
    }
    """
        self.assertEqual(result, expected)

The same happens when an entry has several short keys.

Aug 09 '22 20:08 michaelfruth

I've successfully reproduced this. Thanks for the nice example. I also do not think this is intended.

Fixing this might have consequences for users who generate files and have so far relied on this minimum width: a fix would suddenly change their bib files and produce quite large git diffs. However, I think that in most use cases the width is larger than 9 anyway, so the output would not change.

Ideally, we would change the function as follows: `align_values` can be a bool or an int. If it's a bool, we use the (fixed) version of the current logic (width = max field length). If it's an int, we use that as the width. In the latter case, should there be any `len(field_key) > width`, we could just let that one field key "overflow", with all other ones still being aligned. For huge bib files with only a few long keys, that would be optimal.
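A rough sketch of how that proposal could look, independent of the library. The helpers `resolve_width` and `format_field` and the `PSEUDO_FIELDS` set are hypothetical names, not part of python-bibtexparser, and the single-space indent simply mirrors the test output above.

```python
PSEUDO_FIELDS = {'ID', 'ENTRYTYPE'}  # assumed set of keys that are not real fields


def resolve_width(entries, align_values):
    """Return the column width for field keys; 0 means no alignment."""
    # bool is a subclass of int in Python, so it has to be checked first.
    if isinstance(align_values, bool):
        if not align_values:
            return 0
        # Fixed version of the current logic: longest real field key.
        return max(
            (len(key) for entry in entries for key in entry
             if key not in PSEUDO_FIELDS),
            default=0,
        )
    # An int is used as the width directly.
    return align_values


def format_field(key, value, width):
    # ljust pads but never truncates, so a key longer than width overflows
    # while shorter keys stay aligned.
    return f" {key.ljust(width)} = {{{value}}}"


entries = [{'ID': 'abc123', 'ENTRYTYPE': 'book',
            'abc': 'test', 'organization': 'x'}]
width = resolve_width(entries, 5)
lines = [format_field(key, value, width)
         for key, value in entries[0].items() if key not in PSEUDO_FIELDS]
print("\n".join(lines))
```

With a fixed width of 5, `abc` is padded to five characters and `organization` simply runs past the column. Passing `True` instead would align everything to the longest real key.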

What do you think about this? Also, would you be interested in providing a fix or the extended functionality?

Aug 16 '22 12:08 MiWeiss

Yes, this sounds good and I'll provide a PR for this functionality.

Aug 21 '22 14:08 michaelfruth

Awesome, thanks so much!

Aug 22 '22 06:08 MiWeiss
