
`align_values` has minimum `_max_field_width` due to `ENTRYTYPE` pseudo field.

Open michaelfruth opened this issue 3 years ago • 3 comments

When using align_values, the minimum _max_field_width is always 9, because ENTRYTYPE (9 chars) is among the keys the width is computed over: https://github.com/sciunto-org/python-bibtexparser/blob/006f463a8601dc8065f02f3199210c70f24ac4f9/bibtexparser/bwriter.py#L108

Therefore, this test passes:

    def test_align2(self):
        bib_database = BibDatabase()
        bib_database.entries = [{'ID': 'abc123',
                                 'ENTRYTYPE': 'book',
                                 'abc': 'test'}]
        writer = BibTexWriter()
        writer.align_values = True
        result = bibtexparser.dumps(bib_database, writer)
        expected = \
"""@book{abc123,
 abc       = {test}
}
"""
        self.assertEqual(result, expected)

Shouldn't the result be:

    def test_align2(self):
        bib_database = BibDatabase()
        bib_database.entries = [{'ID': 'abc123',
                                 'ENTRYTYPE': 'book',
                                 'abc': 'test'}]
        writer = BibTexWriter()
        writer.align_values = True
        result = bibtexparser.dumps(bib_database, writer)
        expected = \
"""@book{abc123,
 abc = {test}
}
"""
        self.assertEqual(result, expected)

The same happens with multiple short-named keys.
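A quick illustration of where the 9 comes from. This is plain Python, not the library's actual code, and `PSEUDO_FIELDS` is a name made up for this sketch:

```python
# Illustrative only: mimics a width computed over all keys of an entry,
# which picks up the pseudo fields 'ID' and 'ENTRYTYPE'.
entry = {'ID': 'abc123', 'ENTRYTYPE': 'book', 'abc': 'test'}

# Width over every key, pseudo fields included (the behavior reported here).
width_all_keys = max(len(key) for key in entry)

# Width over the real fields only (what the second test expects).
PSEUDO_FIELDS = {'ID', 'ENTRYTYPE'}  # hypothetical name, for illustration
width_real_fields = max(len(key) for key in entry if key not in PSEUDO_FIELDS)

print(width_all_keys)     # 9, from len('ENTRYTYPE')
print(width_real_fields)  # 3, from len('abc')
```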

michaelfruth avatar Aug 09 '22 20:08 michaelfruth

I've successfully reproduced this. Thanks for the nice example. I also do not think this is intended.

Fixing this might have some consequences for users who generate files and have so far relied on this min width - a fix would suddenly change bib files with quite large git diffs. However, I think that for most use cases, the width will be larger than 9 and thus ok.

Ideally, we would change the function as follows: align_values can be a bool or an int. If it's a bool, we use the (fixed) version of the current logic (width = max field length). If it's an int, we use that as the width. In the latter case, should any len(field_key) > width, we could just let that one field key "overflow", with all other ones still being aligned. For huge bib files with only a few long keys, that would be optimal.
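Roughly, the bool-or-int idea could look like the sketch below. The helper names `resolve_field_width` and `format_field` are hypothetical and do not exist in bibtexparser. The sketch also assumes the pseudo fields (ID, ENTRYTYPE) have already been filtered out of `field_keys`.

```python
def resolve_field_width(align_values, field_keys):
    """Return the column width for field keys; 0 means no alignment."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(align_values, bool):
        if not align_values:
            return 0
        # Fixed version of the current logic: width = longest real field key.
        return max((len(key) for key in field_keys), default=0)
    # An int is taken as an explicit width; negative values are clamped to 0.
    return max(int(align_values), 0)


def format_field(key, value, width):
    # str.ljust never truncates, so a key longer than `width` just overflows
    # while all shorter keys stay aligned.
    return " {} = {{{}}}".format(key.ljust(width), value)


keys = ['abc', 'booktitle']
width = resolve_field_width(5, keys)
for key in keys:
    print(format_field(key, 'x', width))
```

With an explicit width of 5, `abc` is padded to the column and `booktitle` overflows it without affecting the other fields.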

What do you think about this? Also, would you be interested in providing a fix or the extended functionality?

MiWeiss avatar Aug 16 '22 12:08 MiWeiss

Yes, this sounds good and I'll provide a PR for this functionality.

michaelfruth avatar Aug 21 '22 14:08 michaelfruth

Awesome, thanks so much!

MiWeiss avatar Aug 22 '22 06:08 MiWeiss