PyBaMM icon indicating copy to clipboard operation
PyBaMM copied to clipboard

refactor: migrate citations.py from pybtex to bibtexparser

Open gurukiran7 opened this issue 8 months ago • 16 comments

Description

This PR refactors pybamm/citations.py to fully remove the legacy pybtex dependency and migrate all citation management to bibtexparser (v2). The new implementation:

  • Uses bibtexparser for reading, parsing, and validating BibTeX entries
  • Updates the register method to handle both known keys and raw BibTeX strings
  • Adds improved error handling and type validation
  • Implements custom citation string formatting to closely match the previous output
  • Issues warnings when overwriting existing citations with different content

Fixes #3648

Important checks:

Please confirm the following before marking the PR as ready for review:

  • No style issues: nox -s pre-commit
  • All tests pass: nox -s tests
  • The documentation builds: nox -s doctests
  • Code is commented for hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

gurukiran7 avatar Apr 15 '25 08:04 gurukiran7

Thanks for starting this, @gurukiran7! To fix the tests, you'll need to add bibtexparser as in the [cite] set of optional dependencies in pyproject.toml. You may also run the tests locally to see if they pass and ease your development, as we'll need to trigger the workflows every time you push a commit: https://docs.pybamm.org/en/latest/source/user_guide/contributing.html#testing.

agriyakhetarpal avatar Apr 15 '25 10:04 agriyakhetarpal

@agriyakhetarpal I've updated the tests to align with the new citations.py implementation. At the moment, only a few tests are passing I'm still working on the rest. I want your feedback on the implementation so far

gurukiran7 avatar Apr 15 '25 15:04 gurukiran7

I think the implementation is looking good, so far. Thanks! One important thing here will be to retain the same output from pybamm.print_citations() as the previous implementation, or if it isn't possible, to stay as close to it. Please continue to work on the rest of the tests.

agriyakhetarpal avatar Apr 16 '25 09:04 agriyakhetarpal

It looks like a stable release for bibtexparser v2 is not yet available, and the resolver picks up pre-release versions only if we specify them. Could you please pin to v2.0.0b8 and see if that works?

agriyakhetarpal avatar Apr 16 '25 09:04 agriyakhetarpal

It looks like a stable release for bibtexparser v2 is not yet available, and the resolver picks up pre-release versions only if we specify them. Could you please pin to v2.0.0b8 and see if that works?

done

gurukiran7 avatar Apr 16 '25 13:04 gurukiran7

@agriyakhetarpal The test_citations test seems to be failing because read_citations() isn't populating _all_citations, so keys like "Sulzer2021" are missing during assertions. I suspect this might be due to the CITATIONS.bib file not loading correctly in the test environment. Could you please take a look and let me know if I’m missing something?

gurukiran7 avatar Apr 16 '25 13:04 gurukiran7

🤔 The CI logs report a different error: ImportError: cannot import name 'BibDatabase' from 'bibtexparser' (/home/runner/work/PyBaMM/PyBaMM/.nox/unit/lib/python3.11/site-packages/bibtexparser/__init__.py), is it related to what you're seeing locally?

agriyakhetarpal avatar Apr 16 '25 13:04 agriyakhetarpal

Actually the current error I'm seeing is an AssertionError, not an ImportError. It's happening because _all_citations is empty, so the test fails when it checks for "Sulzer2021".

gurukiran7 avatar Apr 16 '25 14:04 gurukiran7

Yes, that's alright, but the CI logs still don't reflect it – it is hence difficult to see all the failures. I pulled your branch to my machine, and I see the same error: ImportError: cannot import name 'BibDatabase' from 'bibtexparser'. I think there might be a few commits on your local branch you haven't pushed?

https://github.com/search?q=repo:sciunto-org/python-bibtexparser+%22bibdatabase%22&type=code also reports no results, so I'm not sure where you're getting that attribute from.

It's not in the documentation either: https://bibtexparser.readthedocs.io/en/main/search.html?q=bibdatabase

agriyakhetarpal avatar Apr 16 '25 17:04 agriyakhetarpal

Thanks for pointing that out! the ImportError stems from my use of BibDatabase, which was part of bibtexparser v1 but has been removed in v2. I mistakenly mixed v1-style imports with the newer v2 API. I’ve now updated the code to use Entry from bibtexparser.model instead and removed all references to BibDatabase. That should fix the import issue I’ll push the changes shortly.

gurukiran7 avatar Apr 16 '25 17:04 gurukiran7

Hi @agriyakhetarpal , I've pushed the updated code, and most methods and their tests are passing as expected However, there's an issue with the register method, causing tests like test_citations and test_overwrite_citation to fail. Despite debugging, I couldn't resolve the issue, as it seems related to how citations are added to _all_citations and _papers_to_cite. could u pls review the register method and suggest a fix?

gurukiran7 avatar Apr 17 '25 10:04 gurukiran7

Hi Gurukiran, I've triggered the workflows and I'm still not sure – the CI logs show otherwise? The tests are failing to load:

ImportError: cannot import name 'UndefinedString' from 'bibtexparser' (/home/runner/work/PyBaMM/PyBaMM/.nox/coverage/lib/python3.12/site-packages/bibtexparser/__init__.py)

May I know the steps you're performing to run them locally, so that I can take a look at the register() problem?

Also, I'd like to note that we shouldn't use bibtexparser v1 here. The README at https://github.com/sciunto-org/python-bibtexparser#python-bibtexparser-v2 states:

Note that all development and maintenance effort is focussed on v2. Small PRs for v1 are still accepted, but only as long as they are backwards compatible and don't introduce much additional technical debt. Development of version one happens on the dedicated v1 branch.

which is code-speak to say that bibtexparser v1 will be deprecated in the near future in favour of bibtexparser v2.

This is a problem for three reasons:

  • we've pinned down to a specific package version, which isn't recommended when trying to use PyBaMM as a library to integrate with other libraries (this is fine if you use it as an application). We do this only in extreme cases where it's needed for stability (#4945)
  • we will have no guarantees whether bibtexparser v1 will receive sufficient updates over time to support newer Python versions, and this is the same problem we have with pybtex right now – it doesn't support Python 3.13 natively and is unmaintained.
  • if there are any bugs in bibtexparser v1's code, it makes it difficult for us to receive fixes for them as they can take a long time to land in 1.x releases and also need to conform to bibtexparser v1's BC (backwards compatibility) guarantees.

agriyakhetarpal avatar Apr 17 '25 10:04 agriyakhetarpal

I switched back to bibtexparser==2.0.0b8, I have pushed the updated code. I run the tests using pytest. I also verify specific tests by running: pytest -k test_citations -v Please let me know if there are any additional steps you'd recommend

gurukiran7 avatar Apr 17 '25 12:04 gurukiran7

Thanks! Could you please run the whole test suite using nox directly? Running pytest -k test_citations is fine locally, but we're running all the tests in CI and that's different from running these tests in specific as you may see.

Also, you would need to fix the style job to allow the test suite to run.

agriyakhetarpal avatar Apr 17 '25 13:04 agriyakhetarpal

@agriyakhetarpal I've pushed the latest updated code addressing most of the test failures and feedback

Refactored the register method and _add_citation to support bibtexparser v2+ Made citation handling optional using import_optional_dependency Cleaned up all formatting/style issues (ruff-format, pre-commit now passes) if you have any suggestions on expected behavior or test intentn pls let me know

gurukiran7 avatar Apr 18 '25 04:04 gurukiran7

@agriyakhetarpal , @kratman , I've pushed the latest code please review when you get a chance

gurukiran7 avatar Apr 21 '25 16:04 gurukiran7

Closing this for now since it is not needed and was still not working

kratman avatar Jul 01 '25 18:07 kratman