Adding QE support to doped.

Open Walser52 opened this issue 5 months ago • 5 comments

The following pull request adds support for Quantum Espresso to doped.

For the most part, the main code remains unchanged. However, the code for QE can now act as a template for adding support for other codes. Adding support is much faster if there is a way to convert the output files generated by your code of choice (SIESTA, WIEN2k) to a pymatgen Vasprun object. In this case it helped that pymatgen.io.espresso already existed.

Major Changes

Parsers

DefectsParser and DefectParser are still the user-facing classes, but they now decide which code-specific class to invoke. The code below, for instance, triggers the DefectsParserVASP class, which then invokes DefectParserVASP.

from doped.analysis import DefectsParser 
calc_root = "doped/examples/CdTe/v_Cd_example_data"
dp = DefectsParser(code = 'vasp',
                   output_path = calc_root, 
                   dielectric=25)  # dielectric needed for charge corrections

For Quantum Espresso, the structure is slightly different. You have to specify the pseudopotential folder too.

calc_root = "doped/examples/CsPbI3-sc-defect"
pp_folder = "doped/examples/CsPbI3-sc-defect/PP"

import warnings
warnings.filterwarnings("ignore")

dp = DefectsParser(code = 'espresso',
                   output_path = calc_root, 
                   dielectric=25, 
                   occu_tol = 0.004,
                   pp_folder = pp_folder
                  )  

DefectsParserEspresso inherits from its VASP counterpart. While I have added BaseDefectsParser as an abstract class, I haven't added functionality behind it. Something that we could consider so that every code has a highly modular template which it ought to follow.
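
A rough sketch of what such a template could look like is given below. It is an illustration only, not the code in this PR: the class names come from the description above, but the method names `find_output_files` and `parse`, the `_REGISTRY` dict and the `get_parser` helper are hypothetical, and the returned dicts are placeholders for real parsing.

```python
from abc import ABC, abstractmethod


class BaseDefectsParser(ABC):
    """Hypothetical template that every code-specific parser would follow."""

    def __init__(self, output_path, dielectric=None, **kwargs):
        self.output_path = output_path
        self.dielectric = dielectric
        self.extra = kwargs  # code-specific options, e.g. pp_folder for QE

    @abstractmethod
    def find_output_files(self):
        """Return the output files this backend needs."""

    @abstractmethod
    def parse(self):
        """Parse the calculation outputs."""


class DefectsParserVASP(BaseDefectsParser):
    def find_output_files(self):
        return ["vasprun.xml"]  # placeholder file list

    def parse(self):
        return {"code": "vasp", "files": self.find_output_files()}


class DefectsParserEspresso(DefectsParserVASP):
    # Inherits from the VASP parser, as in the PR description,
    # and overrides only the code-specific steps.
    def find_output_files(self):
        return ["espresso.xml"]  # placeholder file list

    def parse(self):
        return {"code": "espresso", "files": self.find_output_files()}


# Hypothetical registry mapping the `code` argument to a parser class.
_REGISTRY = {"vasp": DefectsParserVASP, "espresso": DefectsParserEspresso}


def get_parser(code, **kwargs):
    """Return the parser instance registered for `code`."""
    try:
        parser_cls = _REGISTRY[code.lower()]
    except KeyError:
        raise ValueError(f"Unsupported code: {code!r}") from None
    return parser_cls(**kwargs)
```

With this layout, adding another code would mean writing one subclass and adding one registry entry, and instantiating `BaseDefectsParser` directly raises a `TypeError` because its abstract methods are not implemented.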

The current implementation necessitates the use of a cube file for the local potentials. This cube file can be generated using pp.x. A sample folder structure is in the examples folder.
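
For reference, a pp.x input that writes a potential to a Gaussian cube file can be generated along these lines. This is a sketch under assumptions: the PR does not say which quantity the parser expects, so `plot_num = 11` (bare plus Hartree potential) is a guess, and the helper name `pp_input_text` and the file names are hypothetical.

```python
def pp_input_text(prefix, outdir, cube_name="potential.cube"):
    """Build a pp.x input that exports a 3D potential as a cube file.

    Assumption: plot_num = 11 (V_bare + V_H) is the quantity wanted;
    check this against what the parser actually reads.
    """
    return (
        "&INPUTPP\n"
        f"  prefix = '{prefix}'\n"
        f"  outdir = '{outdir}'\n"
        "  filplot = 'potential.dat'\n"
        "  plot_num = 11\n"
        "/\n"
        "&PLOT\n"
        "  iflag = 3\n"          # 3D plot
        "  output_format = 6\n"  # Gaussian cube format
        f"  fileout = '{cube_name}'\n"
        "/\n"
    )
```

The returned text can be saved as, say, `pp.in` and run with `pp.x < pp.in > pp.out` after the pw.x calculation has finished; `prefix` and `outdir` must match the values used in the pw.x input.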

Folder Handling

I haven't messed with the DefectsParserVASP's folder handling but I did make a FolderHandler for DefectsParserEspresso. If it's acceptable, that structure could be ported to the defects parser for VASP.

RunParser

Similar to DefectsParser, there is a RunParser. It seems logical to collect each code's parsing routines in a class like this.

Potential Issues

Corrections

The computation of Kumagai correction needs cross-checking. As far as I know, QE does not list core site potentials explicitly like VASP does.

File naming

The example I've added uses espresso.xml as the file name. I'm working on it so that ANY file with xml as the extension would do.
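
A minimal sketch of that lookup, assuming one QE XML file per calculation folder (the function name `find_qe_xml` is hypothetical and this is not the PR's implementation):

```python
from pathlib import Path


def find_qe_xml(folder):
    """Return the single *.xml file in `folder`, whatever its name."""
    xml_files = sorted(Path(folder).glob("*.xml"))
    if not xml_files:
        raise FileNotFoundError(f"No .xml file found in {folder}")
    if len(xml_files) > 1:
        # Refuse to guess when several candidates are present.
        names = ", ".join(f.name for f in xml_files)
        raise ValueError(f"Multiple .xml files in {folder}: {names}")
    return xml_files[0]
```

Raising on ambiguity rather than picking the first match avoids silently parsing the wrong file when a folder holds more than one XML output.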

Walser52 avatar Aug 02 '25 06:08 Walser52

Walkthrough

This update introduces Quantum Espresso (QE) support to the parsing utilities, including a new parser class, methods for handling QE output and pseudopotential files, and a factory for selecting the appropriate backend. Supporting utilities for patching read-only QE parser properties are added. Minor changes and a new example input file are also included.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| QE Parsing Backend & Utilities: doped/utils/parsing.py, doped/utils/qehacks.py | Adds QE support: new RunParser factory, RunParserEspresso class for parsing QE outputs, UPF pseudopotential handling, core potential extraction, and file warning utilities. Introduces PymatgenEspressoHacks for patching read-only properties in PWxml. |
| VASP Parsing & Utilities Minor Updates: doped/utils/eigenvalues.py | Refactors to avoid redundant function calls in band_edge_properties_from_vasprun by caching results in local variables. |
| Thermodynamics Formatting: doped/thermodynamics.py | Adds a blank line for formatting after stacking arrays in _parse_transition_levels. No logic changes. |
| VASP NELECT Debug Print: doped/vasp.py | Adds a print statement in DefectDictSet.nelect property to output the neutral electron count before adjustment. |
| Example QE Input File: examples/CsPbI3-sc-defect/CsPbI3_Co_0/espresso_std/av.in | Adds a new QE input configuration file (av.in) specifying calculation parameters in namelist format. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant RunParser
    participant RunParserEspresso
    participant PWxml
    participant UPF

    User->>RunParser: get_run(path, code='espresso')
    RunParser->>RunParserEspresso: get_run(path)
    RunParserEspresso->>PWxml: Parse espresso.xml
    RunParserEspresso->>UPF: Extract valence charges
    RunParserEspresso-->>User: Parsed structure, eigenvalues, band edges, etc.
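
As a rough illustration of the "Extract valence charges" step in the diagram, the snippet below reads the valence charge from a pseudopotential file and sums it over a composition. It is not the RunParserEspresso code from this PR: the function names are hypothetical, and it assumes UPF version 2 files, whose PP_HEADER element carries a `z_valence` attribute. A regular expression is used instead of an XML parser because UPF files are not always well-formed XML.

```python
import re

# Matches z_valence="  9.000E+000" inside a UPF v2 PP_HEADER element.
_Z_VALENCE = re.compile(r'z_valence\s*=\s*"\s*([^"\s]+)\s*"')


def z_valence_from_upf(text):
    """Return the valence charge declared in UPF v2 file contents."""
    match = _Z_VALENCE.search(text)
    if match is None:
        raise ValueError("No z_valence attribute found (not UPF v2?)")
    # Accept Fortran-style 'D' exponents as well as 'E'.
    return float(match.group(1).replace("D", "E").replace("d", "e"))


def neutral_electron_count(z_valences, composition):
    """Sum valence electrons for a neutral cell.

    z_valences:  {element symbol: valence charge from its UPF file}
    composition: {element symbol: number of atoms in the cell}
    """
    return sum(z_valences[el] * n for el, n in composition.items())
```

In practice `text` would come from something like `Path(pp_folder, "X.upf").read_text()`, with one file per element.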

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

In the warren, code grows wide,
With Espresso now parsed on the side.
Pseudopotentials found,
Core potentials abound,
A rabbit’s delight—
QE and VASP unite—
Hopping through data with pride! 🐇✨

coderabbitai[bot] avatar Aug 02 '25 06:08 coderabbitai[bot]

Thanks for this @Walser52! I am a bit swamped this week but will aim to have time to go through properly soon. On this point:

> The computation of Kumagai correction needs cross-checking. As far as I know, QE does not list core site potentials explicitly like VASP does.

Is there any example in the literature / online of someone using the eFNV/Kumagai correction with QE?

kavanase avatar Aug 04 '25 17:08 kavanase

Take your time. I started running it on my complete data and noticed a few bugs, but they are confined to the bits that I implemented. Hopefully I will keep ironing those out over the next month or so.

I haven't gone through the literature but I asked on the QE mailing list and didn't get any response.

Walser52 avatar Aug 04 '25 17:08 Walser52

Hi @Walser52, Thank you again for pursuing this!

I gather this is still in draft mode, but some thoughts from me if helpful:

> While I have added BaseDefectsParser as an abstract class, I haven't added functionality behind it. Something that we could consider so that every code has a highly modular template which it ought to follow.

I agree and this suggestion for a highly modular template is very good. I tried implementing some of this modular structure in develop as noted in the response to this issue you raised in June (https://github.com/SMTG-Bham/doped/issues/128), but I see this PR was based off the main branch, and so I'm not sure if those refactored methods were used here – there seems to be a fair bit of duplication and large diffs at the moment with the draft state and branch merge conflict, so hard to properly review here. I have set the merge target now to be develop, would you be able to rebase to this and address merge conflicts?

For the introduced parameters:

  • Ideally code could be auto-determined from the file names within the output folder, with an error thrown if this is not possible, to reduce user burden.
  • For the pseudopotential folder requirement with the Espresso parser; is there a typical folder structure for this with QE? Or more a specific requirement here for determining the number of electrons? If the former, again could we default to assuming this folder structure for QE parsing if pp_folder is not provided, again to reduce user burden?
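
Both defaults could be sketched roughly as follows. The function names are hypothetical, the rule "any other .xml file means QE" is an assumption rather than doped behaviour, and the `PP` subfolder default simply mirrors the example layout in this PR.

```python
from pathlib import Path


def detect_code(folder):
    """Guess the DFT code from the file names in `folder`."""
    folder = Path(folder)
    # vasprun.xml, possibly compressed (e.g. vasprun.xml.gz)
    if any(folder.glob("vasprun.xml*")):
        return "vasp"
    # Assumption: any other XML output belongs to Quantum Espresso.
    if any(folder.glob("*.xml")):
        return "espresso"
    raise FileNotFoundError(
        f"Could not determine the code from the files in {folder}; "
        "please set `code` explicitly."
    )


def resolve_pp_folder(calc_root, pp_folder=None):
    """Use `pp_folder` if given, else fall back to <calc_root>/PP."""
    if pp_folder is not None:
        return Path(pp_folder)
    candidate = Path(calc_root) / "PP"
    if not candidate.is_dir():
        raise FileNotFoundError(
            f"No pseudopotential folder at {candidate}; "
            "please set `pp_folder` explicitly."
        )
    return candidate
```

An explicit `code` or `pp_folder` argument would still take precedence, so the automatic path only reduces typing in the common case.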

To note; many of the coderabbit suggestions can be ignored, but a few here are useful. I've marked these with "Good suggestion" in the comments above. Also this one:

> 2034-2034: Remove unnecessary parentheses from class definition.

If you get to push forward with this, let me know once it is helpful for me to review 😃 Thank you again!

kavanase avatar Oct 22 '25 21:10 kavanase

I've actually found a bug or two since. Haven't had a chance to push those. I'll try to make the fixes. A little new to this stuff so you'll have to bear with me :)

Yes, code should be auto-determined. And pp_folder could be too (unless it's provided).

Walser52 avatar Oct 23 '25 01:10 Walser52