PDBList.py update_pdb does not consider multiple file formats (in some parts...)

Open JoshuaMeyers opened this issue 6 years ago • 6 comments

Setup

I am reporting a problem with Biopython version, Python version, and operating system as follows:

import sys; print(sys.version)
import platform; print(platform.python_implementation()); print(platform.platform())
import Bio; print(Bio.__version__)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
CPython
Darwin-16.7.0-x86_64-i386-64bit
1.71

The PDBList.py method update_pdb is extremely useful, but just requires some minor updating!

Expected behaviour

  1. It should move obsoleted pdb files to the specified path
  2. It should return the updates that have been applied (to provide a diff)

Actual behaviour

  1. The part of the function that moves obsolete pdb entries has not kept up with the change to accept multiple file_format values. It therefore checks only for pdb{pdb_code}.ent and reports "Obsolete file [...]PDB/tt/pdb5tti.ent is missing" (even if .ent isn't the specified file_format)
  2. Currently the function does not return anything
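A minimal sketch of what a format-aware version of the obsolete-file step could look like, offered only as an illustration and not as Biopython's actual implementation. The filename table and the helper names local_path and move_obsolete are hypothetical. The per-format filename patterns are an assumption modelled on the pdb{pdb_code}.ent naming and the PDB/tt/ directory layout quoted above.

```python
import os
import shutil

# Assumed per-format filename patterns (hypothetical table for illustration;
# only the "pdb" pattern is confirmed by the error message in this issue).
FILENAME_PATTERNS = {
    "pdb": "pdb{code}.ent",
    "mmCif": "{code}.cif",
    "xml": "{code}.xml",
    "mmtf": "{code}.mmtf",
    "bundle": "{code}-pdb-bundle.tar",
}


def local_path(root, pdb_code, file_format="pdb"):
    """Build <root>/<middle two characters of code>/<format-specific name>."""
    code = pdb_code.lower()
    filename = FILENAME_PATTERNS[file_format].format(code=code)
    return os.path.join(root, code[1:3], filename)


def move_obsolete(pdb_root, obsolete_root, obsolete_codes, file_format="pdb"):
    """Move obsolete entries and return a list of (old_path, new_path) pairs."""
    moved = []
    for pdb_code in obsolete_codes:
        old = local_path(pdb_root, pdb_code, file_format)
        if not os.path.isfile(old):
            continue  # nothing stored locally in this format
        new = local_path(obsolete_root, pdb_code, file_format)
        os.makedirs(os.path.dirname(new), exist_ok=True)
        shutil.move(old, new)
        moved.append((old, new))
    return moved
```

Returning the list of moved paths is one way to give callers the "diff" requested in point 2. The real method could equally return the lists of new, changed and obsolete codes.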

Steps to reproduce

from Bio.PDB import PDBList

pdbl = PDBList(server='ftp://ftp.wwpdb.org', pdb='/PDB', obsolete_pdb='/PDB/obsolete', verbose=True)
pdbl.update_pdb()

JoshuaMeyers avatar Jun 18 '18 15:06 JoshuaMeyers

@JoshuaMeyers can you see how to fix this, and submit a pull request please?

If not, @dadoskawina would you like to work on this as a followup to your contributions in #943?

Thank you both.

peterjc avatar Jun 19 '18 07:06 peterjc

Hi!

If this hasn't been fixed yet, I can tackle it. We'll see how it goes.

Cheers,

poleshe avatar Jan 17 '22 18:01 poleshe

Yes please @poleshe - please start with re-testing to confirm the problem is still there.

(edited to fix autocorrect typo)

peterjc avatar Jan 18 '22 09:01 peterjc

Greetings, I can confirm this still happens. update_pdb still searches only for ".ent" files, even when a "file_format" argument is passed.

pdbl.update_pdb(file_format="xml")

[two screenshots, labelled "pdb" and "missing"]

Will develop a fix for this.

Have a nice weekend!

poleshe avatar Jan 21 '22 23:01 poleshe

Thanks @pycreatine :)

poleshe avatar Apr 23 '23 19:04 poleshe

Fixed by #4288:

[three screenshots]

No format (default): [three screenshots]

All other formats work as expected as well.

Observed during testing, when using an invalid format:

[two screenshots]

It may be worth opening a separate issue to raise this, with a proper traceback.
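For illustration only, one way an invalid format could be rejected up front with a clear exception is sketched below. The helper name check_file_format and the tuple of accepted values are assumptions, not taken from the Biopython source or from #4288.

```python
# Assumed set of accepted formats (hypothetical; check the PDBList docs).
SUPPORTED_FORMATS = ("pdb", "mmCif", "xml", "mmtf", "bundle")


def check_file_format(file_format):
    """Return file_format unchanged, or raise ValueError if it is unknown."""
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported file_format {file_format!r}; "
            f"expected one of {', '.join(SUPPORTED_FORMATS)}"
        )
    return file_format
```

Calling such a check at the start of the method would surface the problem as a normal Python exception before any files are touched.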

Thanks!

poleshe avatar May 22 '23 08:05 poleshe