pypdf MAINT: Use consistent method keywords

MAINT: Use consistent method keywords

Open mtd91429 opened this issue 1 year ago • 4 comments

Explanation

As PyPDF2 has evolved, some of the internal naming conventions have not been consistent. One recently addressed issue was the outline/bookmark component; I think that was the largest example. As I have been exploring the code base, I have noticed others. I propose that these keywords be systematically implemented across the library and that the deprecation process utilizes an internally consistent process.

Below are the list of keywords that I can see which have inconsistent implementation (only including non-deprecated methods):

`pagenum`, `page_number`, `pageNumber`

pagenum is used in:
- PdfWriter.addBookmark
- PdfWriter.add_outline_item
- PdfWriter.add_named_destination
- PdfWriter.add_uri
- PdfWriter.add_link
pageNumber is used in:
- PdfWriter.get_page
page_number is used in:
- PdfWriter.add_annotation
- PdfWriter.get_page

`pagedest` and `dest`

pagedest is used in:
- PdfWriter.add_link
dest is used in:
- PdfWriter.add_outline_item_destination
- PdfWriter.add_named_destination_object

`indirect_ref` and `ido`

indirect_ref is used in:
- PdfReader._get_page_number_by_indirect
- PdfReader._flatten
indirect_reference is used in:
- PdfReader._get_object_from_stream
ido is used in:
- PdfWriter.get_object

`pwd` and `password`

password is used in:
- PdfReader.decrypt
pwd is used in:
- PdfWriter.encrypt (as user_pwd, owner_pwd)

Extrinsically inconsistent methods

There are some deprecated methods with keywords that have been changed in the updated function. For example, PdfWriter.addAttachment uses fname and PdfWriter.add_attachment uses filename or PdfWriter.removeImages uses ignoreByteStringObject and PdfWriter.remove_images uses ignore_byte_string_object. These changes are silent.

Intrinsically inconsistent methods

There are some methods which use keywords that are both snake_case and "smooshed" case. For example, PdfReader._read_xref_tables_and_trailers has both startxref and xref_issue_nr,

Inconsistent deprecation implementation

Some of the keywords have been deprecated within the methods. For example, PdfWriter.get_page. Whereas other keywords have been deprecated via decorator; for example, PdfWriter.add_outline_item_dict.

Concluding remarks (for this long winded post)

I don't think all of these examples necessarily need to be changed; I've included them to be as thorough as possible. However, in my opinion, snake_case is best and explicit is better than implied via abbreviation (usually). Please list additional examples that you see and any proposed approaches.

Jul 31 '22 01:07 mtd91429

Wow, amazing work! 😮 Thank you so much 👏👏🤗

I completely agree with your concluding remarks!

Jul 31 '22 01:07 MartinThoma

I would completely ignore already deprecated functions / parameters. Think of them as no longer being present in the library. You're seeing a ghost of the past there :-)

Jul 31 '22 01:07 MartinThoma

I have another one: Instead of page_number or start_page_number or similar, PdfMerger.merge uses position

Aug 19 '22 04:08 MartinThoma

I have another one: Instead of page_number or start_page_number or similar, PdfMerger.merge uses position

Agree, Page_number should be prefered

Aug 19 '22 09:08 pubpub-zz

startxref comes from the PDF standard and thus is consistent. The rest was adjusted :-)

Dec 10 '22 17:12 MartinThoma

pypdf pypdf copied to clipboard

MAINT: Use consistent method keywords

Explanation

pagenum, page_number, pageNumber

pagedest and dest

indirect_ref and ido

pwd and password

Extrinsically inconsistent methods

Intrinsically inconsistent methods

Inconsistent deprecation implementation

Concluding remarks (for this long winded post)

pypdf
pypdf copied to clipboard

`pagenum`, `page_number`, `pageNumber`

`pagedest` and `dest`

`indirect_ref` and `ido`

`pwd` and `password`