shopify_python
shopify_python copied to clipboard
Should private or protected methods be preferred?
When defining class methods, we can hint that a function is protected by prefacing it with _
or we can invoke name-mangling by prefacing it with __
. When there isn't an obvious reason why someone would want to override a method but there's also not an obvious reason why someone should not be able to, which should be our default?
For a more specific example, see here.
@erikwright @JasonMWhite
I am strongly in favour of composition over inheritance. The upshot of that is that I don't really see a place for inheritance (apart from pure-virtual / interface inheritance). If you follow that, there's not really a role for "protected" methods.
Another reason people use them is because they want to be able to test those methods.
I also don't believe in testing private methods (or methods that would be private if they weren't exposed for testing purposes). I consider it a code smell. If the full implementation of a class cannot be exercised via its public methods you probably need to extract some helper methods or some other classes.
In terms of "not an obvious reason why someone should not be able to" override a method, keep in mind that (1) they can not only override it but also invoke it and (2) if you don't consider it part of the API you are unlikely to have thought about all the ways that it could be invoked or how a user might override it and whether your code is resilient to all of them. In summary, you should assume that things will break if someone invokes/overrides it. There's a cost to allowing it to be part of your API surface.
With all of that in mind, I am in favour of double-underscores for any method or property that is not part of the public API of a class.
CC @honkfestival on these discussions too
The upshot of that is that I don't really see a place for inheritance (apart from pure-virtual / interface inheritance). If you follow that, there's not really a role for "protected" methods.
💯
This is an argumentation in favor of using single underscore for declaring internal attributes (methods or variables).
1. PEP8 specify exactly this:
Use one leading underscore only for non-public methods and instance variables.
...
Even with __all__ set appropriately, internal interfaces
(packages, modules, classes, functions, attributes or other names)
should still be prefixed with a single leading underscore.
https://www.python.org/dev/peps/pep-0008/#public-and-internal-interfaces
2. The world warns us not to use name mangling to emulate private attributes
To avoid name clashes with subclasses, use two leading underscores to invoke
Python's name mangling rules.
https://www.python.org/dev/peps/pep-0008/#method-names-and-instance-variables
Python's runtime does not restrict access to such members, the mangling only
prevents name collisions if a derived class defines a member with the same name.
https://en.wikipedia.org/wiki/Name_mangling#Python
Or https://softwareengineering.stackexchange.com/a/229807 Or http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#naming
3. This is the Python culture
This whole stackoverflow response is great: http://stackoverflow.com/a/7456865
For a Pythonista, encapsulation is not the inability of seeing internals of classes,
but the possibility of avoiding to look at it. I mean, encapsulation is the property
of a component which allows it to be used without the user being concerned about the
internal details. If you can use a component without bothering yourself about its
implementation, then it is encapsulated (in the opinion of a Python programmer).
4. Google also has a large codebase and still doesn't require it
The Google Python style guide is a great addition to PEP8. It adds some very legitimate guidelines on PEP8 inaccuracies to keep a codebase readable by everyone and maintainable at scale.
5. We should code in Python rather than MyCompanyPython
Basing the Shopify Python Style on Google Python style guide is the best thing to do.
It means:
- We are adopting the generally known breed of Python, the one everyone knows.
- That our Python won't look different that most of the libraries we use.
- We can hire Python devs and we won't need to teach them MyCompanyPython
Python is evolving to meet the needs of more sophisticated codebases. Type annotations are a great example of this.
Our objective in establishing a style guide is to innovate on the way Python is written at Shopify. While consistency with external practices is a good place to start, adherence to those practices should not override forward progress. Nor should a founding-fathers argument that the language should be used exactly as it was originally intended.
Double-underscores force developers to think about how to design their classes to be both well factored / encapsulated and well tested. It's a useful mechanism, and the best available mechanism, for implementing a concept that is demonstrably useful in large codebases.
With regards to Google and their own style:
- The publicly published style-guide is an adaptation of the internal style-guide
- Just as we use the style-guide "with amendments", it is also used internally at Google "with amendments"
- Not every Google project is large-scale, they have many, many, many, smaller projects. Their formal style guides represent only what is universal to all of their projects.
In order to have a fruitful discussion about a particular style guideline, let's discuss how it might be helpful or harmful in a Shopify-specific context.
Note: Shopify's python standards are almost completely, with only a few exceptions, the Google python standard. We explicitly incorporate them by reference, and have implemented several custom pylint rules to catch some of them.
On the single/double underscore question, I have found double-underscores useful in designing good classes, thinking consciously about what should be inheritable/overridable. I find them inconvenient when debugging.
In simple cases, where a class is not going to be inherited, there doesn't seem to be any advantage in going with double-underscores.
In order to have a fruitful discussion about a particular style guideline, let's discuss how it might be helpful or harmful in a Shopify-specific context.
That was my intent, I guess my comment was too long... I wanted to make sure that readers of this Issue understand the Python ecosystem. It's a major aspect of the discussion since Shopify has chosen to use Python mostly because of its libraries (if I understand correctly).
Following best practices, like jumping on type-hinting is clearly must-do. But the use of double underscore is actually the past:
Google has apparently only one internal amendment to the "Naming" section, which discourage using name mangling.
Facebook is just following flake8
religiously.
None of the important dependencies of starscream is doing this.
What you want exists, it's the simple underscore, and the tooling to enforce this is already here: https://github.com/PyCQA/pylint/blob/ecd9d9e7db0d67db0691b1bbc869d82b915bc988/pylint/checkers/classes.py#L334-L338
************* Module poi.__main__
W: 4, 6: Access to a protected member _internal of a client class (protected-access)