pony
                                
                                 pony copied to clipboard
                                
                                    pony copied to clipboard
                            
                            
                            
                        Simple Enum Support
Abstract
This proposes support for normal type enums to be supported.
There would be no special tables added to model those inside the database. They would be implemented only on a pure python level. The database would simply store the value of those.
Previous implementation attempt, e.g. #346 and #392, needed to change the type of the field to store the names string, thus introducing an overhead and possible headaches like max length of a VARCHAR(x) and such.
Storing the underlaying values, e.g. int of an IntEnum allows it to use the same code used for int, as well as all related settings when defining them in models.
Introduction
In python Enums can have a native type associated with them.
The popular IntEnum (from enum import IntEnum) is defined as the following:
class IntEnum(int, Enum):  
    """Enum where members are also (and must be) ints"""
which would be typically be used as:
class FoobarEnum(IntEnum):
   MANGO = 1
   BANANA = 2
   ANSWER = 42
   SOMETHING = 123
   WORLD_DOMINATION = 666
   EASTEREGG = 4458
The Enum class now automatically makes all the attributes the same type as the Enum class.
So here that's int and Enum
Now, because FoobarEnum (and so FoobarEnum.MANGO, FoobarEnum.BANANA, etc.) are already valid int types:
>>> FoobarEnum.MANGO + FoobarEnum.SOMETHING
124
Those can already be used when writing to the database.
Having pony model
class Whatever(db.Entity):
    id = …
    enum_field = Required(int)
you can easily set that with the enum:
new_whatever = Whatever(enum_field=FoobarEnum.MANGO)
some_whatever = Whatever.get(enum_field=FoobarEnum.BANANA)
The problem here is they won't be converted back to that enum class:
>>> another_whatever.enum_field
4458
>>> some_enum
<FoobarEnum.EASTEREGG: 4458>
>>> isinstance(FoobarEnum.EASTEREGG, int)
True
>>> isinstance(FoobarEnum.EASTEREGG, FoobarEnum)
True
>>> isinstance(another_whatever.enum_field, int)
True
>>> isinstance(another_whatever.enum_field, FoobarEnum)
False
Proposal
The proposal is to store the underlying enum values, as those are basically already implemented, and all the corner cases in queries and such should be ironed out. So only the retrieval must have an additional step of loading the value in the right Enum type.
This is as easy as FoobarEnum(4458):
val = 4458
py_type = FoobarEnum
>>> py_type(val)
<FoobarEnum.EASTEREGG: 4458>
>>> py_type(val) == FoobarEnum.EASTEREGG
True
Implementation Approaches
Both share that the upgrade would be strait forward replacing a type with any fitting class being subclass of that original type (here: int) and implementing enum.Enum. An example would be enum.IntEnum:
class Whatever(db.Entity):
   id = …
-  enum_field = Required(int,        size=8, min=0, max=4458, nullable=True, default=IntEnum.ANSWER)
+  enum_field = Required(FoobarEnum, size=8, min=0, max=4458, nullable=True, default=IntEnum.ANSWER)
This can be monkey patched for testing purposes, like the following (click to expand)
from pony.orm.dbproviders.mysql import MySQLProvider
from pony.orm.dbproviders.sqlite import SQLiteProvider
from pony.orm.dbproviders.postgres import PGProvider
from pony.orm.dbproviders.oracle import OraProvider
MySQLProvider.converter_classes = [(Enum, EnumConverter)] + MySQLProvider.converter_classes
SQLiteProvider.converter_classes = [(Enum, EnumConverter)] + SQLiteProvider.converter_classes
PGProvider.converter_classes = [(Enum, EnumConverter)] + PGProvider.converter_classes
OraProvider.converter_classes = [(Enum, EnumConverter)] + OraProvider.converter_classes
Both work for enums with native types and the additional types supported by pony.
So class IntEnum(int, Enum), class StrEnum(str, Enum), class BoolEnum(bool, Enum).
Even more unusual enum types like class DateEnum(datetime.datetime, Enum), etc. would work.
Implementation Approach A
from pony.orm.dbapiprovider import IntConverter, StrConverter, Converter
from enum import Enum
from typing import Type
class EnumConverter(Converter):
    @staticmethod
    def _get_real_converter(py_type) -> Type[Converter]:
        """
        Gets a converter for the underlying type.
        :return:
        """
        if issubclass(py_type, int):
            return IntConverter
        elif issubclass(py_type, str):
            return StrConverter
        else:
            raise TypeError('only str and int based enums supported')
        # end if
    # end def
    def __init__(self, provider, py_type, attr=None):
        super(EnumConverter, self).__init__(provider=provider, py_type=py_type, attr=attr)
        self.converter_class = self._get_real_converter(self.py_type)
        self.converter = self.converter_class(provider=self.provider, py_type=self.py_type, attr=self.attr)
    # end if
    def init(self, kwargs):
        self.converter.init(kwargs=kwargs)
    # end def
    def validate(self, val, obj=None):
        assert issubclass(self.py_type, Enum)
        assert issubclass(self.py_type, (int, str))
        return self.converter.validate(val=val, obj=obj)
    # end def
    def py2sql(self, val):
        return self.converter.py2sql(val=val)
    # end def
    def sql2py(self, val):
        return self.converter.sql2py(val=val)
    # end def
    def val2dbval(self, val, obj=None):
        """ passes on the value to the right converter """
        return self.converter.val2dbval(val=val, obj=obj)
    # end def
    def dbval2val(self, dbval, obj=None):
        """ passes on the value to the right converter """
        py_val = self.converter.dbval2val(self, dbval=dbval, obj=obj)
        if py_val is None:
            return None
        # end if
        return self.py_type(py_val)  # SomeEnum(123) => SomeEnum.SOMETHING
    # end def
    def dbvals_equal(self, x, y):
        self.converter.dbvals_equal(self, x=x, y=y)
    # end def
    def get_sql_type(self, attr=None):
        return self.converter.get_sql_type(attr=attr)
    # end def
    def get_fk_type(self, sql_type):
        return self.converter.get_fk_type(sql_type=sql_type)
    # end def
# end class
Possibly self._get_real_converter(self.py_type) can be replaced by simply using self.provider.get_converter_by_py_type(self.py_type).
It is just added above as an extra function to illustrate the working of it better.
Needed changes:
- The converter_classesneed to includeEnumConverteras well.- It must be one of the first elements there, so that the EnumConverterclass would always be chosen before any class implementing the underlying enum type, likeint.
- This needs to be done for all the database providers (pony.orm.dbproviders). Example forpony.orm.dbproviders.mysql:
 
- It must be one of the first elements there, so that the 
    converter_classes = [
+       (Enum, dbapiprovider.EnumConverter),
        (NoneType, dbapiprovider.NoneConverter),
        (bool, dbapiprovider.BoolConverter),
        (basestring, MySQLStrConverter),
        (int_types, dbapiprovider.IntConverter),
        (float, MySQLRealConverter),
        …
    ]
Implementation Approach B
Here the existing Converters would be modified to support enums.
In their dbval2val they would be modified to include the following code before returning the value:
if is subclass(self.py_type, Enum):
   return self.py_type(val)
return val
Implementation Decision
Probably A is the cleaner approach as that requires less places to modify (or forget to do so), and also works with all database-native types as well as future types as well.
Limitations
The special meaning a enum has, that is both having a name and making sure those values are whitelisted will not be reflected inside the database, it would only be enforced by the python side.
Only enums combining native types (or other types already supported by pony) can be used, as the value is stored, not the keys. This means no enums combining multiple types. The following would be impossible:
class MixedEnum(enum.Enum):
  FOO = None
  BAR = True
  BATZ = False
  MORE = 123
  STUFF = "ponies!"
  TO_ADD = datetime.now()
Sample Text
Any plans to release this soon? Also, I doesn't look like it supports Enums in select()
@matee911 You mean currently, or one of the planned approaches (A or B)?
I'd now change the _get_real_converter of EnumConverter to:
    def _get_real_converter(self, py_type):
        """
        Gets a converter for the underlying type.
        :return: Type[Converter]
        """
        for t, converter_cls in self.provider.converter_classes:
            if issubclass(t, EnumConverter):
                # skip our own type, otherwise this could get ugly
                continue
            # end if
            if issubclass(py_type, t):
                return converter_cls
            # end if
        # end for
        throw(TypeError, 'No database converter found for enum base type %s' % py_type)
    # end def
(no longer a @staticmethod)
That should allow to use all the types the database/pony supports.
I'll mirror the discussion we have in telegram here.
Alexander Kozlovsky, 31. Jan 2021:
[1.] I see the practical problem with enums in the fact that many programmers will want to change the list of values some time later. Some of them will change enum values in Python without understanding that the values in the database remain the same, and then their application will stop working correctly.
[2.] I have a look at #502 and #585. In general it looks good. I'm not ready to include it right now, need to think about it a bit more.
[3.] Currently Pony supports Python 2.7 as well, so it should work on 2.7 too.
[4.] On the other side, Pony does not depend on other packages, so I want to avoid specifying dependency to some backported version of Enum for Python 2.7
[5.] The methods _prepare_int_kwargs and _prepare_str_kwargs look too complicated to me. I'm not sure it is good to specify things like max_len looking at the list of specified values. Most users will want to extend the number of possible values later, and with specified max_len or overly restricted int size, it will not be possible to extend the list of elements without some database migration.
[6.] Some people expect from enum to support database enums, for example, for PostgreSQL. I think it is a different topic to add database enums support, but it is good to understand, is the proposed solution compatible with such future addition.
[7.] Will return to this PR on the next week
[Numbers added to easier reference them.]
Thanks for taking your time for looking into this. I'll respond to your paragraphs below, counting them from the top, starting at 1.
1) I think the fact that this is no database side enum but actually just the values in the database is a fact which just has to be documented clearly, really.
2) Thanks! I think that will really be helpful for working with enum values.
3+4) Okey, I'll check py2 that in detail.
By the way, how long is support for 2.7 planned to be kept? It went End-of-life  over a year ago, after all, so adding a backport (enum34 or aenum) just for those could still be a viable solution. An alternative I could look into would be making that enum import fail in a graceful way which would simply disable enum support on that old system. If you still use python 2.7, you're not used to enums anyway.
5) _prepare_*_kwargs:
I think of those as a convince method. If a user set's any of those values, those will be take into account properly.
Again as in 1, this will be documented in the documentation in detail, so that people who want an enum with a big size can simply set those values appropriately from the start.
As soon as we have a migration in place which can handle the change of the size attribute of a int, this should be possible to mirror into this as well.
But of course that one could always use the biggest int available and call it a day.
6+1) I still think that doesn't really matter for this approach too much. If database enums are added later it could be done in a transparent and forward compatible way. Either opt-in or opt-out. I am thinking Required(SomeEnum, database=True) or something similar (don't pin me on that variable name).
In fact, from reading up on enum types in postgres, I think it is compatible as long as you manually run a CREATE TYPE someenum AS ENUM ('sad', 'ok', 'happy'); and ALTER TABLE "foobar" ALTER some_column TYPE someenum USING some_column::someenum; to create the enum and change the column type of your table to use that manually.
The only thing this is missing is the automised approach, but for that we should probably do migrations first, and thus I would want to keep those topics separated.
[typos fixed]
Additionally I want to add, there was asked about a key based version of this, so an int enum would still store a string like FIRST_VALUE, and I think that could be easily integrated as a store_keys=True key (or whatever the name would end up being), if so desired it should be quite trivial to add as well.
luckydonald, 10. February 2021:
3+4) I had a bit of thought about py2 compatibility.
For that I would rewrite it to try to import std-lib Enum, and aenum, but if none of those are available disable that converter.
That way you'd either install aenums or be on a supported system, or you can't use it.
I mean I if there's no Enums in py2 one can simply use normal numbers instead. We don't need to support something which doesn't exist on py2.
5) Regarding _prepare_*_kwargs:
I how about rewriting it so that it does the needed checks (i.e. the given enum fits max_size), but that 'automatic' size finding would be opt-in?
So for example having something like automatic_size=True as parameter to turn on the magic or keep it off and it would simply use the biggest int64 and not set max and min for you.
6) Regarding database enums, Did you had some time to think about it more, if that would be a deal breaker?
[3+4] I think conditional import is ok
[5] think using int64 is not necessary, as all enums with explicitly defined values should fit to default int size.
The same for strings, I think default str without explicitly specified size should be ok most of the time. In SQLite and PostgreSQL strings may be of any size, and in MySQL it will be
VARCHAR(255)by default
automatic_sizelooks good, how about renaming it to a shorterauto_size?
@luckydonald I'm trying to use your fork. Works very nice with a sqlite-DB 👍 . With a mysql-DB I'm getting this exception:
 UnexpectedError: Object Artifact[new:1] cannot be stored in the database. DataError: 1366 Incorrect integer value: 'ProcessingState.NEW' for column 'processingstate' at row 1
is there a reason it shouldn't work against mysql? The traceback doesn't look useful to me, but I can share it of course if you think it is useful
@arian-f Good find. Sorry for not looking into it yet. Yes, please, if you still have it laying around or can reproduce it.
I would also especially be interested in the parts of the Pony model which breaks it, so basically a minimal code example making it go boom:boom: would be great!
- Pony model definition
- Enum definition
- Python code making the breaking database query
- Exception message and stack trace
Or considering how long it's been ago, anything you can remember about it 😅
Nice Idea @luckydonald, is there any ambition merging this? maybe @luckydonald could open an MR with the necessary changes and we can have a look whether the maintainer will have a blink on it
Hi all. Is there any movement on this?
I was bitten by this yesterday and it took me a while to figure out what was going on. For anyone else who Googles their way to this page, my workaround was to use pony.orm.raw_sql like so:
raw = [
    'x.some_col=="{}"'.format(MyEnum.some_attr.value),
    'x.some_other_col=="{}"'.format(MyEnum.some_other_attr.value),
]
res = orm.select(
    x for x in MyDbEntity if orm.raw_sql(
        " and ".join(raw)
    )
)
@tweigel-dev In fact has a PR ready (#585), and I did test it a bit as well. See the unit tests there. Maybe give it a go and say if it's working for you.
@austinjp not that I'm aware of, that PR is waiting for review.
Actually it would need a merge with the new changes, but hopefully that should still work.