Active collection and active record patterns.
Traits:
- Collection: "Active Collection" features for management, such as collection and index creation; minimal, encouraging use of `get_collection()` and PyMongo methods where possible.
- Queryable: "Active Collection" bridging to "Active Record" where such proxying serves the purpose of translation from simplified Marrow Mongo concepts (such as parametric interpretation) and is demonstrably useful. Less minimal.

Discuss.
I've had an idea. My concerns over implementing the active record pattern primarily revolved around the widespread impact such an implementation would have, nearly everywhere. Custom `list`, `dict`, &c. subclasses to track state, contamination of the `Field` implementations, and so forth; it's quite a rabbit hole.
However, for a "phase 1" implementation of basic operations such effort is not necessarily required. No modification tracking of the individual complex internal structures (still largely using basic types or explicitly compatible Document for PyMongo compatibility) but only tracking assignment and deletion operations. This should be an accomplishable first milestone.
To that end, `__class__` on an instance is mutable. A `Document` mix-in (`Active`) can be developed that will enumerate the fields assigned to the subclass being constructed and, if a field is not already mixed-in with an internal implementation (`_Active`), attempt to identify a cached mixed-in version or generate one if missing, swapping `__class__` on the field.
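As a minimal sketch of the underlying mechanism, assuming plain Python classes rather than real Marrow Mongo fields: `Field`, `TrackingMixin`, and `describe()` below are hypothetical stand-ins, used only to show that reassigning `__class__` changes the behaviour of an existing instance without rebuilding it.

```python
class Field:
    """Hypothetical stand-in for a field class; not the Marrow Mongo Field."""

    def __init__(self, name):
        self.name = name


class TrackingMixin:
    """Hypothetical stand-in for the internal `_Active` implementation."""

    def describe(self):
        return 'tracking ' + self.name


# Build the mixed-in subclass once; a real implementation would cache it.
ActiveField = type('ActiveField', (TrackingMixin, Field), {})

age = Field('age')
age.__class__ = ActiveField  # Swap the class of the existing instance in place.

assert isinstance(age, TrackingMixin)
assert age.name == 'age'  # Instance state survives the swap.
```

The swap only works when the old and new classes have compatible instance layouts, which is an assumption here and would need checking against the actual field classes.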
The document-level mix-in would need to implement a `_pending` mapping of outstanding field-level operations to apply on `.save()`.
This `_Active` implementation class would need to cooperatively overload several operations:
- `__set__(self, obj, value)`: record an explicit `$set` operation.
- `__delete__(self, obj)`: record an explicit `$unset` operation, or, with a default and `assign` enabled, `$set` the default.
Only the most recent "major" operation will apply, as only one can be performed within a given update.
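One possible shape for this, as a rough sketch only: the descriptor below is a self-contained stand-in, not the real `Field`, and the `_data` store, the `User` class, and the `pending_update()` helper are assumptions made for illustration. Keying `_pending` by field name is one way to get the "most recent operation wins" behaviour.

```python
class ActiveField:
    """Hypothetical descriptor recording $set / $unset operations."""

    def __init__(self, default=None, assign=False):
        self.default = default
        self.assign = assign

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, cls=None):
        if obj is None:
            return self
        return obj._data.get(self.name, self.default)

    def __set__(self, obj, value):
        obj._data[self.name] = value
        obj._pending[self.name] = ('$set', value)  # Replaces any earlier operation.

    def __delete__(self, obj):
        if self.assign and self.default is not None:
            # With a default and `assign` enabled, deletion re-assigns the default.
            obj._data[self.name] = self.default
            obj._pending[self.name] = ('$set', self.default)
        else:
            obj._data.pop(self.name, None)
            obj._pending[self.name] = ('$unset', '')


class User:
    age = ActiveField()
    role = ActiveField(default='member', assign=True)

    def __init__(self):
        self._data = {}
        self._pending = {}  # field name -> (operator, value)

    def pending_update(self):
        """Group pending operations into a MongoDB-style update document."""
        update = {}
        for name, (operator, value) in self._pending.items():
            update.setdefault(operator, {})[name] = value
        return update


user = User()
user.age = 20
user.age = 21  # Overwrites the pending $set of 20.
del user.age   # Overwrites the pending $set with an $unset.
del user.role  # Has a default with assign enabled, so becomes a $set.
```

A `.save()` built on this would presumably pass the result of `pending_update()` to a PyMongo update call and then clear `_pending`.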
Additional points from discussion today:
- Some Python-side operations cannot be trapped or detected. The trivial example is that of incrementing a user's age: `user.age += 1` resolves to Python opcodes equivalent to `getattr`, `add`, and `setattr`, making the operation indistinguishable from simply assigning an exact integer value. (Marrow Mongo would only "see" the retrieval and assignment during `__get__` and `__set__`. Attempting to heuristically infer the operation applied by comparison against the original value is a hard nope.)
- Methods utilizing parametric operations should be provided which apply the operation to the Python-side representation, such as `.update(inc__age=1)`, and enqueue the appropriate MongoDB-side operation for application when invoking `.save()`.
- Multiple operations should be combined where possible. Following from the above examples, if we initially assign an age with `user.age = 20`, do not save, then invoke `user.update(inc__age=1)`, a single pending operation should exist as a `$set` whose value is `21`, and `user.age == 21` should hold true. (This would be the case if you were to perform the operation Python-side using `user.age += 1`.)
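The combining rule from the last point could look roughly like the following. This is a sketch under stated assumptions: `User`, `assign()`, and the `_data` / `_pending` layout are hypothetical, and only the `inc` parametric operation is handled.

```python
class User:
    """Hypothetical document stand-in; only `inc` is handled in this sketch."""

    def __init__(self, **loaded):
        self._data = dict(loaded)  # Values as loaded from the database.
        self._pending = {}         # field name -> (operator, value)

    def assign(self, name, value):
        """Equivalent of `user.<name> = value` on an active field."""
        self._data[name] = value
        self._pending[name] = ('$set', value)

    def update(self, **parametric):
        for key, amount in parametric.items():
            operation, _, name = key.partition('__')
            if operation != 'inc':
                raise NotImplementedError(operation)

            # Apply the operation to the Python-side representation first.
            self._data[name] = self._data.get(name, 0) + amount

            previous = self._pending.get(name)
            if previous and previous[0] in ('$set', '$unset'):
                # An unsaved assignment or removal exists: fold into one $set.
                self._pending[name] = ('$set', self._data[name])
            elif previous and previous[0] == '$inc':
                # Consecutive increments accumulate into a single $inc.
                self._pending[name] = ('$inc', previous[1] + amount)
            else:
                self._pending[name] = ('$inc', amount)


user = User()
user.assign('age', 20)
user.update(inc__age=1)
assert user._pending == {'age': ('$set', 21)}

stored = User(age=30)
stored.update(inc__age=1)
stored.update(inc__age=1)
assert stored._pending == {'age': ('$inc', 2)}
```

Keeping the `$inc` when nothing else is pending preserves the atomic server-side increment; it only collapses into a `$set` when an unsaved assignment already fixes the final value.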
Factory tested in small-scale, it works: [edit to compactify and handle the edge case of one already mixed-in]
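
    class CachedMixinFactory(dict):
        def __missing__(self, cls):
            # Already mixed-in: return as-is, without caching.
            if issubclass(cls, _ActiveField): return cls
            # Build the mixed-in subclass using the field's own metaclass.
            new_class = cls.__class__('Active' + cls.__name__, (cls, _ActiveField), {})
            self[cls] = new_class
            return new_class
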
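For anyone wanting to poke at the factory in isolation, here is a self-contained run of it. `Field`, `String`, and the empty `_ActiveField` are placeholder classes assumed for the demonstration, not the real Marrow Mongo ones; the factory body itself is the one above.

```python
class _ActiveField:
    """Placeholder for the field-level mix-in."""


class Field:
    """Placeholder base field."""


class String(Field):
    """Placeholder concrete field."""


class CachedMixinFactory(dict):
    def __missing__(self, cls):
        if issubclass(cls, _ActiveField):
            return cls
        new_class = cls.__class__('Active' + cls.__name__, (cls, _ActiveField), {})
        self[cls] = new_class
        return new_class


cache = CachedMixinFactory()
ActiveString = cache[String]

assert ActiveString.__name__ == 'ActiveString'
assert issubclass(ActiveString, (String, _ActiveField))
assert cache[String] is ActiveString        # Second lookup hits the cache.
assert cache[ActiveString] is ActiveString  # Already mixed-in: returned unchanged.
assert len(cache) == 1                      # The edge case is not stored.
```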
Now to test the example implementation… 👹
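
    class _Active(Document):
        __field_cache = CachedMixinFactory()

        def __attributed__(self):
            """Automatically mix active behaviours into field instances used during declarative construction."""
            # Assumes __attributes__ is a mapping of name to field instance.
            for name, field in self.__attributes__.items():
                # The factory is keyed by class, so look up the field's class, not the instance.
                field.__class__ = self.__field_cache[type(field)]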