default_factory on a field is called every time a datafile object changes.
This seems like very inefficient and surprising behavior. I have a datafile class with a member declared as field(default_factory=_somefunction).
Pretty much every time I touch an object of my datafile class, the default_factory is called again, twice.
from datafiles import datafile
from dataclasses import dataclass, field
from typing import Optional, Type
from rich.console import Console

cs = Console()


def _df():
    # Print a marker so every call to the default factory is visible.
    cs.print("_df called")
    return "Something"


@datafile("./test.yml")
class Stats:
    memory: int = field(default_factory=_df)
    memory_peak: int = field(default=0)


current = 2
peak = 12
cs.print(f"Current Memory: {current} Peak Memory: {peak}")
s = Stats(
    memory=current,
)
cs.print("Stats created")
s.memory_peak = peak
cs.print("peak reset")
s.memory = current
cs.print("memory reset")
cs.print(s)
Here is the output. You can see that _df is called multiple times:
└─> python df.py
Current Memory: 2 Peak Memory: 12
_df called
Stats created
_df called
_df called
peak reset
_df called
_df called
memory reset
Stats(memory=2, memory_peak=12)
Thanks for raising this! If we enable logging by adding these two lines to your example:
import logging
logging.basicConfig(level=logging.INFO)
Then we get this output:
Current Memory: 2 Peak Memory: 12
INFO:datafiles.mapper:Loading 'Stats' object from 'test.yml'
_df called
Stats created
INFO:datafiles.mapper:Saving 'Stats' object to 'test.yml'
_df called
_df called
peak reset
INFO:datafiles.mapper:Saving 'Stats' object to 'test.yml'
_df called
_df called
memory reset
Stats(memory=2, memory_peak=12)
I think one of those default factory calls per save is expected, since the object is reconstructed during the round trip to the filesystem, but I can look into any further optimizations here.
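For context, here is a minimal sketch using only the standard library dataclasses module (not datafiles itself) of why rebuilding an object triggers the factory: default_factory runs on every construction that omits the field. The calls list and the Stats(memory_peak=12) reconstruction are illustrative assumptions, not a description of datafiles internals.

```python
from dataclasses import dataclass, field

calls = []  # hypothetical call log, used only for this illustration


def _df():
    # Record each invocation instead of printing it.
    calls.append("_df called")
    return 0


@dataclass
class Stats:
    memory: int = field(default_factory=_df)
    memory_peak: int = field(default=0)


s = Stats(memory=2)              # memory supplied: factory is not called
calls_after_explicit = len(calls)

rebuilt = Stats(memory_peak=12)  # memory omitted: factory is called once
calls_after_rebuild = len(calls)
```

Any code path that builds a fresh instance without passing memory would invoke the factory once per construction, which would be consistent with the extra calls seen around each save.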
This version should make fewer calls to the default factory: https://pypi.org/project/datafiles/2.2.3/
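If the factory itself is expensive, one possible workaround (an assumption on my part, not something datafiles requires) is to memoize it with functools.lru_cache so that repeated calls return the cached value. This sketch uses a plain dataclass and a hypothetical computed list to count real evaluations. It is only appropriate when the default is immutable and safe to share between instances.

```python
from dataclasses import dataclass, field
from functools import lru_cache

computed = []  # hypothetical counter of real evaluations


@lru_cache(maxsize=None)
def _df():
    # The body runs once; later calls are served from the cache.
    computed.append(1)
    return 0


@dataclass
class Stats:
    memory: int = field(default_factory=_df)
    memory_peak: int = field(default=0)


a = Stats()
b = Stats()
```

Both instances get the same default, and the factory body is evaluated only once no matter how many times the class is instantiated.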