using dataclasses
Hi,
I started of my own version of this, but found yours basically the last moment … My name of choice was objset, but the poll is finished I guess.
I used dataclasses - ported here. And updated the fields to match current OpenZFS git.
I made a similar effort to use dataclasses shortly after 2.0's release, but decided against it because it reduced performance for a pretty arguable code improvement. Your approach seems to perform even worse than mine, perhaps because of the use of reflection:
Benchmark 1: python3.11 ./ioztat rpool 0.000001 500 >/dev/null
Time (mean ± σ): 10.006 s ± 0.063 s [User: 8.456 s, System: 3.738 s]
Range (min … max): 9.895 s … 10.097 s 10 runs
Benchmark 2: python3.11 /usr/local/bin/ioztat rpool 0.000001 500
Time (mean ± σ): 5.616 s ± 0.042 s [User: 4.184 s, System: 3.642 s]
Range (min … max): 5.543 s … 5.678 s 10 runs
Summary
'python3.11 /usr/local/bin/ioztat rpool 0.000001 500' ran
1.78 ± 0.02 times faster than 'python3.11 ./ioztat rpool 0.000001 500 >/dev/null'
Missing slots - I did not consider this to be optimized for performance.
But nice comparison - what did you use for the numbers?
Results are generated with hyperfine.
I tried __slots__ on baseline and found no measurable difference.
Dropping reflection and looking on the raw performance to create an object and access the properties for normal/slots/dataclasses:
import dataclasses
import timeit
import datetime
import copy
import typing
class Dataset:
"""
ZFS dataset statistics over a timespan in seconds
"""
name = ''
reads = 0
nread = 0
writes = 0
nwritten = 0
nunlinks = 0
nunlinked = 0
timespan = 0
def __init__(self, name=''):
self.name = name
@classmethod
def from_dict(cls, data):
d = cls(data['dataset_name'])
d.reads = int(data['reads'])
d.nread = int(data['nread'])
d.writes = int(data['writes'])
d.nwritten = int(data['nwritten'])
d.nunlinks = int(data.get('nunlinks', 0))
d.nunlinked = int(data.get('nunlinked', 0))
d.timespan = int(data['timestamp']) / 1e9
return d
def __add__(self, other):
"""
Return a new Dataset with this one's values added to other
Note timespan is set to the maximum of the two values instead of being
added.
"""
d = copy.copy(self)
# d = Dataset(self.name)
d.reads += other.reads
d.nread += other.nread
d.writes += other.writes
d.nwritten += other.nwritten
d.nunlinks += other.nunlinks
d.nunlinked += other.nunlinked
d.timespan = max(d.timespan, other.timespan)
return d
class OptimizedDataset:
"""
ZFS dataset statistics over a timespan in seconds
"""
def __init__(self, name='', reads=0, nread=0, writes=0, nwritten=0, nunlinks=0, nunlinked=0, timespan=0):
self.name = name
self.reads = reads
self.nread = nread
self.writes = writes
self.nwritten = nwritten
self.nunlinks = nunlinks
self.nunlinked = nunlinked
self.timespan = timespan
@classmethod
def from_dict(cls, data):
d = cls(data['dataset_name'],
int(data['reads']),
int(data['nread']),
int(data['writes']),
int(data['nwritten']),
int(data.get('nunlinks', 0)),
int(data.get('nunlinked', 0)),
int(data['timestamp']) / 1e9)
return d
def __add__(self, other):
"""
Return a new Dataset with this one's values added to other
Note timespan is set to the maximum of the two values instead of being
added.
"""
d = OptimizedDataset(self.name,
self.reads + other.reads,
self.nread + other.nread,
self.writes + other.writes,
self.nwritten + other.nwritten,
self.nunlinks + other.nunlinks,
self.nunlinked + other.nunlinked,
max(self.timespan, other.timespan))
return d
class SlotsDataset:#(Dataset):
__slots__ = ["name","reads","nread","writes","nwritten","nunlinks","nunlinked","timespan"]
"""
ZFS dataset statistics over a timespan in seconds
"""
def __init__(self, name='', reads=0, nread=0, writes=0, nwritten=0, nunlinks=0, nunlinked=0, timespan=0):
self.name = name
self.reads = reads
self.nread = nread
self.writes = writes
self.nwritten = nwritten
self.nunlinks = nunlinks
self.nunlinked = nunlinked
self.timespan = timespan
@classmethod
def from_dict(cls, data):
d = cls(data['dataset_name'],
int(data['reads']),
int(data['nread']),
int(data['writes']),
int(data['nwritten']),
int(data.get('nunlinks', 0)),
int(data.get('nunlinked', 0)),
int(data['timestamp']) / 1e9)
return d
def __add__(self, other):
"""
Return a new Dataset with this one's values added to other
Note timespan is set to the maximum of the two values instead of being
added.
"""
d = SlotsDataset(self.name,
self.reads + other.reads,
self.nread + other.nread,
self.writes + other.writes,
self.nwritten + other.nwritten,
self.nunlinks + other.nunlinks,
self.nunlinked + other.nunlinked,
max(self.timespan, other.timespan))
return d
@dataclasses.dataclass #(slots=True)
class DataclassDataset:
"""
ZFS dataset statistics over a timespan in seconds
The Dataset fields are "documented" here:
https://github.com/openzfs/zfs/blob/9681de4657686d0ed19ca18d578513e74395f00f/module/zfs/dataset_kstats.c#L32
"""
timestamp: int = 0
dataset_name: str = ''
writes: int = 0
nwritten: int = 0
reads: int = 0
nread: int = 0
nunlinks: int = 0
nunlinked: int = 0
timespan: int = 0
@classmethod
def from_dict(cls, data):
d = cls(int(data['timestamp']),
data['dataset_name'],
int(data['reads']),
int(data['nread']),
int(data['writes']),
int(data['nwritten']),
int(data.get('nunlinks', 0)),
int(data.get('nunlinked', 0)),
int(data['timestamp']) / 1e9)
return d
def __add__(self, other):
"""
Return a new Dataset with this one's values added to other
Note timespan is set to the maximum of the two values instead of being
added.
"""
d = DataclassDataset(self.timestamp, self.dataset_name,
self.reads + other.reads,
self.nread + other.nread,
self.writes + other.writes,
self.nwritten + other.nwritten,
self.nunlinks + other.nunlinks,
self.nunlinked + other.nunlinked,
self.timespan + other.timespan )
return d
@dataclasses.dataclass(slots=True)
class DataclassSlotsDataset:
"""
ZFS dataset statistics over a timespan in seconds
The Dataset fields are "documented" here:
https://github.com/openzfs/zfs/blob/9681de4657686d0ed19ca18d578513e74395f00f/module/zfs/dataset_kstats.c#L32
"""
timestamp: int = 0
dataset_name: str = ''
writes: int = 0
nwritten: int = 0
reads: int = 0
nread: int = 0
nunlinks: int = 0
nunlinked: int = 0
timespan: int = 0
@classmethod
def from_dict(cls, data):
d = cls(int(data['timestamp']),
data['dataset_name'],
int(data['reads']),
int(data['nread']),
int(data['writes']),
int(data['nwritten']),
int(data.get('nunlinks', 0)),
int(data.get('nunlinked', 0)),
int(data['timestamp']) / 1e9)
return d
def __add__(self, other):
"""
Return a new Dataset with this one's values added to other
Note timespan is set to the maximum of the two values instead of being
added.
"""
d = DataclassSlotsDataset(self.timestamp, self.dataset_name,
self.reads + other.reads,
self.nread + other.nread,
self.writes + other.writes,
self.nwritten + other.nwritten,
self.nunlinks + other.nunlinks,
self.nunlinked + other.nunlinked,
self.timespan + other.timespan )
return d
def main():
data = {i:idx for idx,i in enumerate(SlotsDataset.__slots__)}
data["dataset_name"] = "test"
data["timestamp"] = datetime.datetime.now().timestamp()
del data["name"]
del data["timespan"]
cnt = 1000000
print("create")
print(timeit.timeit(lambda: Dataset.from_dict(data), number=cnt))
print(timeit.timeit(lambda: OptimizedDataset.from_dict(data), number=cnt))
print(timeit.timeit(lambda: SlotsDataset.from_dict(data), number=cnt))
print(timeit.timeit(lambda: DataclassDataset.from_dict(data), number=cnt))
print(timeit.timeit(lambda: DataclassSlotsDataset.from_dict(data), number=cnt))
print("add")
v = Dataset.from_dict(data)
print(timeit.timeit(lambda: v+v, number=cnt))
v = OptimizedDataset.from_dict(data)
print(timeit.timeit(lambda: v+v, number=cnt))
v = SlotsDataset.from_dict(data)
print(timeit.timeit(lambda: v+v, number=cnt))
v = DataclassDataset.from_dict(data)
print(timeit.timeit(lambda: v+v, number=cnt))
v = DataclassSlotsDataset.from_dict(data)
print(timeit.timeit(lambda: v+v, number=cnt))
if __name__ == "__main__":
main()
Dataset OptimizedDataset (no copy.copy, no class parameters) SlotsDataset DataclassDataset DatalcassSlotsDataset
create
0.8905068870517425
0.8984799759928137
0.8242627109866589
1.0172464569914155
0.9374575979891233
add
2.3710521850152873
0.7287906949641183
0.629768434970174
0.9442946459748782
0.5963006669771858
python 3.10 dataclasses with slots=True outperforms the alternatives when doing the first __add__
__slots__ is second, normal classes third.
I think order (combined init & add) would be along:
- DataclassSlotsDataset
- SlotsDataset
- OptimizedDataset
- DataclassDataset
- Dataset
Our Python MSV is 3.7, so Dataclass slots is a non-starter for now.
These long positional argument lists are quite nasty - they're begging for an easy-to-miss mixup and I'd rather avoid them.
I added versions with keyword arguments, plus a version that avoids copy.copy by direct field access, and that plus __slots__. I tested them across the range of supported versions:
3.7
create
Baseline 4.779117083991878
NoCopy 4.819048156961799
NoCopy/Slots 4.610547592048533
Optimized 4.653545118984766
Optimized/KW 5.483340958948247
Slots 4.283310921047814
Slots/KW 4.905152444029227
Dataclass 4.807075573946349
Dataclass/KW 5.597749298904091
add
Baseline 12.492388554033823
NoCopy 3.6168077409965917
NoCopy/Slots 3.1257432530401275
Optimized 3.5399633079068735
Optimized/KW 4.302096275961958
Slots 2.7610619330080226
Slots/KW 3.4788211649283767
Dataclass 3.5082433730131015
Dataclass/KW 4.293326015933417
3.9
create
Baseline 3.406673535006121
NoCopy 3.4032381870783865
NoCopy/Slots 3.3954955079825595
Optimized 3.4104782280046493
Optimized/KW 4.28153379203286
Slots 2.9133199750212952
Slots/KW 3.9084081649780273
Dataclass 3.4400185080012307
Dataclass/KW 4.317347781965509
add
Baseline 11.0603123200126
NoCopy 3.367319536046125
NoCopy/Slots 3.115666083060205
Optimized 3.4051882980857044
Optimized/KW 4.158022553892806
Slots 2.748979698983021
Slots/KW 3.56267236196436
Dataclass 3.41977860708721
Dataclass/KW 4.2807325079338625
3.10
create
Baseline 3.440365262911655
NoCopy 3.4241489080013707
NoCopy/Slots 3.343199662049301
Optimized 3.438087990041822
Optimized/KW 4.328920533065684
Slots 2.9351572120795026
Slots/KW 3.88322276098188
Dataclass 3.4257506920257583
Dataclass/KW 4.289895735913888
add
Baseline 11.33392396196723
NoCopy 3.821642320952378
NoCopy/Slots 2.6333876519929618
Optimized 3.0709942990215495
Optimized/KW 4.024556623073295
Slots 2.2230605570366606
Slots/KW 3.119670856045559
Dataclass 3.8814144630450755
Dataclass/KW 4.832975470926613
3.11
create
Baseline 2.230551384971477
NoCopy 2.233327505993657
NoCopy/Slots 2.2825138609623536
Optimized 2.2936561700189486
Optimized/KW 3.253970602992922
Slots 2.258040190907195
Slots/KW 3.156391634955071
Dataclass 2.301062963088043
Dataclass/KW 3.221311755012721
add
Baseline 10.089828453958035
NoCopy 1.2415083339437842
NoCopy/Slots 1.3665269260527566
Optimized 1.433199180988595
Optimized/KW 2.2370554810622707
Slots 1.4051770099904388
Slots/KW 2.24045382195618
Dataclass 1.3950953249586746
Dataclass/KW 2.2120107349473983
NoCopy/Slots looks like the best of the bunch. It's slightly slower than NoCopy on 3.11, but not by much, it beats all the keyword argument versions, and while some positional argument versions beat it, not by enough to justify themselves.
Also, damn my CPU sucks.
You've been competing with AMD Ryzen 5 PRO 4650U with Radeon Graphics.
I've updated the PR to use NoCopy/Slots.
CPU: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz (2133.53-MHz K8-class CPU)
They're teenagers now. I knew they sucked, but I didn't quite expect a similarly-clocked 15W laptop part to beat them by that margin.
https://www.cpubenchmark.net/compare/1983vs4272vs3547vs4827/Intel-Xeon-L5639-vs-AMD-Ryzen-9-5900-vs-AMD-EPYC-7742-vs-AMD-Ryzen-5-PRO-6650U
due to clock - the numbers for 7742 (ubuntu 23.04/py3.10) are 10-14% worse than my ryzen mobile:
create
Baseline 1.0522814639998614
NoCopy 1.0588048629997502
NoCopy/Slots 1.1171760289998929
Optimized 1.136160388999997
Optimized/KW 1.3660570930001086
Slots 1.0198019910003495
Slots/KW 1.2555289689998972
Dataclass 1.1098613029998887
Dataclass/KW 1.3511358109999492
add
Baseline 2.889791399999922
NoCopy 1.0617886190002537
NoCopy/Slots 0.9156696780000857
Optimized 0.9038795599999503
Optimized/KW 1.1309551809999903
Slots 0.7515657260000808
Slots/KW 0.9777399929998865
Dataclass 1.1047035159999723
Dataclass/KW 1.3352872320001552
And - there is a new revision available.
I've dropped kwargs from Dataset.init alltogether, they are not used and degrade performance.
def __init__(self, name, /): would even be better, but is python3.8+.
Looking like a 5-10% performance uplift on my hardware, plus a 0.5-2% drop in memory use.