[bug] Performance issues with nested `PacketField`s
Brief description
When `PacketField`s are nested over several levels, Scapy runs into exponential performance issues.
Scapy version
2.6.1
Python version
3.8.6
Operating system
Windows
Additional environment information
No response
How to reproduce
The following .uts test demonstrates the issue:
% Packet field performance issue
############
############
+ Test utils
= Prepare a `check_exec_time()` function.
import time

def check_exec_time(description, t0, tf, max):
    print(f"{description}: {tf - t0:.3f} seconds")
    assert tf - t0 <= max, f"{description!r} FAILED: {tf - t0:.3f} > {max:.3f}"
############
############
+ Test data
= Define the `F` final packet class.
class F(Packet):
    fields_desc = [
        ByteField(name="value", default=0),
    ]
    def extract_padding(self, s):
        return (b'', s)
= Define the `I` intermediate packet class.
class I(Packet):
    NUMBER_OF_F_PER_I = 100
    fields_desc = [
        PacketField(name=f"f{i}", pkt_cls=F, default=F(value=i))
        for i in range(NUMBER_OF_F_PER_I)
    ]
    def extract_padding(self, s):
        return (b'', s)
= Define the `M` main packet class.
class M(Packet):
    NUMBER_OF_I_PER_M = 100
    fields_desc = [
        PacketField(name=f"i{i}", pkt_cls=I, default=I())
        for i in range(NUMBER_OF_I_PER_M)
    ]
= Define `MAX_INSTANTIATION_TIME_EXPECTED`.
MAX_INSTANTIATION_TIME_EXPECTED = 1.0
= Define `MAX_SERIALIZATION_TIME_EXPECTED`.
MAX_SERIALIZATION_TIME_EXPECTED = 0.250
= Define `MAX_PARSING_TIME_EXPECTED`.
MAX_PARSING_TIME_EXPECTED = 0.250
############
############
+ Default instantiations
= Build a default instance of `F`.
t0 = time.time()
f = F()
tf = time.time()
check_exec_time("Default instantiation of `F`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)
= Build a default instance of `F` again.
t0 = time.time()
f = F()
tf = time.time()
check_exec_time("Default instantiation of `F`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)
= Build a default instance of `I`.
t0 = time.time()
i = I()
tf = time.time()
check_exec_time("Default instantiation of `I`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)
= Build a default instance of `I` again.
t0 = time.time()
i = I()
tf = time.time()
check_exec_time("Default instantiation of `I`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)
= Build a default instance of `M`.
t0 = time.time()
m = M()
tf = time.time()
check_exec_time("Default instantiation of `M`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)
= Build a default instance of `M` again.
t0 = time.time()
m = M()
tf = time.time()
check_exec_time("Default instantiation of `M`", t0, tf, MAX_INSTANTIATION_TIME_EXPECTED)
############
############
+ Serialization
= Serialize the latest instance of `M` created.
t0 = time.time()
buf = m.build()
tf = time.time()
check_exec_time("Serialization of `M`", t0, tf, MAX_SERIALIZATION_TIME_EXPECTED)
############
############
+ Parsing
= Parse the previously serialized buffer with the latest instance of `M` created.
t0 = time.time()
m.dissect(buf)
tf = time.time()
check_exec_time("Parsing of `M`", t0, tf, MAX_PARSING_TIME_EXPECTED)
Actual result
No response
Expected result
No response
Related resources
No response
Possibly related to #3894?
Analysis
Let N be the number of packet fields at a given stage, and let D be the depth, i.e. the number of stages of nested packet fields. The expected complexity for this kind of packet should be O(N^D); in the reproduction above, N = 100 and D = 2, so a single `M` carries N^D = 10,000 leaf `F` fields and serializes to a 10,000-byte buffer.
As far as I can tell, the slowdown comes from unnecessary copies of packets during initialization, parsing, and serialization.
Let K be the overhead factor induced by these unnecessary instantiations. It leads to the following complexity degradation: O((K·N)^(K·D)).
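To double-check where the time goes, a profiling run along the following lines can be used. This is only a minimal standalone sketch, not part of the fix: it re-declares the `F`/`I`/`M` classes from the reproduction test so the script is self-contained, and the sizes can be lowered if an unpatched run takes too long.

```python
import cProfile
import pstats

from scapy.fields import ByteField, PacketField
from scapy.packet import Packet


class F(Packet):
    fields_desc = [ByteField("value", 0)]

    def extract_padding(self, s):
        return b"", s


class I(Packet):
    # 100 nested F sub-packets per I, as in the .uts test above
    fields_desc = [PacketField(f"f{i}", F(value=i), F) for i in range(100)]

    def extract_padding(self, s):
        return b"", s


class M(Packet):
    # 100 nested I sub-packets per M, i.e. 10,000 leaf F fields in total
    fields_desc = [PacketField(f"i{i}", I(), I) for i in range(100)]


# Profile a single default instantiation + build of M.
profiler = cProfile.Profile()
profiler.enable()
M().build()
profiler.disable()

# If the analysis above is right, packet copy/initialization helpers should
# dominate the cumulative-time ranking.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)
```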
I have successfully fixed this in a Scapy wrapper, and I'll provide a PR soon to propose integrating the fix directly into the Scapy implementation.
Notes:
- This fix sets a cache on the packet that `build()` is called on.
- Consequently, #4706 should be taken into account together with this fix.
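To make the first note more concrete, here is a purely illustrative sketch; it is not the code from draft PR #4727, just a hypothetical `CachedBuildMixin` showing what caching the result of `build()` on the packet itself can look like, and why explicit cache invalidation becomes a concern once such a cache exists.

```python
from scapy.fields import ByteField
from scapy.packet import Packet


class CachedBuildMixin:
    """Hypothetical illustration only (NOT the PR #4727 code): memoize the
    result of build() on the packet until the cache is explicitly cleared."""

    def build(self):
        # The instance __dict__ is used directly because Packet relies on
        # __slots__ and custom attribute handling for its protocol fields.
        cached = self.__dict__.get("_build_cache")
        if cached is None:
            cached = super().build()
            self.__dict__["_build_cache"] = cached
        return cached

    def clear_build_cache(self):
        # Must be called whenever a field changes, otherwise stale bytes
        # would be returned by the next build().
        self.__dict__.pop("_build_cache", None)


class Example(CachedBuildMixin, Packet):
    fields_desc = [ByteField("value", 0)]


pkt = Example(value=1)
assert pkt.build() == b"\x01"   # first call builds and caches the bytes
pkt.value = 2
pkt.clear_build_cache()         # without this, the stale b"\x01" would be reused
assert pkt.build() == b"\x02"
```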
Thanks for the report! And thank you for the future PR.
@gpotter2 Draft PR #4727 is now open. Could you please send me your remarks on it, so that I can clean it up? Thanks.