dd-trace-py
Is DD_PROFILING_ALLOC_EVENT_BUFFER too small?
Summary of problem
Is DD_PROFILING_ALLOC_EVENT_BUFFER too small?
Which version of dd-trace-py are you using?
I use Datadog to collect allocation samples.
I find the output is very different when I set DD_PROFILING_ALLOC_EVENT_BUFFER to 10000.
Which libraries and their versions are you using?
ddtrace-run -v
ddtrace-run 2.5.0
attrs==22.2.0
bytecode==0.13.0
cattrs==1.0.0
chardet==3.0.4
cloudpickle==2.2.1
commonmark==0.9.1
conda==4.6.14
dataclasses==0.8
ddsketch==2.0.4
ddtrace==1.20.16
distro==1.5.0
envier==0.4.0
idna==2.10
importlib-metadata==4.8.3
ipaddress==1.0.23
numpy==1.19.5
pep562==1.1
Pillow==8.4.0
protobuf==3.19.6
pycosat==0.6.3
pycrypto==2.6.1
Pygments==2.14.0
PySocks==1.6.8
PyYAML==3.13
requests==2.14.2
rich==12.6.0
ruamel.yaml==0.13.14
six==1.16.0
typing_extensions==4.1.1
urllib3==1.25.6
xmltodict==0.13.0
zipp==3.6.0
How can we reproduce your problem?
code:
#from ddtrace.profiling import Profiler
#prof = Profiler(
# url="http://127.0.0.1:8081",
#enable_code_provenance=True, # provide library version
#)
#prof.start() # Should be as early as possible, eg before other imports, to ensure everything is profiled
import os
import PIL
from PIL import Image
import time
dir = "/home/centos/pillow-bench/Images_0/0"
onlyfiles = []
filedir = dir
files = [f for f in os.listdir(filedir) if os.path.isfile(os.path.join(filedir, f))]
filepaths = map( lambda x: os.path.join(filedir, x), files )
onlyfiles += list(filepaths)
images = []
s = 0
for i in range(len(onlyfiles)):
    fname = onlyfiles[i]
    image = PIL.Image.open(fname)
    # Do something
    images.append(image)
    image.tobytes("xbm", "rgb")
    if i / 10000 == 1:
        print('iteration ', i)
        break
#time.sleep(30)
#input("Press Enter to continue...")
first run:
DD_PROFILING_ENABLED=true DD_PROFILING_STACK_ENABLED=false DD_PROFILING_LOCK_ENABLED=false DD_PROFILING_CAPTURE_PCT=100 DD_PROFILING_OUTPUT_PPROF=h.pprof DD_PROFILING_MEMORY_EVENTS_BUFFER=16 ddtrace-run python3 benchmakr.py
second run:
DD_PROFILING_ENABLED=true DD_PROFILING_STACK_ENABLED=false DD_PROFILING_LOCK_ENABLED=false DD_PROFILING_CAPTURE_PCT=100 DD_PROFILING_OUTPUT_PPROF=h.pprof DD_PROFILING_MEMORY_EVENTS_BUFFER=10000 ddtrace-run python3 benchmakr.py
What is the result that you get?
when buffer is 16: (pprof screenshots of the alloc-space, alloc-samples, and heap-space profiles)
when buffer is 10000: (pprof screenshots of the alloc-space, alloc-samples, and heap-space profiles)
What is the result that you expected?
I noticed that the allocation samples are cleared at every upload interval (default 0.5 s), and if more allocations happen in an interval than the buffer can hold, they are randomly sampled.
With a small allocation buffer, the reported allocation space is therefore not accurate.
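As an illustration only (this is a minimal sketch of the random-sampling idea, not ddtrace's actual implementation; BUFFER_SIZE and record_allocation are made-up names):
import random

BUFFER_SIZE = 16   # stands in for the allocation event buffer setting
buffer = []        # events kept during the current 0.5 s interval
seen = 0           # total allocations observed during the interval

def record_allocation(event):
    # Reservoir sampling: once the buffer is full, each new allocation
    # has BUFFER_SIZE / seen odds of replacing a previously kept event.
    global seen
    seen += 1
    if len(buffer) < BUFFER_SIZE:
        buffer.append(event)
    else:
        j = random.randrange(seen)
        if j < BUFFER_SIZE:
            buffer[j] = event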
When I read the code, I found that the sampled event count and the sampled allocation space are calculated differently.
With a small event buffer we get a small capture_pct.
In the end, each sampled allocation's size is scaled as event.size * alloc_count / count, and alloc_count is the same either way, so the smaller count is, the larger the reported heap size. If every event.size were identical, the final heap size would still be correct, but if the sampled events happen to be large, the final heap size is overestimated.
The reported allocation sample count simply equals count, so a larger buffer reports more samples.
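With made-up numbers, here is how that scaling can overestimate the heap size when the few events a tiny buffer keeps happen to be large:
alloc_count = 100_000          # real number of allocations in the interval
sampled = [4096, 8192, 16384]  # sizes of the few events a tiny buffer happened to keep
count = len(sampled)

estimated = sum(size * alloc_count / count for size in sampled)
print(estimated)  # ~956 MB (about 911 MiB), even if most real allocations were tiny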
The profiler's default recorder max events is 16384, so alloc_buffer_size / 0.5 s * 60 s = 16384, which means alloc_buffer_size can only safely be set to about 128.
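Working through that arithmetic (the variable names are only for illustration):
max_events = 16384       # default recorder max events
flush_interval = 0.5     # seconds between allocation buffer flushes
upload_period = 60.0     # seconds of samples the recorder has to hold

print(max_events * flush_interval / upload_period)  # ~136.5, so 128 is the safe power of two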
I also noticed that all events in the same allocation array carry the same count, even when they have different stack traces. Because they all share that count, the final reported count ends up being a multiple of the real value.
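A made-up illustration of that effect: if every sampled event carries the interval's total allocation count, summing per stack trace multiplies the real total:
events = [
    {"stack": "JpegImagePlugin.py:273", "count": 10_000},
    {"stack": "posixpath:88",           "count": 10_000},
    {"stack": "benchmakr.py:20",        "count": 10_000},
]
print(sum(e["count"] for e in events))  # 30000 reported, although only 10000 allocations happened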
The actual allocation counts and bytes measured with tracemalloc:
python3 benchmakr.py
iteration 10000
[ Top 10 ]
/root/miniconda3/lib/python3.11/site-packages/PIL/JpegImagePlugin.py:273: size=12.3 MiB, count=49959, average=259 B
<frozen posixpath>:88: size=10.0 MiB, count=108131, average=97 B
/home/centos/pillow-bench/benchmakr.py:20: size=7241 KiB, count=108132, average=69 B
/root/miniconda3/lib/python3.11/site-packages/PIL/JpegImagePlugin.py:840: size=2969 KiB, count=20002, average=152 B
/root/miniconda3/lib/python3.11/site-packages/PIL/JpegImagePlugin.py:250: size=2419 KiB, count=39957, average=62 B
/root/miniconda3/lib/python3.11/site-packages/PIL/JpegImagePlugin.py:77: size=1171 KiB, count=9992, average=120 B
/root/miniconda3/lib/python3.11/site-packages/PIL/JpegImagePlugin.py:72: size=1168 KiB, count=9963, average=120 B
/root/miniconda3/lib/python3.11/site-packages/PIL/JpegImagePlugin.py:73: size=859 KiB, count=20002, average=44 B
/home/centos/pillow-bench/benchmakr.py:22: size=845 KiB, count=1, average=845 KiB
<frozen importlib._bootstrap_external>:729: size=699 KiB, count=6420, average=111 B
Total allocation count is 537063
code:
#from ddtrace.profiling import Profiler
#prof = Profiler(
# url="http://127.0.0.1:8081",
#enable_code_provenance=True, # provide library version
#)
#prof.start() # Should be as early as possible, eg before other imports, to ensure everything is profiled
import os
import PIL
from PIL import Image
import time
dir = "/home/centos/pillow-bench/Images_0/0"
onlyfiles = []
import tracemalloc
tracemalloc.start()
filedir = dir
files = [f for f in os.listdir(filedir) if os.path.isfile(os.path.join(filedir, f))]
filepaths = map( lambda x: os.path.join(filedir, x), files )
onlyfiles += list(filepaths)
images = []
s = 0
for i in range(len(onlyfiles)):
    fname = onlyfiles[i]
    #print(fname)
    image = PIL.Image.open(fname)
    # Do something
    images.append(image)
    image.tobytes("xbm", "rgb")
    if i / 10000 == 1:
        print('iteration ', i)
        break
#time.sleep(30)
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
print("[ Top 10 ]")
for stat in top_stats[:10]:
    print(stat)
total_count = sum(stat.count for stat in top_stats)
print(f"Total allocation count is {total_count}")
Thanks for the report, @zdyj3170101136. Do you think it's possible that your application requires a larger buffer than the default? Maybe there's some documentation we could change to clarify this point?
cc @sanchda
Sure. Maybe the event buffer could be sized dynamically from the recorder buffer size and the flush interval.
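A rough sketch of that suggestion, using a hypothetical helper rather than any existing ddtrace API:
def suggested_alloc_buffer(max_events=16384, flush_interval=0.5, upload_period=60.0):
    # Largest buffer that, flushed every flush_interval, still fits into
    # max_events recorder slots over one upload period.
    return int(max_events * flush_interval / upload_period)

print(suggested_alloc_buffer())  # 136, which rounds down to 128 as a power of two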
This issue has been automatically closed after a period of inactivity. If it's a feature request, it has been added to the maintainers' internal backlog and will be included in an upcoming round of feature prioritization. Please comment or reopen if you think this issue was closed in error.