A lot of threads and performance hits for executing a few notebooks simultaneously
When I run notebooks using nbclient.NotebookClient().execute(), each spawns a lot of threads (137), top/bpytop/htop show that all cores are at 100%, and ssh grows noticeably laggy when typing. This is on an AMD Ryzen Threadripper with 32 cores, running a maximum of eight simultaneous notebooks. None of the notebooks include any multiprocessing, threading, or async; it's a lot of work, but it's all numpy, scipy, etc. I AM writing to a file, if that matters (the process adds a logging.FileHandler to the root logger and pickles some output to a file).
I don't necessarily know much about how to troubleshoot this kind of a problem, so I'm just going to start by sharing assumptions I have that people with more experience can correct. I thought that:
- Non-interactive Python will only ever take up around 100% of CPU because it is single-threaded, and therefore can only execute on a single core.
- If some Python code would normally take up around 100% of CPU, running it as a notebook would add only a small amount of overhead on other cores.
- The overhead of executing via nbclient should be less than or equal to the overhead of running a notebook interactively, because nbclient is less (not at all) interactive.
I've seen a few Jupyter-themed questions about threads, but none that I found use nbclient. I'm just trying to capture the text and image output of a block of code to a formatted HTML file, for which I'm using nbconvert.exporters.HTMLExporter and nbconvert.writers.files.FilesWriter. If there's a different way to get that result using the Jupyter ecosystem, please let me know.
Each kernel uses several threads because of zmq sockets, but I don't know if that could explain what you are seeing.
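One thing that may be worth ruling out, separate from the zmq threads: numpy and scipy are often linked against a multithreaded BLAS/OpenMP backend, which can start roughly one worker thread per logical CPU even when the Python code itself is single-threaded. A minimal sketch of capping those pools through environment variables follows. The helper name capped_thread_env and the value 4 (32 cores divided by eight notebooks) are only illustrative assumptions, and whether these variables have any effect depends on which backend your numpy build actually uses.

```python
import os

# Environment variables read by common numerical backends (OpenMP, OpenBLAS, MKL).
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def capped_thread_env(n_threads, base_env=None):
    """Return a copy of the environment with the thread-pool variables set.

    Hypothetical helper for illustration; not part of nbclient or numpy.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name in THREAD_ENV_VARS:
        env[name] = str(n_threads)
    return env


# Apply in the launching process *before* the kernel starts (and before numpy
# is imported), so that a kernel subprocess inheriting this environment sees it.
os.environ.update(capped_thread_env(4))
```

If the thread count per kernel drops after this, the extra threads were most likely backend worker pools rather than anything nbclient created.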
Here's the output of pstree -pacts 3211769 (the process I originally started). nbclient is launched through a python entry point/console script mitosis, which checks that the repository is on a clean commit, then builds, runs, and converts to HTML a notebook based on the arguments.
It looks like NBClient launches a kernel, and ZMQ accounts for four threads of it. And the four more ZMQ threads in the original process are, perhaps, a server?
pstree -pacts 3211769
systemd,1 --system --deserialize 32 splash
└─mitosis,3211769,jmsh /home/jmsh/github/Kalman-SINDy-paper/env/bin/mitosis
├─git,3212420 cat-file --batch-check
├─python3,3212437 -m ipykernel_launcher -f /tmp/tmpb8q4zxd3.json--Hist
│ ├─{ZMQbg/IO/0},3212453
│ ├─{ZMQbg/IO/0},3212457
│ ├─{ZMQbg/Reaper},3212452
│ ├─{ZMQbg/Reaper},3212456
│ ├─{python3},3212454
│ ├─{python3},3212455
│ ├─{python3},3212458
│ ├─{python3},3212459
│ ├─{python3},3212460
│ ├─{python3},3212468
│ ├─{python3},3212497
│ ├─{python3},3212498
│ ├─{python3},3212499
│ ├─{python3},3212500
│ ├─{python3},3212501
│ ├─{python3},3212502
│ ├─{python3},3212503
│ ├─{python3},3212504
│ ├─{python3},3212505
│ ├─{python3},3212506
│ ├─{python3},3212507
│ ├─{python3},3212508
│ ├─{python3},3212509
│ ├─{python3},3212510
│ ├─{python3},3212511
│ ├─{python3},3212512
│ ├─{python3},3212513
│ ├─{python3},3212514
│ ├─{python3},3212515
│ ├─{python3},3212516
│ ├─{python3},3212517
│ ├─{python3},3212518
│ ├─{python3},3212519
│ ├─{python3},3212520
│ ├─{python3},3212521
│ ├─{python3},3212522
│ ├─{python3},3212523
│ ├─{python3},3212524
│ ├─{python3},3212525
│ ├─{python3},3212526
│ ├─{python3},3212527
│ ├─{python3},3212528
│ ├─{python3},3212529
│ ├─{python3},3212530
│ ├─{python3},3212531
│ ├─{python3},3212532
│ ├─{python3},3212533
│ ├─{python3},3212534
│ ├─{python3},3212535
│ ├─{python3},3212536
│ ├─{python3},3212537
│ ├─{python3},3212538
│ ├─{python3},3212539
│ ├─{python3},3212540
│ ├─{python3},3212541
│ ├─{python3},3212542
│ ├─{python3},3212543
│ ├─{python3},3212544
│ ├─{python3},3212545
│ ├─{python3},3212546
│ ├─{python3},3212547
│ ├─{python3},3212548
│ ├─{python3},3212549
│ ├─{python3},3212550
│ ├─{python3},3212551
│ ├─{python3},3212552
│ ├─{python3},3212553
│ ├─{python3},3212554
│ ├─{python3},3212555
│ ├─{python3},3212556
│ ├─{python3},3212557
│ ├─{python3},3212558
│ ├─{python3},3212559
│ ├─{python3},3212836
│ ├─{python3},3212837
│ ├─{python3},3212838
│ ├─{python3},3212839
│ ├─{python3},3212840
│ ├─{python3},3212841
│ ├─{python3},3212842
│ ├─{python3},3212843
│ ├─{python3},3212844
│ ├─{python3},3212845
│ ├─{python3},3212846
│ ├─{python3},3212847
│ ├─{python3},3212848
│ ├─{python3},3212849
│ ├─{python3},3212850
│ ├─{python3},3212851
│ ├─{python3},3212852
│ ├─{python3},3212853
│ ├─{python3},3212854
│ ├─{python3},3212855
│ ├─{python3},3212856
│ ├─{python3},3212857
│ ├─{python3},3212858
│ ├─{python3},3212859
│ ├─{python3},3212860
│ ├─{python3},3212861
│ ├─{python3},3212862
│ ├─{python3},3212863
│ ├─{python3},3212864
│ ├─{python3},3212865
│ ├─{python3},3212866
│ ├─{python3},3212867
│ ├─{python3},3212868
│ ├─{python3},3212869
│ ├─{python3},3212870
│ ├─{python3},3212871
│ ├─{python3},3212872
│ ├─{python3},3212873
│ ├─{python3},3212874
│ ├─{python3},3212875
│ ├─{python3},3212876
│ ├─{python3},3212877
│ ├─{python3},3212878
│ ├─{python3},3212879
│ ├─{python3},3212880
│ ├─{python3},3212881
│ ├─{python3},3212882
│ ├─{python3},3212883
│ ├─{python3},3212884
│ ├─{python3},3212885
│ ├─{python3},3212886
│ ├─{python3},3212887
│ ├─{python3},3212888
│ ├─{python3},3212889
│ ├─{python3},3212890
│ ├─{python3},3212891
│ ├─{python3},3212892
│ ├─{python3},3212893
│ ├─{python3},3212894
│ ├─{python3},3212895
│ ├─{python3},3212896
│ ├─{python3},3212897
│ └─{python3},3212898
├─{ZMQbg/IO/0},3212439
├─{ZMQbg/IO/0},3212442
├─{ZMQbg/Reaper},3212438
├─{ZMQbg/Reaper},3212441
├─{mitosis},3211774
├─{mitosis},3211777
├─{mitosis},3211778
├─{mitosis},3211781
├─{mitosis},3211782
├─{mitosis},3211785
├─{mitosis},3211787
├─{mitosis},3211789
├─{mitosis},3211790
├─{mitosis},3211793
├─{mitosis},3211795
├─{mitosis},3211797
├─{mitosis},3211799
├─{mitosis},3211800
├─{mitosis},3211802
├─{mitosis},3211804
├─{mitosis},3211806
├─{mitosis},3211808
├─{mitosis},3211810
├─{mitosis},3211811
├─{mitosis},3211813
├─{mitosis},3211815
├─{mitosis},3211817
├─{mitosis},3211819
├─{mitosis},3211822
├─{mitosis},3211825
├─{mitosis},3211827
├─{mitosis},3211830
├─{mitosis},3211833
├─{mitosis},3211837
├─{mitosis},3211840
├─{mitosis},3211842
├─{mitosis},3211845
├─{mitosis},3211851
├─{mitosis},3211852
├─{mitosis},3211854
├─{mitosis},3211860
├─{mitosis},3211861
├─{mitosis},3211863
├─{mitosis},3211865
├─{mitosis},3211870
├─{mitosis},3211872
├─{mitosis},3211875
├─{mitosis},3211880
├─{mitosis},3211881
├─{mitosis},3211883
├─{mitosis},3211886
├─{mitosis},3211889
├─{mitosis},3211892
├─{mitosis},3211894
├─{mitosis},3211897
├─{mitosis},3211898
├─{mitosis},3211902
├─{mitosis},3211905
├─{mitosis},3211907
├─{mitosis},3211909
├─{mitosis},3211913
├─{mitosis},3211915
├─{mitosis},3211916
├─{mitosis},3211918
├─{mitosis},3211919
├─{mitosis},3211960
├─{mitosis},3211965
├─{mitosis},3212101
├─{mitosis},3212102
├─{mitosis},3212103
├─{mitosis},3212104
├─{mitosis},3212105
├─{mitosis},3212106
├─{mitosis},3212107
├─{mitosis},3212108
├─{mitosis},3212109
├─{mitosis},3212110
├─{mitosis},3212111
├─{mitosis},3212112
├─{mitosis},3212113
├─{mitosis},3212114
├─{mitosis},3212115
├─{mitosis},3212116
├─{mitosis},3212117
├─{mitosis},3212118
├─{mitosis},3212119
├─{mitosis},3212120
├─{mitosis},3212121
├─{mitosis},3212122
├─{mitosis},3212123
├─{mitosis},3212124
├─{mitosis},3212125
├─{mitosis},3212126
├─{mitosis},3212127
├─{mitosis},3212128
├─{mitosis},3212129
├─{mitosis},3212130
├─{mitosis},3212131
├─{mitosis},3212132
├─{mitosis},3212133
├─{mitosis},3212134
├─{mitosis},3212135
├─{mitosis},3212136
├─{mitosis},3212137
├─{mitosis},3212138
├─{mitosis},3212139
├─{mitosis},3212140
├─{mitosis},3212141
├─{mitosis},3212142
├─{mitosis},3212143
├─{mitosis},3212144
├─{mitosis},3212145
├─{mitosis},3212146
├─{mitosis},3212147
├─{mitosis},3212148
├─{mitosis},3212149
├─{mitosis},3212150
├─{mitosis},3212151
├─{mitosis},3212152
├─{mitosis},3212153
├─{mitosis},3212154
├─{mitosis},3212155
├─{mitosis},3212156
├─{mitosis},3212157
├─{mitosis},3212158
├─{mitosis},3212159
├─{mitosis},3212160
├─{mitosis},3212161
├─{mitosis},3212162
├─{mitosis},3212163
└─{mitosis},3212443
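The thread counts in the pstree output above can also be checked from inside a notebook cell or from the launching script. This is a minimal sketch that assumes Linux (it reads /proc); native_thread_count is a hypothetical helper name. It also shows why threading.active_count() can look small while pstree shows many threads: it only counts threads started through Python's threading module, not native threads created by C libraries.

```python
import os
import threading
from pathlib import Path


def native_thread_count(pid=None):
    """Count OS-level threads of a process by listing /proc/<pid>/task.

    Returns None when /proc is not available (e.g. on non-Linux systems).
    """
    pid = os.getpid() if pid is None else pid
    task_dir = Path(f"/proc/{pid}/task")
    if not task_dir.is_dir():
        return None
    return sum(1 for _ in task_dir.iterdir())


# Threads known to the Python interpreter vs. threads known to the kernel.
python_threads = threading.active_count()
os_threads = native_thread_count()
print(f"Python-level threads: {python_threads}, OS-level threads: {os_threads}")
```

Running this once before and once after the heavy numpy/scipy imports would show which step creates the extra threads.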
For what it's worth, the code in the Jupyter notebook counts time with time.process_time(). When a single process is running, individual steps take O(1e3) seconds. When two processes are running, the steps each take O(1e4) seconds. I suppose I could add logging for threading and thread time.
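A note on the timing itself: time.process_time() reports the CPU time (user plus system) of the whole process, summed over all of its threads, so it can be much larger than the elapsed wall-clock time when many threads are busy or spinning. A minimal sketch for logging both clocks side by side follows; the timed context manager and the busy loop are placeholders I made up for illustration, not the notebook's actual code.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record wall-clock and CPU time for the enclosed block under `label`."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time of this process, all threads
    try:
        yield
    finally:
        results[label] = {
            "wall_s": time.perf_counter() - wall_start,
            "cpu_s": time.process_time() - cpu_start,
        }


results = {}
with timed("busy_loop", results):
    # Placeholder workload standing in for a real notebook step.
    total = sum(i * i for i in range(200_000))

print(results["busy_loop"])
```

If cpu_s is far above wall_s for a step, the step is using several cores at once; if both grow tenfold when a second notebook starts, the processes are probably contending for the same cores.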
Are you able to share the notebook or a minimal example?