New Server Setup
Good morning, friends!
We are working through some issues with the new servers. Nothing serious, but it's required ordering some extra parts/cables, so the delay will be a bit longer. I appreciate everyone's patience while we work through this. We're still getting the 40-gigabit fiber setup working, sorting out some power issues, and dealing with SFP connectors that don't fit in our current enclosure.
Hi!
Could you @NateBrady23 please share the specs of the new servers? My framework requires some manual tuning of its configuration for the best performance, and I'd like to do that upfront, if possible.
Hi! It would be good to show which frameworks work best without any changes!! That should be an enhancement for any framework!!
@NateBrady23 please do the first run on the new servers with the last full run's commit: [0ec8ed488ec87718eaee9ed05c0ffd51ca48113b](https://github.com/TechEmpower/FrameworkBenchmarks/tree/0ec8ed488ec87718eaee9ed05c0ffd51ca48113b)
And afterwards we should share the last run IDs from both servers.
:confused:
Please, we need more info.
We understand that you are busy, but please send us news!!
That should be an enhancement for any framework!!
In general I agree, but I prefer to tune things for extreme use cases, and benchmarking is definitely one such case. Users of my framework (myself included) are fine with tuning it for their specific production workloads, and if what you maintain hits its best numbers for any possible workload without even slight manual tuning, that's a thing to be really proud of, I think.
please do the first run on the new servers with the last full run's commit
I second this.
All machines are identical, with these specs:
Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
56 logical cores, 1 socket, 1 NUMA node
64 GB RAM
40 Gbit/s network
960 GB SSD
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 56
On-line CPU(s) list: 0-55
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
CPU family: 6
Model: 106
Thread(s) per core: 2
Core(s) per socket: 28
Socket(s): 1
Stepping: 6
CPU max MHz: 3100.0000
CPU min MHz: 800.0000
BogoMIPS: 4000.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 invpcid_single ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect wbnoinvd dtherm ida arat pln pts hwp hwp_act_window hwp_pkg_req avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid fsrm md_clear pconfig flush_l1d arch_capabilities
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 1.3 MiB (28 instances)
L1i: 896 KiB (28 instances)
L2: 35 MiB (28 instances)
L3: 42 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-55
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Mitigation; Clear CPU buffers; SMT vulnerable
Retbleed: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Srbds: Not affected
Tsx async abort: Not affected
Network
description: Ethernet interface
product: MT28908 Family [ConnectX-6]
vendor: Mellanox Technologies
physical id: 0
bus info: pci@0000:10:00.0
logical name: ens1f0np0
version: 00
capacity: 40Gbit/s
width: 64 bits
clock: 33MHz
capabilities: pciexpress vpd msix pm bus_master cap_list rom ethernet physical fibre 1000bt-fd 10000bt-fd 25000bt-fd 40000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=mlx5_core driverversion=5.15.0-73-generic duplex=full firmware=20.33.1048 (MT_0000000594) ip=10.0.0.121 latency=0 link=yes multicast=yes port=fibre
resources: irq:18 memory:b0000000-b1ffffff memory:b2000000-b20fffff
Mellanox!? Juicy!
Sounds great! While the faster network won't help with the majority of the tests (only the cached-queries and plaintext tests should see an improvement, and maybe fortunes, since it was doing around 5 Gb/s of network traffic, if I'm not mistaken), the doubling of the core count and the jump from the Skylake to the Ice Lake microarchitecture should (the latter shouldn't require Spectre mitigations that are as harsh, I believe).
56 physical cores
It is actually 28 cores and 56 threads, visible from the lscpu output.
It is actually 28 cores and 56 threads, visible from the lscpu output.
Right, my comment is wrong.
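For reference, the core/thread split is visible directly in the lscpu output quoted above:

```sh
# Extract the topology lines from lscpu:
lscpu | grep -E '^(Socket|Core|Thread)'
# Thread(s) per core: 2
# Core(s) per socket: 28
# Socket(s):          1
# => 1 socket x 28 physical cores x 2 SMT threads = 56 logical CPUs
```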
Even for a corporation, it is a pretty huge and unusual setup, especially the network part.
Only the SSD is a weird choice: a SATA drive for the database server? In 2024? Really?
Thanks for providing the update @sebastienros! Sorry this setup is taking so long. It's been a matter of ordering things and having people in the office at the right time to work on it. @msmith-techempower is doing some work with this today and I'm in on Thursday.
Just as a general update - I am really trying to get these up and working, but the going is slow given that I am not an IT professional by trade 😅. I know everyone, myself included, is anxious to get the continuous runs back up as soon as possible, and I don't want anyone thinking we are sitting on our hands.
Another update - we have gotten the machines mostly spun up and verified the 40 Gbps connections over fiber (using iperf as a baseline). We are still trying to get each machine connected to the internet (which has been a slog, but I think the hardware for it should be arriving today), but once that is done, we will start in on the software side of setup.
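For those curious, a sanity check along these lines might look like the following. This is a sketch: iperf3 is assumed, and the address is the app server's ConnectX-6 IP from the lshw output above.

```sh
# On the receiving machine (e.g. the app server at 10.0.0.121):
iperf3 -s

# On the sending machine: 8 parallel streams for 30 seconds,
# since a single TCP stream usually cannot saturate a 40 Gbit/s link.
iperf3 -c 10.0.0.121 -P 8 -t 30
```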
Thank you, everyone, for being so patient. I am seeing light at the end of this tunnel and hope to have runs started back up soon.
@NateBrady23 please run the first run with the new servers, with the last full run commit
I second this as I updated my benchmarks in the meantime and would love to see the impact independent from the hardware changes.
Looking forward to the new environment, keep up the good work!
I get that you guys are just about across the finish line. But I recommend updating the announcement banner at the top of https://tfb-status.techempower.com/ anyway. It's a one-liner in your website's HTML (aside from publishing the change). This will reassure the thousands of your site's followers and, regardless, "better late than never".
@joanhey @Kaliumhexacyanoferrat Yes, the first real run from the new servers will be with the last full run's commit. Great idea.
Pinging @msmith-techempower ^
We got the "final" parts in on Friday evening at the office. Mike, give us hope for Monday or Tuesday! 🙏
Hardware install complete and "flash point" tested. Everything appears to be working correctly, and one of our major concerns appears to be okay (issue with power draw). Tomorrow, I'll be getting the software environments up and running and HOPEFULLY (not promising anything - yes, you Nate) get the parity commit run started. I am sure there will be more to fix/hone/etc. in the coming week or two, but we are slowly getting the new environment on its feet.
Again, thank you all for your continued patience!
What version of Ubuntu are you using? 24.04 is almost there...
February 29, 2024 – Feature Freeze
March 21, 2024 – User Interface Freeze
April 4, 2024 – Ubuntu 24.04 Beta
April 11, 2024 – Kernel Freeze
April 25, 2024 – Ubuntu 24.04 LTS Released
We have 22.04 at the moment, but it may end up prudent to move to 24.04 when it's released, since it's LTS.
Are you using the regular kernel or the Hardware Enablement (HWE) one, as I suggested here? Using the HWE kernel essentially eliminates the need to move to Ubuntu 24.04 (when it is out) until possibly early 2025 because it would be updated to the same release as the one that 24.04 is based on, and IMHO the differences due to other software components amount to a rounding error. The switch to the HWE is done with a simple command and a reboot.
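For anyone following along, that switch on Ubuntu 22.04 is done with Ubuntu's documented HWE metapackage:

```sh
# Install the Hardware Enablement (HWE) kernel stack on Ubuntu 22.04,
# then reboot into the new kernel.
sudo apt install --install-recommends linux-generic-hwe-22.04
sudo reboot
```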
HWE
HOWDY! Okay, I believe that we have a run started. So far, nothing seems out of the ordinary, so we will see how it plays out over the next few days.
In the meantime, please be aware that this is a first attempt, and there are sure to be issues that creep up. Please report those issues here, and we will trudge on!
Again, thank you for your continued patience!
Same run with commit https://github.com/TechEmpower/FrameworkBenchmarks/tree/625684fcc442767af013de2dfd1fc90dd73f1744 That is the code and data in Round 22.
Old servers https://tfb-status.techempower.com/results/66d86090-b6d0-46b3-9752-5aa4913b2e33
New servers ~https://tfb-status.techempower.com/results/1aefa081-5641-4e7a-a712-e85c4bf3a4e1~ https://tfb-status.techempower.com/results/cdec9eaf-19ea-48d2-bfa4-df15afbe3236
About the kernels: the latest Ubuntu 22.04.4 (February 2024) moved the HWE kernel to 6.5 (from 5.15) https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle We didn't see this change!!
The new Ubuntu 24.04 comes with kernel 6.8, and the next point release, Ubuntu 22.04.5, will also come with 6.8 (after 24.04 is out).
Network-related: Linux 6.8 includes networking improvements that provide better cache efficiency. This is said to improve "TCP performance with many concurrent connections up to 40%" – a sizeable uplift, though to what degree most users will benefit is unclear.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3e7aeb78ab01
We want it, but we will have to verify it!!
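A quick way to confirm which kernel the servers actually ended up on (a trivial check):

```sh
# Report the running kernel release; 5.15.x indicates the 22.04 GA kernel,
# while 6.x indicates an HWE (or newer) kernel.
uname -r
```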
The current run is stuck!!
Yes, the page has not been refreshed since yesterday:
last updated 2024-03-27 at 4:02 PM
https://tfb-status.techempower.com/
Confirmed - I am looking into it now. Appears to have been a thermal issue on the primary machine. About 4 hours (I think) into the run the machine shut itself down.
OK, things are back up and running, and we're still monitoring.
Just so you guys know, all of us at TechEmpower get an email when the Citrine environment stops getting updates. You don't have to add to the thread or open issues when it crashes; it may happen a few more times. But we appreciate everyone's enthusiasm!
OKAY.
Little update. TechEmpower is located in a small office and we do not have a dedicated server rack any longer - we bought a small rack that has insulation (it's very loud), but that resulted in the switch being too close to the app server... and it produces a TON of heat which, in turn, tripped the heat sensor on the intake of the machine, which fired off a safety shutdown.
I fiddled with a bunch of setups, but what seems to be working at the moment is having the switch powered down and plugging in the fiber directly. So, App is connected to Database on 10.0.0.x, and App is connected to Client on 10.0.1.x. I tested this setup with iperf as I did with the switch and saw no appreciable difference in throughput, so I am hoping this is a fair way to test. VERY OPEN TO COMMENT HERE!
Anyway, the current run has benchmarked a couple, I am monitoring temperature (among other stats) while it is running, and hopefully we will be okay moving forward.
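For reference, a minimal sketch of what such switchless, point-to-point addressing could look like. The first interface name (ens1f0np0) comes from the lshw output above; the second port name and the exact host addresses are assumptions for illustration:

```sh
# App <-> Database direct fiber link (10.0.0.x); ens1f0np0 per lshw above
sudo ip addr add 10.0.0.120/24 dev ens1f0np0
sudo ip link set ens1f0np0 up

# App <-> Client direct fiber link (10.0.1.x); second port name is a guess
sudo ip addr add 10.0.1.120/24 dev ens1f1np1
sudo ip link set ens1f1np1 up
```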
Have no fear, the continuous run is still going on and everything looks healthy! Just an issue with tfb-status receiving updates. Should be fixed shortly.
FYI: The parity run we're doing is with Round 22 https://tfb-status.techempower.com/results/66d86090-b6d0-46b3-9752-5aa4913b2e33
I'll be out early next week; when this run completes, it will automatically start a new run from the current state of the repo.
Impressive numbers!! We'll need some time to analyze them.
I think it will be good to create a Round 22N, so regular visitors can see the difference.
Also, it will be better to compare with Round 23.