perl5
Return WIN32_NO_SOCKETS for miniperl.exe
https://github.com/Perl/perl5/commit/8a548d15292f2166cb07a69fc5fc943391b7fba5
The optimization for miniperl.exe was removed; build speed is important for new code. Bring back the macro for miniperl.exe only. Measurable time savings for me. See the commit message.
Unsure, maybe; it's a small build perf optimization.
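A minimal sketch of the kind of compile-time switch described here, not the actual patch (see the commit for that). It assumes the build passes `PERL_IS_MINIPERL` when compiling miniperl; every `sketch_`/`SKETCH_` name is hypothetical and only illustrates how a `WIN32_NO_SOCKETS` macro can compile socket startup down to nothing.

```c
/* Illustration only: pretend this file is being compiled for miniperl.
 * In a real build the define would come from the compiler command line. */
#ifndef PERL_IS_MINIPERL
#  define PERL_IS_MINIPERL
#endif

/* miniperl never opens a socket, so switch the socket layer off for it. */
#ifdef PERL_IS_MINIPERL
#  define WIN32_NO_SOCKETS
#endif

/* Hypothetical counter: how many times the (expensive) startup ran. */
static int sketch_startup_calls = 0;

#ifdef WIN32_NO_SOCKETS
/* miniperl: the startup hook expands to a no-op. */
#  define SKETCH_START_SOCKETS() ((void)0)
#else
/* full perl: stands in for the real WSAStartup() call. */
static void sketch_start_sockets(void) { sketch_startup_calls++; }
#  define SKETCH_START_SOCKETS() sketch_start_sockets()
#endif

/* Hypothetical caller that would need winsock in full perl.
 * Returns how many times startup has run so far. */
static int sketch_open_socket(void)
{
    SKETCH_START_SOCKETS();
    return sketch_startup_calls;
}
```

With the define in place, `sketch_open_socket()` never touches the startup path, which is the whole point for a binary that only exists at build time.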
- This set of changes requires a perldelta entry, and it is included.
- This set of changes requires a perldelta entry, and I need help writing it.
- This set of changes does not require a perldelta entry.
Assuming 200 miniperl invocations in a build, it saves 0.22 seconds while making the code more complex.
The delayed sockets initialization is/was thread unsafe from what I can see.
I don't think it's worth the added complexity.
Assuming 200 miniperl invocations in a build, it saves 0.22 seconds while making the code more complex.
miniperl doesn't need access to winsock at any point; it never goes on the network and it has no use outside of blead/CC libperl build time. At the time of the first revision of this PR/patch/branch, this was patch 1 of 2, or 1 of 3, to bring back all of the delayed winsock features. But I plan a different way of bringing back the "no winsock in miniperl, delay winsock in full perl" feature than the way in revision 1.
The delayed sockets initialization is/was thread unsafe from what I can see.
The C function WSAStartup() inside ws2_32.dll was made by MS to be 100% thread race/reentry proof. There is an InterlockedCompareExchange() on a 0 or 1 global var, followed by an InitializeCriticalSection(), then an EnterCriticalSection(). Calling it twice, or 2 threads colliding on 2 cores, was recognized by MS from day 1.
The case of DLLs from 2 unrelated 3rd-party authors being loaded into one Win32 process, with both DLLs executing WSAStartup() multiple times because they are unaware of each other, probably happens constantly in normal production code and normal Win32 GUI or TUI apps. MS knows this and did protect against it inside ws2_32.dll.
If 2 perl ithreads both execute WSAStartup(), nothing bad happens. Nobody ever reported a bug during the many years when the winsock delay feature worked.
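The compare-exchange guard described above can be modelled roughly like this. It is a sketch of the general pattern, not ws2_32.dll's code (which is closed source): C11 atomics stand in for InterlockedCompareExchange(), a spin wait stands in for the critical section, and all `sketch_` names are made up.

```c
#include <stdatomic.h>

enum { SKETCH_UNINIT = 0, SKETCH_BUSY = 1, SKETCH_READY = 2 };

static atomic_int sketch_state = SKETCH_UNINIT;
static int sketch_init_runs = 0;   /* how often the expensive init really ran */

/* Stands in for the one-time work a socket library does on first startup. */
static void sketch_expensive_init(void) { sketch_init_runs++; }

/* Safe to call any number of times, from any number of threads. */
static void sketch_ensure_started(void)
{
    int expected = SKETCH_UNINIT;

    /* Exactly one caller wins the UNINIT -> BUSY transition. */
    if (atomic_compare_exchange_strong(&sketch_state, &expected, SKETCH_BUSY)) {
        sketch_expensive_init();
        atomic_store(&sketch_state, SKETCH_READY);
        return;
    }

    /* Everyone else waits until the winner has finished.
     * A real implementation would block on a lock instead of spinning. */
    while (atomic_load(&sketch_state) != SKETCH_READY) {
        /* spin */
    }
}
```

Two threads racing into `sketch_ensure_started()` still run the init once, which is the property being claimed for WSAStartup().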
I don't think it's worth the added complexity.
I very strongly disagree. ws2_32.dll ALWAYS loads msvcrt.dll into perl.exe's virtual address space, but the perl/perl XS ecosystem uses ucrtbase.dll nowadays. And ws2_32.dll's minimum runtime overhead, meaning its malloc memory and its couple hundred, maybe 1000 at the upper end, Ring 0 kernel calls to do DeviceIoControl() calls to its parent kernel driver afd.sys and to enumerate a ton of data out of the Windows registry, is totally unnecessary for most perl processes.
Also rpcrt4.dll and nsi.dll are statically linked, plus an "IP helper" DLL (I forgot its real name, but IP helper is required to enumerate NICs from user space and find a wired or WiFi NDIS NIC to actually open as an object for sockets to work). All 3 absolutely don't need to sit inside perl.exe's address space 24/7, be loaded and relocated, run their DllMains, and suck in a bunch of external state they need to operate.
These extra DLLs also increase the VS IDE debugger's process start/attach/SEGV debugger attach time by many seconds in the UI for me, because what was 3-4 DLLs 5 years ago is now 18, EIGHTEEN, DLLs inside perl -e"sleep 200;". All these extra DLLs also make various C developer debug and diag tools more difficult to use, because there is more noise in the final output of every C .pdb-level/asm-level automatic report/log/hook trace log/snapshot tool that someone wants to use to accomplish a task.
Something I used to be able to do, which I can't do anymore in blead perl, is set a BP on NtAllocateVirtualMemory. The peak temporary or momentary malloc/HeapAlloc usage of perl.exe on startup is so high now that NtAllocateVirtualMemory rarely if ever executes again in the rest of the lifetime of the perl process from inside the main runloop (Perl_runops_standard()), because HeapAlloc() has enough free user-mode R/W 4KB pages to last for the rest of the process; it doesn't need to go back to the kernel and get another unit of 4KB or a unit of 64KB.
Winsock loading msvcrt.dll into the address space and creating more HeapCreate() objects also makes the C dev user experience more noisy and difficult.
Delay loading of winsock was the best speed improvement for core's own make test, and for perl Makefile.PL and the EUMM CPAN toolchain's gmake test, that I ever implemented or reimplemented for WinPerl. perl.exe's process startup time is super important for all devs who work with perl, because short-lifespan perl processes are used everywhere in the perl ecosystem, and these short-lifespan perl processes always have a real-time human developer watching their UI/STDOUT.
The startup time of nginx/LiteSpeed doesn't affect human developer time. Running .t files and running EUMM does affect human developer time. Going from 5 seconds to 4 seconds, 20-100 times a day, adds up to minutes; then consider those couple of minutes a day times all perl devs/users on earth.
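To put numbers on that, here is the arithmetic as a tiny helper. The function name is hypothetical; the inputs are the 5 second and 4 second figures and the 20-100 runs per day quoted above.

```c
/* Seconds of developer wall-clock time saved per day when one run
 * drops from before_s to after_s and is repeated runs_per_day times. */
static double sketch_saved_seconds_per_day(double before_s, double after_s,
                                           int runs_per_day)
{
    return (before_s - after_s) * (double)runs_per_day;
}
```

For the figures above this gives 20 seconds per day at 20 runs and 100 seconds per day at 100 runs, per developer.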
I plan to reimplement the delay winsock loading feature in a totally different way than the previous way, without 20-45 StartSocket() tokens all over the win32-only code base. I do recognize WSAStartup() needs to execute before the first socket FD/object is created. It's irrelevant what winsock actually does on startup and when exactly it runs its 1x startup logic, but that 1x startup logic has huge overhead if a process will never create a socket or touch the winsock DLL again for the rest of the process lifespan.
Technically MS has a choice between DllMain and WSAStartup for 1x startup logic, but it's irrelevant what is done where, since nobody can recompile or publish a MS-made system DLL. But the theoretical things winsock must do 1x on startup are, at minimum: enumerate the PCIe NICs and ethernet frame protocol handlers from the registry, register the current PID with afd.sys, set up its own private TlsGetValue/TlsSetValue slot, go digging through the address space to see if it can find a user32.dll, and create its invisible GUI window object/mess around with the Win32 GUI message event loop system.
I don't think winsock does this in real life, but perhaps it also needs to talk to csrss.exe as part of its 1x startup code, using that rpcrt4.dll it loads. But again, the details don't matter since it's a MS-compiled DLL.
Not loading the winsock DLL unless the process is going to communicate over the network fixes all these problems instantly.
Why it's a bad decision for WinPerl to unconditionally have the Winsock library always loaded and running in a WinPerl process: Winsock is linked at load time to the "secret" msvcrt.dll CRT. One side effect is that msvcrt.dll registers a bunch of its own C89/C99/C++ global object destructor methods with ntdll.dll. Regardless of how the user-mode process tries to exit itself, either by ExitProcess() or UCRT exit(), those msvcrt.dll global object destructor methods WILL be fired by the WinNT kernel in Ring 3 - 0.01 or Ring 3 - 1.99, depending on your personal opinion of what MS's ntdll.dll is and does in the SW stack.
I have no easy non-ASM way to benchmark the wall clock time cost of these destructors. It's probably 500 us to 1.5 ms max, maybe 5-10 ms on ancient HW. But there has to be a cost to that 1 millisecond, such as cpan.pl or make test sleeping on a pipe 2-3 milliseconds longer during waitpid() per child Perl process.
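The comment does not say how the reimplementation will work, so the following is only one common way to get "startup runs before the first socket, and never otherwise" without sprinkling a startup call over every call site: route the socket entry point through a function pointer whose first target does the one-time startup and then rebinds itself. All names are hypothetical, the startup is faked with a counter, and the pointer swap as written is not thread safe.

```c
typedef int (*sketch_socket_fn)(int domain);

static int sketch_startup_runs = 0;  /* how often the one-time startup ran */

/* Stands in for the real socket(); returns a fake handle. */
static int sketch_socket_real(int domain) { return domain + 100; }

static int sketch_socket_thunk(int domain);

/* Initially every call goes through the thunk. */
static sketch_socket_fn sketch_socket_ptr = sketch_socket_thunk;

/* First call only: do the expensive startup, then rebind the pointer so
 * later calls go straight to the real function. A real version would make
 * this swap atomic and would load the DLL and call WSAStartup() here. */
static int sketch_socket_thunk(int domain)
{
    sketch_startup_runs++;
    sketch_socket_ptr = sketch_socket_real;
    return sketch_socket_real(domain);
}

/* The single entry point the rest of the code base would call. */
static int sketch_socket(int domain) { return sketch_socket_ptr(domain); }
```

A process that never calls `sketch_socket()` never pays for startup, and one that does pays exactly once.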
> ntdll.dll!RtlEnterCriticalSection() Unknown
msvcrt.dll!_freefls() Unknown
ntdll.dll!RtlProcessFlsData() Unknown
ntdll.dll!LdrShutdownProcess() Unknown
ntdll.dll!RtlExitUserProcess() Unknown
ucrtbase.dll!exit_or_terminate_process() Unknown
ucrtbase.dll!common_exit() Unknown
miniperl.exe!sig_terminate(int sig) Line 2792 C
miniperl.exe!win32_ctrlhandler(unsigned long dwCtrlType) Line 5207 C
kernel32.dll!CtrlRoutine() Unknown
kernel32.dll!BaseThreadInitThunk() Unknown
ntdll.dll!RtlUserThreadStart() Unknown
update: how nice, UCRT and msvcrt.dll are both aware that WinPerl constantly does _wsetlocale() calls to UCRT.
More destructors that get fired from DLLs that are not very useful to a TUI WinPerl process. imm32.dll is Win32's "Input Method Editor" library: on-screen keyboards, ADA, etc. The reason it's loaded is user32.dll's fault, winsock's fault, or both.
> imm32.dll!ImmDllInitialize() Unknown
ntdll.dll!LdrShutdownProcess() Unknown
ntdll.dll!RtlExitUserProcess() Unknown
ucrtbase.dll!exit_or_terminate_process() Unknown
ucrtbase.dll!common_exit() Unknown
miniperl.exe!sig_terminate(int sig) Line 2792 C
miniperl.exe!win32_ctrlhandler(unsigned long dwCtrlType) Line 5207 C
kernel32.dll!CtrlRoutine() Unknown
kernel32.dll!BaseThreadInitThunk() Unknown
ntdll.dll!RtlUserThreadStart() Unknown
Another stack. I don't remember what this DLL does off the top of my head; I think it's an out-of-process RPC/message passing API between the consumer process and the on-screen keyboard producer process.
> ntdll.dll!RtlFreeHeap() Unknown
KernelBase.dll!LocalFree() Unknown
msctf.dll!ProcessDetach(struct HINSTANCE__ *) Unknown
msctf.dll!DllMain() Unknown
msctf.dll!_CRT_INIT() Unknown
ntdll.dll!LdrShutdownProcess() Unknown
ntdll.dll!RtlExitUserProcess() Unknown
ucrtbase.dll!exit_or_terminate_process() Unknown
ucrtbase.dll!common_exit() Unknown
miniperl.exe!sig_terminate(int sig) Line 2792 C
miniperl.exe!win32_ctrlhandler(unsigned long dwCtrlType) Line 5207 C
kernel32.dll!CtrlRoutine() Unknown
kernel32.dll!BaseThreadInitThunk() Unknown
ntdll.dll!RtlUserThreadStart() Unknown
> ntdll.dll!NtClose() Unknown
ntdll.dll!EtwpUnregisterProvider() Unknown
ntdll.dll!EtwNotificationUnregister() Unknown
ntdll.dll!EtwUnregisterTraceGuids() Unknown
msctf.dll!McGenEventUnregister() Unknown
msctf.dll!ProcessDetach(struct HINSTANCE__ *) Unknown
msctf.dll!DllMain() Unknown
msctf.dll!_CRT_INIT() Unknown
ntdll.dll!LdrShutdownProcess() Unknown
ntdll.dll!RtlExitUserProcess() Unknown
ucrtbase.dll!exit_or_terminate_process() Unknown
ucrtbase.dll!common_exit() Unknown
Jeez. Why bother manually nuking Ring 0 opaque handles on the process exit event? Perl has PERL_DESTRUCT_LEVEL, but that concept isn't a MS API design pattern.
> ucrtbase.dll!__crt_seh_guarded_call<void>::operator()<class <lambda_886d6c58226a84441f68b9f2b8217b83>,class <lambda_ab61a845afdef5b7c387490eaf3616ee> &,class <lambda_f7f22ab5edc0698d5f6905b0d3f44752> >(class <lambda_886d6c58226a84441f68b9f2b8217b83> &&,class <lambda_ab61a845afdef5b7c387490eaf3616ee> &,class <lambda_f7f22ab5edc0698d5f6905b0d3f44752> &&) Unknown
ucrtbase.dll!common_flush_all() Unknown
ucrtbase.dll!DllMainProcessDetach() Unknown
ntdll.dll!LdrShutdownProcess() Unknown
ntdll.dll!RtlExitUserProcess() Unknown
ucrtbase.dll!exit_or_terminate_process() Unknown
ucrtbase.dll!common_exit() Unknown
miniperl.exe!sig_terminate(int sig) Line 2792 C
miniperl.exe!win32_ctrlhandler(unsigned long dwCtrlType) Line 5207 C
kernel32.dll!CtrlRoutine() Unknown
kernel32.dll!BaseThreadInitThunk() Unknown
ntdll.dll!RtlUserThreadStart() Unknown
> comctl32.dll!_lock() Unknown
comctl32.dll!_fflush_nolock() Unknown
comctl32.dll!__endstdio() Unknown
comctl32.dll!__crtExitProcess() Unknown
comctl32.dll!_cinit() Unknown
comctl32.dll!__CRT_INIT() Unknown
comctl32.dll!_CRT_INIT() Unknown
ntdll.dll!LdrShutdownProcess() Unknown
ntdll.dll!RtlExitUserProcess() Unknown
ucrtbase.dll!exit_or_terminate_process() Unknown
ucrtbase.dll!common_exit() Unknown
miniperl.exe!sig_terminate(int sig) Line 2792 C
miniperl.exe!win32_ctrlhandler(unsigned long dwCtrlType) Line 5207 C
kernel32.dll!CtrlRoutine() Unknown
kernel32.dll!BaseThreadInitThunk() Unknown
ntdll.dll!RtlUserThreadStart() Unknown
Wow, Comctl32.dll has a private statically linked copy of the MS CRT/libc inside of it. I didn't know that until now. What is Comctl32.dll? Just search the RT/GH/ML archives for my dislike of it. Miniperl.exe loading it is a legit bug to fix on my todo list, since that binary is not capable of drawing a Win32 GUI widget whatsoever.