perl5

small .h trick for faster win32 interp compiles, cl.exe in the prompt is

Open bulk88 opened this issue 1 year ago • 8 comments

much faster eye. PERL_CORE only, since CPAN XS assumes that the built-in, out-of-the-box headers/tokens/structs/linker libs selected by p5p decades ago will never change.

bulk88 avatar Oct 06 '24 01:10 bulk88

What does "cl.exe in the prompt is much faster eye" mean?

mauke avatar Oct 06 '24 07:10 mauke

If anyone later does need any of the functionality you're leaving out now, can they do that?

Leont avatar Oct 06 '24 11:10 Leont

> What does "cl.exe in the prompt is much faster eye" mean?

I assume it's faster "by eye"

> If anyone later does need any of the functionality you're leaving out now, can they do that?

It only limits the APIs declared under PERL_CORE. If we (perl core) need APIs that this suppresses, we can remove the NO* definition this adds.

The change really needs the defines and inner pre-processor conditionals indented to make what's guarded by the conditionals obvious.

tonycoz avatar Oct 07 '24 23:10 tonycoz
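A minimal sketch of the technique being discussed, not the actual patch: under PERL_CORE, a handful of the NO* opt-out macros that windows.h documents are defined before the include, with the inner conditionals indented as requested. Which macros the real change defines is not shown in this thread, so the selection below (WIN32_LEAN_AND_MEAN, NOGDI, NOSERVICE, NOMCX) is an assumption, as is the helper function `core_trims_windows_h`, which exists only to make the sketch checkable.

```c
/* Normally set by the core build; defined here so the sketch stands alone. */
#define PERL_CORE 1

#ifdef PERL_CORE
#  ifndef WIN32_LEAN_AND_MEAN
#    define WIN32_LEAN_AND_MEAN  /* drop rarely used sub-headers */
#  endif
#  ifndef NOGDI
#    define NOGDI                /* no GDI defines and routines */
#  endif
#  ifndef NOSERVICE
#    define NOSERVICE            /* no Service Controller routines */
#  endif
#  ifndef NOMCX
#    define NOMCX                /* no Modem Configuration Extensions */
#  endif
#endif

/* The include is guarded so the sketch also compiles on non-Windows hosts. */
#ifdef _WIN32
#  include <windows.h>
#endif

/* Hypothetical helper: returns 1 when all four trimming macros are active
 * in this translation unit, 0 otherwise. */
int core_trims_windows_h(void)
{
#if defined(WIN32_LEAN_AND_MEAN) && defined(NOGDI) \
    && defined(NOSERVICE) && defined(NOMCX)
    return 1;
#else
    return 0;
#endif
}
```

If core later needs one of the suppressed APIs, deleting the matching `#define` block restores it, which is the escape hatch described in the comment above.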

> It only limits the APIs declared under PERL_CORE. If we (perl core) need APIs that this suppresses, we can remove the NO* definition this adds.

Yeah then that doesn't matter.

> The change really needs the defines and inner pre-processor conditionals indented to make what's guarded by the conditionals obvious.

I agree that would make it easier to read.

Leont avatar Oct 07 '24 23:10 Leont

I don't have tuits to benchmark the change, but I looked at the timings of GitHub Actions runs on a few recent blead commits.

This is the "Build" step:

| Commit | msvc142 | mingw64 |
| --- | --- | --- |
| This PR | 5m 24s | 5m 57s |
| Currently the latest commit in blead | 5m 23s | 5m 45s |
| Latest-1 | 5m 39s | 5m 49s |
| Latest-2 | 5m 40s | 6m 2s |
| Latest-3 | 5m 19s | 5m 53s |
| Latest-4 | 5m 35s | 5m 42s |

The numbers are pretty much just noise. I'm not sure what "much faster" means, but I don't see it here.

xenu avatar Oct 08 '24 03:10 xenu
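One way to make the "just noise" reading concrete is to check whether the PR's build time falls inside the spread of the blead runs. The sketch below does only that; the durations are the "Build" step figures above converted to seconds, and the function names are made up for illustration. A min/max range over five runs is a crude test, not a proper benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* "Build" step durations in seconds, converted from the Xm Ys figures above. */
static const int blead_msvc142[] = {323, 339, 340, 319, 335};
static const int blead_mingw64[] = {345, 349, 362, 353, 342};
enum { PR_MSVC142 = 324, PR_MINGW64 = 357 };

/* Returns 1 if value lies between the smallest and largest baseline sample. */
int within_baseline_range(int value, const int *baseline, size_t n)
{
    assert(n > 0);
    int lo = baseline[0];
    int hi = baseline[0];
    for (size_t i = 1; i < n; i++) {
        if (baseline[i] < lo) lo = baseline[i];
        if (baseline[i] > hi) hi = baseline[i];
    }
    return value >= lo && value <= hi;
}

int pr_msvc142_within_noise(void)
{
    return within_baseline_range(PR_MSVC142, blead_msvc142,
                                 sizeof blead_msvc142 / sizeof blead_msvc142[0]);
}

int pr_mingw64_within_noise(void)
{
    return within_baseline_range(PR_MINGW64, blead_mingw64,
                                 sizeof blead_mingw64 / sizeof blead_mingw64[0]);
}
```

Both checks return 1 for these figures: the PR's time sits inside the range of the five blead runs on each toolchain.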

Indents fixed, alpha sorted, replaced cargo-culted comments. Some of the macros exist in my VC 2022 SDK (IDK how old it is), and MS devs specifically say to use them for perf, BUT the macro is not implemented anywhere else and doesn't actually do anything. I have no other older or newer SDKs to check. I marked those macros as OBSOL/FUT, since either they are obsolete, or MS will randomly ship the implementation in the monthly VC/SDK rolling release cycle.

Breakage is easy to fix, and can happen only with future unknown MSVCs, and only for core compiles. Typical Windows users almost never compile the Perl interpreter with the makefile, and instead use Strawberry/prebuilt PM exe/dlls. But they do compile XS modules themselves, so this patch is core only, never CPAN.

bulk88 avatar Oct 10 '24 19:10 bulk88

> I don't have tuits to benchmark the change, but I looked at the timings of GitHub Actions runs on a few recent blead commits. The numbers are pretty much just noise. I'm not sure what "much faster" means, but I don't see it here.

You are comparing an enterprise cloud server in a datacenter, with unlimited cores and unlimited company funds, that is replaced and thrown out every 18 months, against personal hobby PCs or laptops. A faster recompile of the miniperl.exe or perl541.dll binaries alone improves dev time and lowers the time needed to create a working patch, and people donate that time, unpaid, to Perl core. Saving even 2 or 3 seconds before you hit a C syntax error is worth it: added up over an hour, that's a few minutes, then hours, and so forth.

4 out of the 5 minutes are spent on the very slow (minutes) /mktables pl script, on building XS modules, on the 100 ms startup time of dozens of gmake.exe procs, on 40 ms?? for each of 1000s of cmd.exe proc launches, and on 1000s of slow launches of ExtUtils::Command.pm, but that's another thread/ticket/talk.

edit: running timeit gmake -j7 CCTYPE=MSVC143 test-prep

379 seconds, 6.3 mins for me

bulk88 avatar Oct 12 '24 18:10 bulk88

win64_build_product_proc_times.pl.txt

Attaching timings, without the patch above, in floating point seconds, of building Win64 perl. 1 core was used, not parallel. 600-1000 ms per .c seems high, and it's perl specific, since fcrypt.c, which includes NO PERL and NO WIN32 headers, takes 82 ms.

runperl.c, a 1.06 KB .c file, takes 500 ms to compile. It's not the Visual C compiler, since fcrypt.c was 82 ms. It's a perl core specific problem. Refactoring/optimizing is needed.

bulk88 avatar Oct 13 '24 12:10 bulk88

I did some timing on my desktop, an AMD Ryzen 7 2700 (not new, not ancient), Windows 10 with 32GB RAM, SSD main drive (which is where the source and tools were). Times in seconds for gmake -j4 CCTYPE=MSVC143 test-prep:

blead@5957de4f26

196.851972818375 197.28115606308 197.164458036423

22642

189.147753953934 189.534616947174 191.680377960205

Not sure why the last is so different. The machine had less than 20% CPU usage when the builds were running, so it's possible the system started hitting the disk more for some reason.

I did a git clean -dxfq .. (in win32/) between each build.

So there's some improvement.

tonycoz avatar Oct 22 '24 05:10 tonycoz
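To put a number on "some improvement", the two sets of three timings can be averaged and compared. The sketch below assumes the second set, labelled 22642 in the comment above, is the build with this change applied; the function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Wall-clock seconds for gmake -j4 CCTYPE=MSVC143 test-prep, as quoted above. */
static const double blead_secs[] = {196.851972818375, 197.28115606308, 197.164458036423};
/* Assumption: the set labelled 22642 is the build with the change applied. */
static const double patched_secs[] = {189.147753953934, 189.534616947174, 191.680377960205};

/* Arithmetic mean of n samples. */
static double mean(const double *xs, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += xs[i];
    return sum / (double)n;
}

/* Percentage reduction in mean build time, relative to the baseline mean. */
double speedup_percent(const double *before, size_t nb,
                       const double *after, size_t na)
{
    double b = mean(before, nb);
    double a = mean(after, na);
    return (b - a) / b * 100.0;
}

double measured_speedup_percent(void)
{
    return speedup_percent(blead_secs, sizeof blead_secs / sizeof blead_secs[0],
                           patched_secs, sizeof patched_secs / sizeof patched_secs[0]);
}
```

With these figures the means are about 197.1 s and 190.1 s, a reduction of roughly 3.5% for the whole test-prep build on that machine. Three runs per side is a small sample, so the figure is indicative rather than precise.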

Should this large set of macros be made a public feature, like PERL_NO_GET_CONTEXT or, less relevant, NO_XSLOCKS? It could be documented, maybe as PERL_NO_GUI_H or PERL_NO_W32_GUI or PERL_NO_W32_GUI_H, for XS devs/CPAN, then sprinkled through the P5P /dist modules and added to ppport.h. It is a long canned list, and the vast majority of XS libs never need to pop up a Win32 GUI window or use all these other APIs. Perl core, "Kernel32.dll" and "libc"/msvcrt/ucrtbase, and rarely winsock, are all that 99% of XS modules use, whether Win32 specific or generic cross platform POSIX-ish XS modules.

It always has to be opt-in on CPAN just like PERL_NO_GET_CONTEXT to not break ancient CPAN code.

bulk88 avatar Oct 22 '24 08:10 bulk88
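A sketch of what such an opt-in could look like, modelled on how PERL_NO_GET_CONTEXT is defined by the XS author before the perl headers are included. None of this exists in perl today: PERL_NO_W32_GUI is one of the candidate names floated above, the header logic is hypothetical, and the two suppressed areas (NOGDI, NOUSER) are only an illustrative subset. The helper `w32_gui_trimmed` exists only to make the sketch checkable.

```c
/* --- XS module side: opt in before any perl header is included --- */
#define PERL_NO_W32_GUI  /* hypothetical name, proposed above */

/* --- what a perl header might do; hypothetical, not existing perl source --- */
#if defined(PERL_CORE) || defined(PERL_NO_W32_GUI)
#  ifndef NOGDI
#    define NOGDI   /* no GDI defines and routines */
#  endif
#  ifndef NOUSER
#    define NOUSER  /* no USER (windowing) defines and routines */
#  endif
#endif

/* Guarded so the sketch also compiles on non-Windows hosts. */
#ifdef _WIN32
#  include <windows.h>
#endif

/* Hypothetical helper: returns 1 when the GUI-related trimming is active. */
int w32_gui_trimmed(void)
{
#if defined(NOGDI) && defined(NOUSER)
    return 1;
#else
    return 0;
#endif
}
```

Code that never defines PERL_NO_W32_GUI and is not built as PERL_CORE sees the full windows.h as before, which is what keeps the feature opt-in for existing CPAN modules.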

bump

bulk88 avatar Feb 25 '25 19:02 bulk88

bump

bulk88 avatar Mar 21 '25 12:03 bulk88

Some research: I am not the only person complaining that #include <windows.h> is unacceptable bloatware for compile times.

Other FOSS projects are very unhappy at the number of child .h files automatically loaded, and the number of CPP and C linker identifiers that get introduced from winbase.h. Read through this ticket: https://github.com/JetBrains/kotlin-native/issues/3483. I'm not going to do a deep dive on that ticket, but I think they have an automated system to strip useless (for them and Perl) junk C linker decls and useless CPP defines from windows.h and winbase.h during their build process, and only let the CC see their private forks of windows.h and winbase.h when they compile their SW.

https://listarchives.boost.org/Archives/boost/2002/02/25800.php is another post discussing chopping up the official MS headers during compile time (using Perl!).

bulk88 avatar Jul 27 '25 05:07 bulk88