
V code significantly worse (~145x slower) than similar Go code.

Open rcsaquino opened this issue 1 year ago • 30 comments

Describe the bug

The way I learn concurrency when experimenting with new programming languages is by rewriting a program I made in the past, which calculates the best word to guess based on the words listed in an online game called Wordle. This is the V code and this is the Go code. I've experimented with the code a lot but found no way of improving the efficiency of the V code without changing it much.

On my machine, the V code executes in ~8164 ms while the Go code executes in ~56 ms. Repeated runs show the V code to be ~145x slower than the Go code.

Expected Behavior

I expect V to be faster than Go. If not, they at least should be on par.

Current Behavior

V code performs significantly worse (~145x slower) than Go code.

Reproduction Steps

Run V with v -prod main.v then run executable main.exe

Run Go with go run main.go

Possible Solution

No response

Additional Information/Context

Here's the json file needed to run the code if you want to do your own testing.

V version

V 0.3.3

Environment details (OS name and version, etc.)

V full version: V 0.3.3 d00237f.6e1e406 OS: windows, Microsoft Windows 11 Home Single Language v22621 64-bit Processor: 12 cpus, 64bit, little endian,

getwd: C:\Users\user vexe: C:\Users\user\v\v.exe vexe mtime: 2023-03-17 19:02:20

vroot: OK, value: C:\Users\user\v VMODULES: OK, value: C:\Users\user.vmodules VTMP: OK, value: C:\Users\user\AppData\Local\Temp\v_0

Git version: git version 2.39.1.windows.1 Git vroot status: weekly.2023.11-18-g6e1e4062 (10 commit(s) behind V master) .git/config present: true

CC version: Error: exec failed (CreateProcess) with code 2: The system cannot find the file specified. cmd: cc --version thirdparty/tcc status: thirdparty-windows-amd64 e90c2620

rcsaquino avatar Mar 18 '23 08:03 rcsaquino

On my machine:

V

 > arose: 19383 | stare: 19362 | later: 19269 | alter: 19219 | irate: 19215 |
Process took 312 ms

Go

Best Words:
 > arose: 19383 | stare: 19362 | later: 19269 | alter: 19219 | irate: 19215 |
Process took 84 ms

3.7 times slower, probably due to Go's more efficient goroutines. We're getting those soon.

You're most likely not using clang/msvc when doing prod builds.

Run

v -showcc -prod main.v

medvednikov avatar Mar 18 '23 08:03 medvednikov

Here's the output. Additional information: I am on Windows 11 and Intel i7-10750H.

$ v -showcc -prod main.v

> C compiler cmd: "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\Hostx64\x64\cl.exe" "@C:\Users\user\AppData\Local\Temp\v_0\main.7757256283580746926.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.7757256283580746926.tmp.c.rsp":
-w /we4013 /volatile:ms /Fo"C:\Users\user\AppData\Local\Temp\v_0\main.7757256283580746926.tmp.c.obj" /F 16777216 /O2 /MD /DNDEBUG "C:\Users\user\AppData\Local\Temp\v_0\main.7757256283580746926.tmp.c" -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\ucrt" -I "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\include" -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\um" -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\shared" /DGC_NOT_DLL=1 /DGC_WIN32_THREADS=1 /DGC_THREADS=1 -I "C:\Users\user\v\thirdparty\libatomic_ops" -I "C:\Users\user\v\thirdparty\libgc\include" -I "C:\Users\user\v\thirdparty\cJSON" -I "C:\Users\user\v\thirdparty\stdatomic\win" "C:\Users\user\v/thirdparty/libatomic_ops/atomic_ops.obj" "C:\Users\user\v/thirdparty/libgc/gc.obj" "C:\Users\user\v/thirdparty/cJSON/cJSON.obj" kernel32.lib user32.lib advapi32.lib dbghelp.lib advapi32.lib /link /NOLOGO /OUT:"C:\Users\user\Desktop\exp\wordle_assist\main.exe" /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22000.0\ucrt\X64" /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22000.0\um\X64" /LIBPATH:"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\lib\X64" /DEBUG:FULL /INCREMENTAL:NO /OPT:REF /OPT:ICF

rcsaquino avatar Mar 18 '23 08:03 rcsaquino

So you are using msvc.

Could be an msvc thing. Can you try to install clang (V's docs have info on how to do it) and try with -cc clang ?

medvednikov avatar Mar 18 '23 08:03 medvednikov

On Windows, AV scanners can also interfere during program runtime. We had lots of slowdowns reported because of that.

It's best to benchmark on non Windows systems.

medvednikov avatar Mar 18 '23 08:03 medvednikov

Clang performed slightly better at ~7369 ms but still too slow when compared to Go’s ~55ms. I began testing yesterday and still getting the same results today after a computer restart. I don't have any other AVs installed other than Windows Defender.

$ v -showcc -prod -cc clang main.v

> C compiler cmd: clang "@C:\Users\user\AppData\Local\Temp\v_0\main.14896488175087772503.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.14896488175087772503.tmp.c.rsp":
  -std=c99 -D_DEFAULT_SOURCE -O3 -DNDEBUG "C:\\Users\\user\\.vmodules\\cache\\e8\\e83b953f47be758d8e773328e677f974.module.builtin.o" "C:\\Users\\user\\.vmodules\\cache\\cc\\ccbc33317a32e7489fbeb6a13b484985.module.json.o" -o "C:\\Users\\user\\Desktop\\exp\\wordle_assist\\main.exe" -Wl,-stack=16777216 -Werror=implicit-function-declaration -D GC_NOT_DLL=1 -D GC_WIN32_THREADS=1 -D GC_BUILTIN_ATOMIC=1 -D GC_THREADS=1 -I "C:\\Users\\user\\v\\thirdparty\\libgc\\include" -I "C:\\Users\\user\\v\\thirdparty\\cJSON" -I "C:\\Users\\user\\v\\thirdparty\\stdatomic\\win" "C:\\Users\\user\\AppData\\Local\\Temp\\v_0\\main.14896488175087772503.tmp.c" -municode -ldbghelp -ladvapi32

rcsaquino avatar Mar 18 '23 08:03 rcsaquino

very weird

if you have a linux vm/machine, please try there

medvednikov avatar Mar 18 '23 08:03 medvednikov

On my Linux machine V (-cc clang -prod) took 1324ms while go took 179ms

squidink7 avatar Mar 18 '23 09:03 squidink7

Tried on a Linux Mint USB with v -prod -cc clang main.v, and the code executed in ~700ms. Weird that there is that much difference compared to a Windows machine. Also tried the Go version, which shows roughly similar performance to its Windows counterpart, executing in ~60ms.

rcsaquino avatar Mar 18 '23 09:03 rcsaquino

It's definitely the antivirus then.

medvednikov avatar Mar 18 '23 10:03 medvednikov

It is a silly suggestion, but try -gc none (with -autofree) at compilation and see what happens. I think it will help. My experience is that the current GC does not allow truly parallel code.

MatejMagat305 avatar Mar 18 '23 12:03 MatejMagat305

I initially tried with autofree and it gave a C compiler error, so I let it fall back on the gc

squidink7 avatar Mar 18 '23 14:03 squidink7

Upon further debugging, I might've found a partial culprit (at least on my Windows machine). I have to specify -cc msvc. I don't know why, because -showcc shows I'm using msvc by default without the flag, but I just tried adding -cc msvc and the execution time dropped to ~247ms. It's still slower than Go but a significant improvement from the initial ~8164ms execution time.

For reference here are the outputs of -showcc

Without -cc msvc: ~ 7784 ms

$ v -prod -showcc main.v

> C compiler cmd: "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\Hostx64\x64\cl.exe" "@C:\Users\user\AppData\Local\Temp\v_0\main.769321436697931283.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.769321436697931283.tmp.c.rsp":
-w /we4013 /volatile:ms /Fo"C:\Users\user\AppData\Local\Temp\v_0\main.769321436697931283.tmp.c.obj" /F 16777216 /O2 /MD /DNDEBUG "C:\Users\user\AppData\Local\Temp\v_0\main.769321436697931283.tmp.c" -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\ucrt" -I "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\include" -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\um" -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\shared" /DGC_NOT_DLL=1 /DGC_WIN32_THREADS=1 /DGC_THREADS=1 -I "C:\Users\user\v\thirdparty\libatomic_ops" -I "C:\Users\user\v\thirdparty\libgc\include" -I "C:\Users\user\v\thirdparty\cJSON" -I "C:\Users\user\v\thirdparty\stdatomic\win" "C:\Users\user\v/thirdparty/libatomic_ops/atomic_ops.obj" "C:\Users\user\v/thirdparty/libgc/gc.obj" "C:\Users\user\v/thirdparty/cJSON/cJSON.obj" kernel32.lib user32.lib advapi32.lib dbghelp.lib advapi32.lib /link /NOLOGO /OUT:"C:\Users\user\Desktop\exp\wordle_assist\main.exe" /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22000.0\ucrt\X64" /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22000.0\um\X64" /LIBPATH:"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\lib\X64" /DEBUG:FULL /INCREMENTAL:NO /OPT:REF /OPT:ICF

With -cc msvc: ~ 247 ms

$ v -prod -showcc -cc msvc main.v

> C compiler cmd: "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\Hostx64\x64\cl.exe" "@C:\Users\user\AppData\Local\Temp\v_0\main.13963805996637790909.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.13963805996637790909.tmp.c.rsp":
-w /we4013 /volatile:ms /Fo"C:\Users\user\AppData\Local\Temp\v_0\main.13963805996637790909.tmp.c.obj" /F 16777216 /O2 /MD /DNDEBUG "C:\Users\user\AppData\Local\Temp\v_0\main.13963805996637790909.tmp.c" -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\ucrt" -I "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\include" -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\um" -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\shared" -I "C:\Users\user\v\thirdparty\cJSON" -I "C:\Users\user\v\thirdparty\stdatomic\win" "C:\Users\user\v/thirdparty/cJSON/cJSON.obj" kernel32.lib user32.lib advapi32.lib dbghelp.lib advapi32.lib /link /NOLOGO /OUT:"C:\Users\user\Desktop\exp\wordle_assist\main.exe" /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22000.0\ucrt\X64" /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22000.0\um\X64" /LIBPATH:"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\lib\X64" /DEBUG:FULL /INCREMENTAL:NO /OPT:REF /OPT:ICF

Also I would like to note that while debugging this, I found out that the latest versions of clang and gcc from this site do not work with V. Some versions have a working clang but a non-working gcc, while others have it vice versa. I have yet to find a version with both gcc and clang working. Eventually I scrapped clang and gcc and stuck with msvc. This is when I found out about this bug: specifying -cc msvc can improve performance by a ton, even if -showcc says you compile with msvc by default. This is true at least on my Windows machine.

rcsaquino avatar Mar 18 '23 14:03 rcsaquino

Just curious, what is the content of wordle_min.json ?

Edit: just saw that you linked it - https://ghostbin.me/641574ba443d6 .

spytheman avatar Mar 18 '23 14:03 spytheman

The problem has nothing to do with channels.

fn contains_char(s string, c u8) bool {
	for x in s {
		if x == c {
			return true
		}
	}
	return false
}

and then: } else if contains_char(possible_answers[y], x_char) {

If I use the above, instead of } else if possible_answers[y].contains(x_char.ascii_str()) { Then:

#0 17:31:03 ᛋ cleanup_toml_autocast /v/vnew❱./v -prod -cc gcc-11 wordle.v 
#0 17:34:02 ᛋ cleanup_toml_autocast /v/vnew❱
#0 17:34:05 ᛋ cleanup_toml_autocast /v/vnew❱go build wordle_go.go 
#0 17:34:09 ᛋ cleanup_toml_autocast /v/vnew❱
#0 17:34:10 ᛋ cleanup_toml_autocast /v/vnew❱./wordle_go 
thread_count: 4
chunk_size: 579
Go thread possible_answers.len: 2315 | from: 1737 | to: 2315
Go thread possible_answers.len: 2315 | from: 0 | to: 579
Go thread possible_answers.len: 2315 | from: 579 | to: 1158
Go thread possible_answers.len: 2315 | from: 1158 | to: 1737
Best Words:
 > arose: 19383 | stare: 19362 | later: 19269 | alter: 19219 | irate: 19215 | 
Process took 201 ms
#0 17:34:15 ᛋ cleanup_toml_autocast /v/vnew❱./wordle
Started...
thread_count: 4
chunk_size: 579
spawned thread possible_answers.len: 2315 | from: 0 | to: 579
spawned thread possible_answers.len: 2315 | from: 579 | to: 1158
spawned thread possible_answers.len: 2315 | from: 1158 | to: 1737
spawned thread possible_answers.len: 2315 | from: 1737 | to: 2315
Best Words:
 > arose: 19383 | stare: 19362 | later: 19269 | alter: 19219 | irate: 19215 | 
Process took 80 ms
#0 17:34:22 ᛋ cleanup_toml_autocast /v/vnew❱

i.e. the V version becomes faster than the Go one.

I think we just need to add the above specialized function as a string method too, for searching a single byte in a string, potentially returning its index, instead of a bool.

Alternatively, we can add that as a special case for the already existing index_ method, when the needle string is of .len == 1, then all the rest of the methods that call that will be faster for that situation too.
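The special case proposed above can be sketched in Go, the other language compared in this thread. This is only an illustration of the idea, not V's implementation: `indexOf` and `indexByte` are hypothetical names, and the general path simply defers to Go's `strings.Index`.

```go
package main

import (
	"fmt"
	"strings"
)

// indexByte scans s once and returns the position of the first byte equal
// to c, or -1 when c does not occur. No temporary string is created.
func indexByte(s string, c byte) int {
	for i := 0; i < len(s); i++ {
		if s[i] == c {
			return i
		}
	}
	return -1
}

// indexOf is a hypothetical general-purpose search. When the needle is a
// single byte it takes the cheap byte-scan path; otherwise it falls back
// to the standard substring search.
func indexOf(haystack, needle string) int {
	if len(needle) == 1 {
		return indexByte(haystack, needle[0])
	}
	return strings.Index(haystack, needle)
}

func main() {
	fmt.Println(indexOf("arose", "o"))  // single-byte fast path
	fmt.Println(indexOf("arose", "os")) // general substring path
	fmt.Println(indexOf("arose", "z"))  // absent byte
}
```

With a fast path like this, every method built on top of the index search benefits from the single-byte case without callers changing anything.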

spytheman avatar Mar 18 '23 15:03 spytheman

Here is the faster version for your amusement :-) https://gist.github.com/spytheman/2b9fc468664f0ac73b258c065daf577e
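The thread_count and chunk_size lines in the run output above suggest the work is split into contiguous ranges, one per worker. A minimal Go sketch of that splitting follows. It assumes the chunk size is the ceiling of the item count divided by the worker count, which matches the printed ranges (2315 items, 4 workers, chunks of 579) but is not confirmed by the source. `chunkRanges` and `span` are hypothetical names, and the summing loop is only a stand-in for the real per-word scoring.

```go
package main

import (
	"fmt"
	"sync"
)

// span is a half-open index range [from, to).
type span struct{ from, to int }

// chunkRanges splits n items into at most `workers` contiguous ranges.
// The chunk size is rounded up so that every item is covered; the final
// range is clamped to n.
func chunkRanges(n, workers int) []span {
	if n <= 0 || workers <= 0 {
		return nil
	}
	size := (n + workers - 1) / workers // ceiling division (assumed)
	var out []span
	for from := 0; from < n; from += size {
		to := from + size
		if to > n {
			to = n
		}
		out = append(out, span{from, to})
	}
	return out
}

func main() {
	// Placeholder data: one unit of "work" per possible answer.
	items := make([]int, 2315)
	for i := range items {
		items[i] = 1
	}

	ranges := chunkRanges(len(items), 4)
	partial := make([]int, len(ranges)) // one slot per worker, no sharing

	var wg sync.WaitGroup
	for i, r := range ranges {
		wg.Add(1)
		go func(i int, r span) {
			defer wg.Done()
			for _, v := range items[r.from:r.to] {
				partial[i] += v
			}
		}(i, r)
	}
	wg.Wait()

	total := 0
	for _, p := range partial {
		total += p
	}
	fmt.Println(ranges, total)
}
```

Each worker writes only to its own slot of `partial`, so no locking is needed until the results are combined after `wg.Wait()`.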

spytheman avatar Mar 18 '23 15:03 spytheman

Here is the faster version for your amusement :-) https://gist.github.com/spytheman/2b9fc468664f0ac73b258c065daf577e

Wow! Combining -cc msvc with your code, I got my execution time down to ~80ms. Still slower than the Go code on my machine, but definitely a significant improvement! Thank you for this input. I wonder why the string.contains() method runs slow?

rcsaquino avatar Mar 18 '23 15:03 rcsaquino

Thanks @spytheman, I missed that.

I assume it's because you're running x_char.ascii_str() for each iteration @rcsaquino.

It does an allocation for each new string.

medvednikov avatar Mar 18 '23 16:03 medvednikov

latest version of clang and gcc from this site does not work with V

what's the error you're getting?

medvednikov avatar Mar 18 '23 16:03 medvednikov

Implemented in https://github.com/vlang/v/pull/17702 .

spytheman avatar Mar 18 '23 16:03 spytheman

To use .contains(), you have to also call .ascii_str() for each character, and that in turn allocates a new string. The .contains_u8() version in the PR from above, allows you to just pass the single byte, without any other conversions and allocations.
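To make the difference concrete, here are the two call styles side by side in Go. This is a sketch of the pattern only: `containsU8` is a Go port of the `contains_char` helper shown earlier in the thread, while `containsViaString` is a hypothetical stand-in for the original `.contains(x_char.ascii_str())` call. Whether a given compiler or runtime really allocates for the temporary string is implementation-dependent, so no timing is claimed here.

```go
package main

import (
	"fmt"
	"strings"
)

// containsU8 checks for a single byte directly: one pass over s and no
// intermediate values.
func containsU8(s string, c byte) bool {
	for i := 0; i < len(s); i++ {
		if s[i] == c {
			return true
		}
	}
	return false
}

// containsViaString first wraps the byte in a one-byte string and then
// runs the general substring search, as the original loop did on every
// iteration.
func containsViaString(s string, c byte) bool {
	needle := string([]byte{c}) // temporary string built per call
	return strings.Contains(s, needle)
}

func main() {
	answer, guess := "arose", "stare"
	shared := 0
	for i := 0; i < len(guess); i++ {
		// Both helpers must agree; only the cost per call differs.
		if containsU8(answer, guess[i]) != containsViaString(answer, guess[i]) {
			panic("helpers disagree")
		}
		if containsU8(answer, guess[i]) {
			shared++
		}
	}
	fmt.Println("letters of guess present in answer:", shared)
}
```

In an inner loop that runs once per letter per word pair, removing the per-call conversion is the kind of change that accounts for the speedup reported above.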

Edit: @medvednikov already guessed the reason.

spytheman avatar Mar 18 '23 16:03 spytheman

I assume it's because you're running x_char.ascii_str() for each iteration @rcsaquino.

That's handy to know. Thank you! I hope I'm not steering too far away from the issue but is -showcc bugged? As mentioned in my previous comment, that's also one of the reasons why my code ran so slow. It tells me that I'm using msvc by default when on -prod, but then when I specifically flag -cc msvc I get a much faster execution time.

what's the error you're getting?

GCC 12.2.0 + LLVM/Clang/LLD/LLDB 15.0.7 + MinGW-w64 10.0.0 (UCRT) - release 4

$ v -prod -showcc -cc gcc main.v

> C compiler cmd: gcc "@C:\Users\user\AppData\Local\Temp\v_0\main.2035843803477250444.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.2035843803477250444.tmp.c.rsp":
  -std=c99 -D_DEFAULT_SOURCE -O3 -fno-strict-aliasing -flto -DNDEBUG "C:\\Users\\user\\.vmodules\\cache\\05\\050fbb8d29eed490979785f426a1efd5.module.builtin.o" "C:\\Users\\user\\.vmodules\\cache\\c9\\c991825ae2ad35e38a68460692c7e8c0.module.json.o" -o "C:\\Users\\user\\Bok\\GitHub\\is_v_fast_yet\\main.exe" -Wl,-stack=16777216 -Werror=implicit-function-declaration -D GC_NOT_DLL=1 -D GC_WIN32_THREADS=1 -D GC_BUILTIN_ATOMIC=1 -D GC_THREADS=1 -I "C:\\Users\\user\\v\\thirdparty\\libgc\\include" -I "C:\\Users\\user\\v\\thirdparty\\cJSON" -I "C:\\Users\\user\\v\\thirdparty\\stdatomic\\win" "C:\\Users\\user\\AppData\\Local\\Temp\\v_0\\main.2035843803477250444.tmp.c" -municode -ldbghelp -ladvapi32
==================
c:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\Users\user\Bok\GitHub\is_v_fast_yet\main.exe(.pdata): relocation ".uw_base+0x0 (type R_X86_64_RELATIVE)" goes out of range
collect2.exe: error: ld returned 1 exit status
...
==================
(Use `v -cg` to print the entire error message)

builder error:
==================
C error. This should never happen.

This is a compiler bug, please report it using `v bug file.v`.

https://github.com/vlang/v/issues/new/choose

You can also use #help on Discord: https://discord.gg/vlang
$ v -prod -showcc -cc clang main.v

> C compiler cmd: clang "@C:\Users\user\AppData\Local\Temp\v_0\main.10927453796187017163.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.10927453796187017163.tmp.c.rsp":
  -std=c99 -D_DEFAULT_SOURCE -O3 -DNDEBUG "C:\\Users\\user\\.vmodules\\cache\\0c\\0cb8877431328c0e4563d9b7349d84ff.module.builtin.o" "C:\\Users\\user\\.vmodules\\cache\\0f\\0f3d1273c6d2b75313301cbcf05ae747.module.json.o" -o "C:\\Users\\user\\Bok\\GitHub\\is_v_fast_yet\\main.exe" -Wl,-stack=16777216 -Werror=implicit-function-declaration -D GC_NOT_DLL=1 -D GC_WIN32_THREADS=1 -D GC_BUILTIN_ATOMIC=1 -D GC_THREADS=1 -I "C:\\Users\\user\\v\\thirdparty\\libgc\\include" -I "C:\\Users\\user\\v\\thirdparty\\cJSON" -I "C:\\Users\\user\\v\\thirdparty\\stdatomic\\win" "C:\\Users\\user\\AppData\\Local\\Temp\\v_0\\main.10927453796187017163.tmp.c" -municode -ldbghelp -ladvapi32
==================

 ^
C:\Users\user\AppData\Local\Temp\v_0\main.10927453796187017163.tmp.c:10437:10: error: incompatible integer to pointer conversion initializing 'voidptr' (aka 'void *') with an expression of type 'DWORD' (aka 'unsigned long') [-Wint-conversion]
        voidptr res = FormatMessage(((FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM) | FORMAT_MESSAGE_IGNORE_INSERTS), NULL, err_msg_id, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), &msgbuf, 0, NULL);
                ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\user\AppData\Local\Temp\v_0\main.10927453796187017163.tmp.c:17008:28: warning: incompatible pointer types passing 'u64 *' (aka 'unsigned long long *') to parameter of type 'LARGE_INTEGER *' (aka 'union _LARGE_INTEGER *') [-Wincompatible-pointer-types]
        QueryPerformanceFrequency(&f);
                                  ^~
C:/mingw64/x86_64-w64-mingw32/include/profileapi.h:17:71: note: passing argument to parameter 'lpFrequency' here
  WINBASEAPI WINBOOL WINAPI QueryPerformanceFrequency (LARGE_INTEGER *lpFrequency);
                                                                      ^
C:\Users\user\AppData\Local\Temp\v_0\main.10927453796187017163.tmp.c:17014:26: warning: incompatible pointer types passing 'u64 *' (aka 'unsigned long long *') to parameter of type 'LARGE_INTEGER *' (aka 'union _LARGE_INTEGER *') [-Wincompatible-pointer-types]
        QueryPerformanceCounter(&s);
...
==================
(Use `v -cg` to print the entire error message)

builder error:
==================
C error. This should never happen.

This is a compiler bug, please report it using `v bug file.v`.

https://github.com/vlang/v/issues/new/choose

You can also use #help on Discord: https://discord.gg/vlang

GCC 11.3.0 + LLVM/Clang/LLD/LLDB 14.0.3 + MinGW-w64 10.0.0 (UCRT) - release 3

$ v -prod -showcc -cc gcc main.v

> C compiler cmd: gcc "@C:\Users\user\AppData\Local\Temp\v_0\main.1732415297703188569.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.1732415297703188569.tmp.c.rsp":
  -std=c99 -D_DEFAULT_SOURCE -O3 -fno-strict-aliasing -flto -DNDEBUG "C:\\Users\\user\\.vmodules\\cache\\05\\050fbb8d29eed490979785f426a1efd5.module.builtin.o" "C:\\Users\\user\\.vmodules\\cache\\c9\\c991825ae2ad35e38a68460692c7e8c0.module.json.o" -o "C:\\Users\\user\\Bok\\GitHub\\is_v_fast_yet\\main.exe" -Wl,-stack=16777216 -Werror=implicit-function-declaration -D GC_NOT_DLL=1 -D GC_WIN32_THREADS=1 -D GC_BUILTIN_ATOMIC=1 -D GC_THREADS=1 -I "C:\\Users\\user\\v\\thirdparty\\libgc\\include" -I "C:\\Users\\user\\v\\thirdparty\\cJSON" -I "C:\\Users\\user\\v\\thirdparty\\stdatomic\\win" "C:\\Users\\user\\AppData\\Local\\Temp\\v_0\\main.1732415297703188569.tmp.c" -municode -ldbghelp -ladvapi32
==================
      |                                                     ^
lto1.exe: fatal error: bytecode stream in file 'C:\Users\user\.vmodules\cache\05\050fbb8d29eed490979785f426a1efd5.module.builtin.o' generated with LTO version 12.0 instead of the expected 11.3
compilation terminated.
lto-wrapper.exe: fatal error: gcc returned 1 exit status
compilation terminated.
c:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/11.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: error: lto-wrapper failed
collect2.exe: error: ld returned 1 exit status
...
==================
(Use `v -cg` to print the entire error message)

builder error:
==================
C error. This should never happen.

This is a compiler bug, please report it using `v bug file.v`.

https://github.com/vlang/v/issues/new/choose

You can also use #help on Discord: https://discord.gg/vlang
$ v -prod -showcc -cc clang main.v

> C compiler cmd: clang "@C:\Users\user\AppData\Local\Temp\v_0\main.4326394698301975376.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.4326394698301975376.tmp.c.rsp":
  -std=c99 -D_DEFAULT_SOURCE -O3 -DNDEBUG "C:\\Users\\user\\.vmodules\\cache\\0c\\0cb8877431328c0e4563d9b7349d84ff.module.builtin.o" "C:\\Users\\user\\.vmodules\\cache\\0f\\0f3d1273c6d2b75313301cbcf05ae747.module.json.o" -o "C:\\Users\\user\\Bok\\GitHub\\is_v_fast_yet\\main.exe" -Wl,-stack=16777216 -Werror=implicit-function-declaration -D GC_NOT_DLL=1 -D GC_WIN32_THREADS=1 -D GC_BUILTIN_ATOMIC=1 -D GC_THREADS=1 -I "C:\\Users\\user\\v\\thirdparty\\libgc\\include" -I "C:\\Users\\user\\v\\thirdparty\\cJSON" -I "C:\\Users\\user\\v\\thirdparty\\stdatomic\\win" "C:\\Users\\user\\AppData\\Local\\Temp\\v_0\\main.4326394698301975376.tmp.c" -municode -ldbghelp -ladvapi32

This works!

GCC 10.3.0 + LLVM/Clang/LLD/LLDB 12.0.0 + MinGW-w64 9.0.0 (MSVCRT) - release 2

$ v -prod -showcc -cc gcc main.v

> C compiler cmd: gcc "@C:\Users\user\AppData\Local\Temp\v_0\main.10133487570940163088.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.10133487570940163088.tmp.c.rsp":
  -std=c99 -D_DEFAULT_SOURCE -O3 -fno-strict-aliasing -flto -DNDEBUG "C:\\Users\\user\\.vmodules\\cache\\05\\050fbb8d29eed490979785f426a1efd5.module.builtin.o" "C:\\Users\\user\\.vmodules\\cache\\c9\\c991825ae2ad35e38a68460692c7e8c0.module.json.o" -o "C:\\Users\\user\\Bok\\GitHub\\is_v_fast_yet\\main.exe" -Wl,-stack=16777216 -Werror=implicit-function-declaration -D GC_NOT_DLL=1 -D GC_WIN32_THREADS=1 -D GC_BUILTIN_ATOMIC=1 -D GC_THREADS=1 -I "C:\\Users\\user\\v\\thirdparty\\libgc\\include" -I "C:\\Users\\user\\v\\thirdparty\\cJSON" -I "C:\\Users\\user\\v\\thirdparty\\stdatomic\\win" "C:\\Users\\user\\AppData\\Local\\Temp\\v_0\\main.10133487570940163088.tmp.c" -municode -ldbghelp -ladvapi32
==================
      |                                                     ^
lto1.exe: fatal error: bytecode stream in file 'C:\Users\user\.vmodules\cache\05\050fbb8d29eed490979785f426a1efd5.module.builtin.o' generated with LTO version 12.0 instead of the expected 9.3
compilation terminated.
lto-wrapper.exe: fatal error: gcc returned 1 exit status
compilation terminated.
c:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: error: lto-wrapper failed
collect2.exe: error: ld returned 1 exit status
...
==================
(Use `v -cg` to print the entire error message)

builder error:
==================
C error. This should never happen.

This is a compiler bug, please report it using `v bug file.v`.
$ v -prod -showcc -cc clang main.v

> C compiler cmd: clang "@C:\Users\user\AppData\Local\Temp\v_0\main.10255458287298117242.tmp.c.rsp"
> C compiler response file "C:\Users\user\AppData\Local\Temp\v_0\main.10255458287298117242.tmp.c.rsp":
  -std=c99 -D_DEFAULT_SOURCE -O3 -DNDEBUG "C:\\Users\\user\\.vmodules\\cache\\0c\\0cb8877431328c0e4563d9b7349d84ff.module.builtin.o" "C:\\Users\\user\\.vmodules\\cache\\0f\\0f3d1273c6d2b75313301cbcf05ae747.module.json.o" -o "C:\\Users\\user\\Bok\\GitHub\\is_v_fast_yet\\main.exe" -Wl,-stack=16777216 -Werror=implicit-function-declaration -D GC_NOT_DLL=1 -D GC_WIN32_THREADS=1 -D GC_BUILTIN_ATOMIC=1 -D GC_THREADS=1 -I "C:\\Users\\user\\v\\thirdparty\\libgc\\include" -I "C:\\Users\\user\\v\\thirdparty\\cJSON" -I "C:\\Users\\user\\v\\thirdparty\\stdatomic\\win" "C:\\Users\\user\\AppData\\Local\\Temp\\v_0\\main.10255458287298117242.tmp.c" -municode -ldbghelp -ladvapi32

This works!

rcsaquino avatar Mar 18 '23 16:03 rcsaquino

Wow! Combining -cc msvc with your code, I got my execution speed down to ~80ms.

I tested on Ubuntu 20.04, with Intel i3-3225, and a mostly quiet system: image

spytheman avatar Mar 18 '23 16:03 spytheman

Without -prod (i.e. with tcc): image

which is still faster than before, but slower than the Go version (which always does optimizations, while tcc does not, so that is expected).

spytheman avatar Mar 18 '23 16:03 spytheman

No, -showcc is not bugged - it simply shows what options were used to invoke your backend C compiler, nothing more.

spytheman avatar Mar 18 '23 16:03 spytheman

-cc msvc is weird (it has special cases and special bugs, that are not present for the rest of the backends), and so it can be 🤷🏻‍♂️ . I would suggest comparing the options that -showcc displayed more carefully.

spytheman avatar Mar 18 '23 16:03 spytheman

well in this example -showcc showed msvc being used: C compiler cmd: "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\Hostx64\x64\cl.exe"

which didn't seem to be the case

medvednikov avatar Mar 18 '23 16:03 medvednikov

image

These are the options, with -cc msvc and without it (using the default fallback for -prod). They definitely seem different to me 🤷🏻‍♂️ .
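For anyone who wants to compare two -showcc outputs without eyeballing them, a small set-difference helper is enough. The Go sketch below is illustrative: `onlyIn` is a hypothetical name, and the two lists are abbreviated excerpts of the response files quoted earlier in the thread, not the full option sets.

```go
package main

import "fmt"

// onlyIn returns the entries of a that do not appear in b, keeping the
// order in which they occur in a.
func onlyIn(a, b []string) []string {
	seen := make(map[string]bool, len(b))
	for _, f := range b {
		seen[f] = true
	}
	var out []string
	for _, f := range a {
		if !seen[f] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	// Abbreviated excerpts; the real response files contain many more options.
	withoutCC := []string{"/O2", "/MD", "/DNDEBUG",
		"/DGC_NOT_DLL=1", "/DGC_WIN32_THREADS=1", "/DGC_THREADS=1"}
	withCC := []string{"/O2", "/MD", "/DNDEBUG"}

	fmt.Println("only without -cc msvc:", onlyIn(withoutCC, withCC))
	fmt.Println("only with -cc msvc:   ", onlyIn(withCC, withoutCC))
}
```

Splitting a real response file into options needs care, because quoted paths contain spaces; the sketch sidesteps that by starting from already separated flags.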

spytheman avatar Mar 18 '23 17:03 spytheman

Here is a clearer comparison: image

From that, I think that the left part (the one for -cc msvc) does pick conditional flags from vlib/builtin/builtin_d_gcboehm.c.v, that were inside $if msvc {, while the right one does not.

I think that the conditional compilation depends on evaluating msvc based on the passed compiler option, not on what exact compiler was chosen (due to -prod requiring not using the default tcc) by the builder.

Imho that is one more reason to dislike guessing, and the fact that currently -prod and -cc are not completely orthogonal. If they were, there would have been no such problems at all.

spytheman avatar Mar 18 '23 17:03 spytheman

Anyway, that is something for another issue. I think that the current one can be closed when the PR is merged, @rcsaquino ?

spytheman avatar Mar 18 '23 17:03 spytheman

Yes, the initial issue has been fixed. The issue with -showcc can be discussed in another thread. Thanks to all who helped, especially @medvednikov and @spytheman.

Cheers!

rcsaquino avatar Mar 18 '23 17:03 rcsaquino