fbuild
fbuild copied to clipboard
Embedded NULL failure
Building Felix:
I haven't seen this before, I have changed the build process a bit.
It looks like a bug in Python.. but why didn't it happen before?
Appveyor CI:
set PATH=C:\Python35-x64;c:\ocaml\bin;%PATH%
python --version
Python 3.5.4
etermining platform : {'windows', 'win32'}
looking for cl.exe : ok C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\amd64\cl.exe
checking cl.exe : Traceback (most recent call last):
File "fbuild\fbuild-light", line 11, in
This was a Python bug that's been fixed in both CPython and Fbuild. I'll put the fix in Felix's copy later!
I tried the code in fbuild. That fails too! Using this:
self._as_parameter_ = LPCWSTR("\0".join(values).encode("utf-16le"))
I get
File "C:\projects\felix\fbuild\lib\fbuild\subprocess\winprocess.py", line 129, in __init__
self._as_parameter_ = LPCWSTR("\0".join(values).encode("utf-16le"))
TypeError: unicode string or integer address expected instead of bytes instance
NMAKE : fatal error U1077: 'C:\Python35-x64\python.EXE' : return code '0x1'
I also tried that latest fbuild as a whole, and that fails much earlier with "deprecated API". It was talking about the command line parsing. Not sure why a deprecated API would cause a bomb, but it did.
What bothers me is that this all worked until the latest major commit. See:
https://ci.appveyor.com/project/skaller/felix/history
The winprocess.py code has not changed. Interscript was removed. The configuration is no longer generated, a predefined config is now just copied. So there are some extra copies done by scanning certain subdirectories of src/config.
LPCWSTR("\0".join(values).encode("utf-16le"))
You mixed up the fix a bit... It should be LPCSTR, not LPCWSTR.
AppVeyor's Python likely got upgraded to a version that has the bug.
Ooops...:-) Thanks! Will report back. I'm using Python 3.5 on Appveyor.
Well with the fbuild version I now get this:
checking cl.exe : command failed: 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\amd64\cl.exe' /nologo /Ox /IC:\projects\felix-e3h83\tmp\tmp4gqiagkb C:\projects\felix-e3h83\tmp\tmp4gqiagkb\temp.c
Traceback (most recent call last):
File "fbuild\fbuild-light", line 11, in <module>
main()
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\main.py", line 179, in main
result = build(ctx)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\main.py", line 104, in build
target.function(ctx)
File "C:\projects\felix-e3h83\fbuildroot.py", line 585, in build
phases = configure(ctx)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\db\__init__.py", line 121, in __call__
result, srcs, dsts = self.call(*args, **kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\db\__init__.py", line 125, in call
return ctx.db.call(self.function, ctx, *args, **kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\db\database.py", line 165, in call
call_result = function(*args, **kwargs)
File "C:\projects\felix-e3h83\fbuildroot.py", line 558, in configure
build = config_build(ctx)
File "C:\projects\felix-e3h83\fbuildroot.py", line 328, in config_build
flags=ctx.options.build_c_flags),
File "C:\projects\felix-e3h83\fbuildroot.py", line 258, in make_c_builder
static=call('fbuild.builders.c.guess_static', ctx, *args, **kwargs),
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\functools.py", line 32, in call
return import_function(function)(*args, **kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\builders\c\__init__.py", line 457, in guess_static
), *args, **kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\builders\c\__init__.py", line 394, in _guess_builder
return fbuild.functools.call(function, ctx, exe, *args, **new_kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\functools.py", line 32, in call
return import_function(function)(*args, **kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\builders\c\msvc.py", line 531, in static
compiler=Compiler(ctx, Cl(ctx, **kwargs),
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\db\__init__.py", line 68, in __call__
*args, **kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\db\database.py", line 165, in call
call_result = function(*args, **kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\db\__init__.py", line 64, in __call_super__
return super().__call__(*args, **kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\builders\c\msvc.py", line 38, in __init__
if not self.check_flags(flags):
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\builders\c\msvc.py", line 116, in check_flags
self([src], flags=flags, quieter=1, cwd=src.parent)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\builders\c\msvc.py", line 103, in __call__
return self.ctx.execute(cmd, msg2=msg2, **kwargs)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\context.py", line 234, in execute
raise e from e
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\context.py", line 214, in execute
**kwargs)
File "C:\Python35-x64\lib\subprocess.py", line 676, in __init__
restore_signals, start_new_session)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\subprocess\killableprocess.py", line 184, in _execute_child
startupinfo)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\subprocess\winprocess.py", line 159, in ErrCheckCreateProcess
ErrCheckBool(result, func, args)
File "C:\projects\felix-e3h83\fbuild\lib\fbuild\subprocess\winprocess.py", line 38, in ErrCheckBool
raise WinError()
OSError: [WinError 87] The parameter is incorrect.
NMAKE : fatal error U1077: 'C:\Python35-x64\python.EXE' : return code '0x1'
Stop.
Command exited with code 2
Help!! :)
@skaller Ahhhhh I hate Windows's Unicode handling. :angry:
Try this:
self._as_parameter_ = LPCSTR("\0\0".join(values).encode("utf-16le"))
Notice the change from \0
to \0\0
.
The problem is Python, not Windows :) Error checking text strictly is a stupid default. The default should be "it always works, even if the result is a bit nasty" because its just human readable script. Its still readable even if a few characters are wrong.
Well that fixed the problem I think but now there's another one:
https://ci.appveyor.com/project/skaller/felix/build/1.0.1638
configuring build phase
determining platform : {'win32', 'windows'}
looking for cl.exe : ok C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\amd64\cl.exe
checking cl.exe : failed
looking for clang.exe : ok C:\Program Files\LLVM\bin\clang.exe
checking clang.exe : failed
looking for icc.exe : failed
looking for icc.cmd : failed
looking for icc.bat : failed
looking for icc : failed
looking for gcc.exe : failed
looking for gcc.cmd : failed
looking for gcc.bat : failed
looking for gcc : failed
looking for cc.exe : failed
looking for cc.cmd : failed
looking for cc.bat : failed
looking for cc : failed
cannot find a c static builder for frozenset({'gcc', 'icc', 'win32', 'windows', 'clang'})
NMAKE : fatal error U1077: 'C:\Python36-x64\python.EXE' : return code '0x1'
Stop.
The compiler found appears wrong .. it should be the full 64 bit compiler but isn't. That's VS 2017 not 2015 now. I would think Appveyor has the 64 bit compiler. I mean, the 32 bit compiler targeting 64 bits is OK, but a lot slower. Anyhow the test that it works is failing.
@skaller Can you have AppVeyor type build\host\fbuild.log
to print it (or wherever the path is for Felix because I can't remember)? Failing that I'll take a look at fbuild's appveyor and see if it shows anything of interest.
I'm not sure how to do that. However I can run on my Windows box. But I am guessing the problem is in the Python interface. Windows works JUST FINE with 8 bit data. Felix can do things in windows with popen emulation. Why the Python code tries to complex and irrelevant UTF-16 encode/decode crap i don't know.
I am guessing the "null" byte in the string is there because in fact the string is a bad attempt to create main's argv which is an array or null terminates strings.
OK so on my windows box:
checking cl.exe :
MainThread: starting "'C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\x86_amd64\\cl.exe' /nologo /Ox /IC:\\Users\\skaller\\AppData\\Local\\Temp\\tmp4cb2dw00 C:\\Users\\skaller\\AppData\\Local\\Temp\\tmp4cb2dw00\\temp.c"
+ 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' /nologo /Ox /IC:\Users\skaller\AppData\Local\Temp\tmp4cb2dw00 C:\Users\skaller\AppData\Local\Temp\tmp4cb2dw00\temp.c
- exit 3221225781, 2.06 sec
Now, I tried this manually:
C:\cygwin64\home\skaller\felix>"C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\x86_amd64\\cl.exe"
Microsoft (R) C/C++ Optimizing Compiler Version 19.00.23026 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
usage: cl [ option... ] filename... [ /link linkoption... ]
which shows cl.exe is there and it works fine.
The shown command with arguments does NOT work, the quoting is all wrong. If i fix the quoting i get "no such file or directory" for the C file which is expected, the temporary file is probably deleted automatically when the process ended.
So the problem is bad quoting.
So .. FYI in https://github.com/felix-lang/felix/blob/master/src/packages/program.fdoc you will find around line 1018 the rules for quoting on Windows for CMD.EXE. Roughly it is appears simple enough:
instance Shell_class[Win32, Win32Process::process_status_t]
{
fun quote_arg(s:string):string => '"' + s + '"';
fun quote_line_for_system(s:string) => '"' + s + '"';
fun quote_line_for_popen(s:string) => '"' + s + '"';
gen raw_get_stdout(x:string) = {
//eprintln("CMD.EXE: raw_get_stout of " + x);
var fout = Win32Process::popen_in(x);
if valid fout do
var output = load fout;
var result = Win32Process::pclose fout;
return Win32Process::int_of_process_status_t result, output;
else
println$ "Unable to run command '" + x "'";
return -1,"";
done
}
and the base type class says:
virtual fun quote_arg:string->string;
fun quote_args (s:list[string]) : string => catmap[string] ' ' quote_arg s;
In other words you just put "" around each argument, separate them by a space, and then quote the whole string again with "". If you have sloshes in the arguments DO NOT DOUBLE THEM UP. you do have to double up the sloshes in C or Felix string literals because the language lexer will replace them but not if you're constructing the strings.
Not sure if this helps. It took Shayne Fletcher and myself about 6 months working over Facebook with me on OSX and him on Windows to get this to actually work. It looks and is simple but it took a long time to figure out the rules. What you type at a CMD.EXE prompt is NOT the same as what you supply to system() or popen() the latter require the extra quote of the whole string. If you do not follow these rules, for example if you fail to quote the individual arguments, it won't work because if CMD.EXE called by system() or popen() sees a leading quote it throws it away. You need the arguments to remain quoted. It weird. But the code above is USED by Felix build system to call MSVC and it works.
BTW: on windows the problem is that programs do NOT get separate arguments as they do on Unix. On WIndows, which used to be MSDOS, programs get a single string and have to parse the string themselves.
What happens is in CMD.EXE, it finds the program name and calls it passing the WHOLE string to it. Now C programs requires main(argc, argv) so the startup code for C has the command line parsing it it. This ONLY applies to console programs! Similarly stdin/stdout are connected to the console in the startup code.
So when you call system() or popen() they start a CMD.EXE instance and pass to it the string argument WITH THE WRAPPER QUOTES REMOVED. So the string you use is not the same as when using the CMD.EXE prompt! The the C startup quote parses that to find the argv arguments, again stripping out the argument quotes if present. The tricky bit is that the whole string quote marks are removed, so if they're not there and the first argument is quoted .. the ARGUMENT quote is removed by mistake, and the end of argument quote then quotes the rest of the string by mistake and so it gets treated as a single argument!
Or something like that. Its a mess. The problem is Unix and C. What MSDOS did was actually more sensible. Actually if you glob commands to CMD.EXE it does nothing: the program has to decode the globs. In Unix shells the shell decodes the globs.
In Felix this is a right pain because I had to pick whether globs are unpacked or not because there is only ONE "system()" method and you really need two, not just for globs but other shell expansions. In Unix it depends on how you do the quoting whether the the shell expands the argument or not: double quoted arguments are expanded, single quoted ones aren't. Its a pain because of multiple layers of quote stripping. Its a pain in plain bash for the same reason. You have to calculate how many levels of (double up) quotes you need based on how many quote stripping passes you go through because you actually dispatch to the process you want (since the shell can contains, say, parens which spawn sub-shells ... ARGGGH) :)
I'm not sure how to do that. However I can run on my Windows box. But I am guessing the problem is in the Python interface. Windows works JUST FINE with 8 bit data. Felix can do things in windows with popen emulation. Why the Python code tries to complex and irrelevant UTF-16 encode/decode crap i don't know.
The Python code uses CreateProcessW (the wide-char) version. Not exactly sure why, but the original Mozilla killableprocess code was like that.
A quoting issue is actually odd. Python should properly quote it, and the quoting issue is probably just from printing it... The test tries compiling an actual C program. Could be an environment issue or something.
I. Really. Hate. Killableprocess.py.
The diagnostic when the compiler is invoked by hand says the file can't be found, because its a temporary already gone. In other words .. the command works fine when i use it manually with the correct quoting. So its not an environment issue uinless the process is started without inheriting the environment. The build system is run in a terminal with MSVC already set up.
The thing is .. the original code USED TO WORK. It just died somehow for no apparent reason .. I switched from VS2015 to VS 2017 in Appveyor but not on my local Windows box and it fails there the same way.
Ok so I fixed it.
Here's how: I DELETED fbuild's subprocess directory and changed the reference in context.py to just import the system subprocess.
Lo, it works. On OSX and on WIndows and hopefully Linux. The Appveyor build is running as I write.
It seems that the fbuild subprocess was added so that a broken version of Python would work. But it actually breaks the system on other versions of Python.