fix: always use absolute (and simplified) header path whenever possible
This fix an issue of importing the same header by different addresses.
The issue is trickier than seems, because if a user used the include dir flag -I... with an absolute path, then every matched header was detected to be absolute, but if the import was a relative import to the source file and the source file isn't with an absolute path, than the header is identified by the combined relative path.
See https://sourceforge.net/p/cppcheck/discussion/general/thread/c01f80556b/
Note - I tried to make the code compatible with Windows, but was tested only on linux.
This change might be problematic because it is obviously requires a test but so far there are no tests at all utilizing includes (see #261 for a related discussion).
This should probably be implemented by introducing Python tests similar to what Cppcheck is doing. @danmar would that be acceptable for tests which actually require (multiple) files?
This change might be problematic because it is obviously requires a test but so far there are no tests at all utilizing includes (see #261 for a related discussion).
This should probably be implemented by introducing Python tests similar to what Cppcheck is doing. @danmar would that be acceptable for tests which actually require (multiple) files?
What about these (failed) tests? https://ci.appveyor.com/project/danmar/simplecpp/builds/50353854
This should probably be implemented by introducing Python tests similar to what Cppcheck is doing. @danmar would that be acceptable for tests which actually require (multiple) files?
sure, it would definitely make sense to include actual files in some python tests but as @Tal500 pointed out there are some tests for the handling at least.
What about these (failed) tests?
yes imho those are valuable also.
@danmar if I guess correctly, the failure is due to that the header files are not really in the working directory or so, right?
if I guess correctly, the failure is due to that the header files are not really in the working directory or so, right?
yes that sounds probable to me.
I don't know how your changes will affect the output from simplecpp if all filepaths will always be absolute that is also unfortunate. Is the output changed?
I don't know how your changes will affect the output from simplecpp if all filepaths will always be absolute that is also unfortunate. Is the output changed?
Yes, except for the system headers that were not found in the scan paths, e.g. STL headers (that simplecpp doesn't open them at all and it's not consider to be an error).
We could have a more complicated solution, that uses two header attributes:
- Name: The (first) included path, perhaps normalized
- Path: The absolute path that it is loaded from
The first attribute is mandatory, and the second one is optional and exists only when the header is found. That way, the error messages could still be looking the same as for today, but still correct.
An alternative solution is to have a user configuration that toggles the include path absoluteness.
We could have a more complicated solution, that uses two header attributes: @danmar what do you think about this solution? Do you have a better idea?
We could have a more complicated solution, that uses two header attributes: @danmar what do you think about this solution
I am very sorry for the delay. I was at cppcon last week and had a presentation.
As far as I understand the solution you suggest makes sense to me. We don't need to have these properties for each token right.
@Tal500 just making sure you see my comment.
I am very sorry for the delay. I was at cppcon last week and had a presentation.
I'd like too see it when it will be published!
As far as I understand the solution you suggest makes sense to me. We don't need to have these properties for each token right.
I'll look into that more
I am very sorry for the delay. I was at cppcon last week and had a presentation.
I'd like too see it when it will be published!
:+1:
for your information here is a short description: https://cppcon2024.sched.com/event/1gZdy/building-cppcheck-what-we-learned-from-17-years-of-development
I guess it will take a while until it is published.
We could have a more complicated solution, that uses two header attributes: @danmar what do you think about this solution
I am very sorry for the delay. I was at cppcon last week and had a presentation.
As far as I understand the solution you suggest makes sense to me. We don't need to have these properties for each token right.
I went on a different solution - just be conservative and check whether the relative version of the path or the absolute version of the path was already loaded (in case that the header file was found).
If the header is included as a system header, the default expansion is to absolute path. If the header is included as a relative manner (by "") - it just simplifies the path - and do not change its path absoluteness (relative paths keep being relative, absolute paths keep being absolute).
I think that this is the sane solution for this problem.
I run the tests on Linux for both simplecpp and cppcheck, and they all passed eventually.
I think that this is the sane solution for this problem.
ok I didn't fully dig into your solution but I will trust you for now.
I think a pytest test would be a good idea. How is your pytest skills? I can provide a resource that will write the test if it feels complex for you.
I think that this is the sane solution for this problem.
ok I didn't fully dig into your solution but I will trust you for now.
I think a pytest test would be a good idea. How is your pytest skills? I can provide a resource that will write the test if it feels complex for you.
I know python, I think I can learn pytest a little bit if needed (where is it used and in which repo?). Why do you suggest pytest? Don't you rather prefer to add a simplecpp unit test?
In addition, what is your plan for a unittest? I suggest to have a unittest that includes both realtive path and curDir() + "/" + realative_path, and see if it is included twice. Do you have any security concerns about using/leaking the curDir() of the user machine in the test cases?
Why do you suggest pytest? Don't you rather prefer to add a simplecpp unit test?
If you want to create a test that creates physical files in different locations and checks how simplecpp behaves with different -I flags then I think a pytest is more flexible and more proper.
I know python, I think I can learn pytest a little bit if needed (where is it used and in which repo?).
Good. We use pytest in Cppcheck. See this for instance: https://github.com/danmar/cppcheck/blob/main/test/cli/other_test.py
There are several tests in that file that creates testfiles using tmpdir..
In addition, what is your plan for a unittest? I suggest to have a unittest that includes both realtive path and curDir() + "/" + realative_path, and see if it is included twice. Do you have any security concerns about using/leaking the curDir() of the user machine in the test cases?
more or less if you read from the actual filesystem I think it's not a unit test. It's quite common that people wants to be able to compile the test binary in a arbitrary path and run it there and if it depends on physical files then that will not work.
Why do you suggest pytest? Don't you rather prefer to add a simplecpp unit test?
If you want to create a test that creates physical files in different locations and checks how simplecpp behaves with different -I flags then I think a pytest is more flexible and more proper.
I know python, I think I can learn pytest a little bit if needed (where is it used and in which repo?).
Good. We use pytest in Cppcheck. See this for instance: https://github.com/danmar/cppcheck/blob/main/test/cli/other_test.py
There are several tests in that file that creates testfiles using tmpdir..
In addition, what is your plan for a unittest? I suggest to have a unittest that includes both realtive path and curDir() + "/" + realative_path, and see if it is included twice. Do you have any security concerns about using/leaking the curDir() of the user machine in the test cases?
more or less if you read from the actual filesystem I think it's not a unit test. It's quite common that people wants to be able to compile the test binary in a arbitrary path and run it there and if it depends on physical files then that will not work.
Tested locally, here are my added tests (with helper functions), append them to https://github.com/danmar/cppcheck/blob/main/test/cli/other_test.py to see the results:
def __test_relative_header_create_header(dir, with_pragma_once=True):
header_file = os.path.join(dir, 'test.h')
with open(header_file, 'wt') as f:
f.write(f"""
{"#pragma once" if with_pragma_once else ""}
#ifndef TEST_H_INCLUDED
#define TEST_H_INCLUDED
#else
#error header_was_already_included
#endif
""")
return header_file, "error: #error header_was_already_included"
def __test_relative_header_format_include_path_arg(include_path):
return f"-I{json.dumps(str(include_path))[1:-1]}"
def __test_relative_header_format_include(include, is_sys_header=False):
if is_sys_header:
return f"<{json.dumps(include)[1:-1]}>"
else:
return json.dumps(include)
def __test_relative_header_create_source(dir, include1, include2, is_include1_sys=False, is_include2_sys=False):
src_file = os.path.join(dir, 'test.c')
with open(src_file, 'wt') as f:
f.write(f"""
#undef TEST_H_INCLUDED
#include {__test_relative_header_format_include(include1, is_include1_sys)}
#include {__test_relative_header_format_include(include2, is_include2_sys)}
""")
return src_file
def test_relative_header_0_rel(tmpdir):
_, double_include_error = __test_relative_header_create_header(tmpdir, with_pragma_once=False)
test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h")
args = [test_file]
_, _, stderr = cppcheck(args, cwd=tmpdir)
assert double_include_error in stderr
def test_relative_header_0_sys(tmpdir):
_, double_include_error = __test_relative_header_create_header(tmpdir, with_pragma_once=False)
test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h", is_include1_sys=True, is_include2_sys=True)
args = [__test_relative_header_format_include_path_arg(tmpdir), test_file]
_, _, stderr = cppcheck(args)
assert double_include_error in stderr
def test_relative_header_1_rel(tmpdir):
__test_relative_header_create_header(tmpdir)
test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h")
args = [test_file]
_, _, stderr = cppcheck(args, cwd=tmpdir)
assert stderr == ''
def test_relative_header_1_sys(tmpdir):
__test_relative_header_create_header(tmpdir)
test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h", is_include1_sys=True, is_include2_sys=True)
args = [__test_relative_header_format_include_path_arg(tmpdir), test_file]
_, _, stderr = cppcheck(args)
assert stderr == ''
## TODO: the following tests should pass after applying simplecpp#362
def test_relative_header_2(tmpdir):
header_file, _ = __test_relative_header_create_header(tmpdir)
test_file = __test_relative_header_create_source(tmpdir, "test.h", header_file)
args = [test_file]
_, _, stderr = cppcheck(args, cwd=tmpdir)
assert stderr == ''
def test_relative_header_3(tmpdir):
test_subdir = os.path.join(tmpdir, "test_subdir")
os.mkdir(test_subdir)
header_file, _ = __test_relative_header_create_header(test_subdir)
test_file = __test_relative_header_create_source(tmpdir, "test_subdir/test.h", header_file)
args = [test_file]
_, _, stderr = cppcheck(args, cwd=tmpdir)
assert stderr == ''
def test_relative_header_3_inv(tmpdir):
test_subdir = os.path.join(tmpdir, "test_subdir")
os.mkdir(test_subdir)
header_file, _ = __test_relative_header_create_header(test_subdir)
test_file = __test_relative_header_create_source(tmpdir, header_file, "test_subdir/test.h")
args = [test_file]
_, _, stderr = cppcheck(args, cwd=tmpdir)
assert stderr == ''
def test_relative_header_4(tmpdir):
test_subdir = os.path.join(tmpdir, "test_subdir")
os.mkdir(test_subdir)
header_file, _ = __test_relative_header_create_header(test_subdir)
test_file = __test_relative_header_create_source(tmpdir, header_file, "test2.h", is_include2_sys=True)
args = [__test_relative_header_format_include_path_arg(test_subdir), test_file]
_, _, stderr = cppcheck(args, cwd=tmpdir)
assert stderr == ''
def test_relative_header_5(tmpdir):
test_subdir = os.path.join(tmpdir, "test_subdir")
os.mkdir(test_subdir)
__test_relative_header_create_header(test_subdir)
test_file = __test_relative_header_create_source(tmpdir, "test.h", "test_subdir/test.h", is_include1_sys=True)
args = [__test_relative_header_format_include_path_arg(test_subdir), test_file]
_, _, stderr = cppcheck(args, cwd=tmpdir)
assert stderr == ''
I prefer that you add some pytest test file to simplecpp repo. We want to test this directly here.
I don't have strong opinion. But you could add integration_test.py or simplecpp_test.py maybe in the root folder. The test would use the simplecpp binary rather than cppcheck.
def __test_relative_header_format_include_path_arg(include_path): return f"-I{json.dumps(str(include_path))[1:-1]}"
I don't understand this. The -I does not expect json.
I prefer that you add some pytest test file to simplecpp repo. We want to test this directly here.
Will do that, I need also to copy some verification of infra functions that I had in cppcheck pytest files.
def __test_relative_header_format_include_path_arg(include_path): return f"-I{json.dumps(str(include_path))[1:-1]}"
I don't understand this. The -I does not expect json.
It's a common trick to extract escaped string to a string in many language to use JSON serialization for a string, and in Python it is just json.dumps(s). It's equivalent to the C++ std::quoted(). Notice that JSON serialization is more general than just quoting a string (i.e. also numbers booleans arrays and dictionaries), but this is the natural function for that. Notice also that I'm also trimming to the Python range [1:-1] in order to eliminate the trailing " from the begin and the end of the quoted string.
BTW: I'm not sure that I should have escaped the strings with the "-I..." flag, since this argument is passed directly to the process arguments, and I'm not sure if the string escapes characters (i.e. /) are getting processed.
ok I was also confused by the {} but oops now I understand those are because it's a format string. I spontanously don't think you'll need to escape but feel free to try..
I prefer that you add some pytest test file to simplecpp repo. We want to test this directly here.
I don't have strong opinion. But you could add
integration_test.pyorsimplecpp_test.pymaybe in the root folder. The test would use the simplecpp binary rather than cppcheck.
Done. Note the needed gitignore that for some reason you didn't notice
Thanks.. test looks good imho. I have two more small comments:
- I would like that there is an issue for this.
- please run the test in a github action
Thanks.. test looks good imho. I have two more small comments:
* I would like that there is an issue for this. * please run the test in a github action
Done
sorry for delayed review. please refactor testing.
Done
@Tal500 Thank you!