simplecpp icon indicating copy to clipboard operation
simplecpp copied to clipboard

fix: always use absolute (and simplified) header path whenever possible

Open Tal500 opened this issue 1 year ago • 11 comments

This fix an issue of importing the same header by different addresses. The issue is trickier than seems, because if a user used the include dir flag -I... with an absolute path, then every matched header was detected to be absolute, but if the import was a relative import to the source file and the source file isn't with an absolute path, than the header is identified by the combined relative path.

See https://sourceforge.net/p/cppcheck/discussion/general/thread/c01f80556b/

Note - I tried to make the code compatible with Windows, but was tested only on linux.

Tal500 avatar Aug 06 '24 09:08 Tal500

This change might be problematic because it is obviously requires a test but so far there are no tests at all utilizing includes (see #261 for a related discussion).

This should probably be implemented by introducing Python tests similar to what Cppcheck is doing. @danmar would that be acceptable for tests which actually require (multiple) files?

firewave avatar Aug 06 '24 12:08 firewave

This change might be problematic because it is obviously requires a test but so far there are no tests at all utilizing includes (see #261 for a related discussion).

This should probably be implemented by introducing Python tests similar to what Cppcheck is doing. @danmar would that be acceptable for tests which actually require (multiple) files?

What about these (failed) tests? https://ci.appveyor.com/project/danmar/simplecpp/builds/50353854

Tal500 avatar Aug 06 '24 13:08 Tal500

This should probably be implemented by introducing Python tests similar to what Cppcheck is doing. @danmar would that be acceptable for tests which actually require (multiple) files?

sure, it would definitely make sense to include actual files in some python tests but as @Tal500 pointed out there are some tests for the handling at least.

What about these (failed) tests?

yes imho those are valuable also.

danmar avatar Aug 12 '24 07:08 danmar

@danmar if I guess correctly, the failure is due to that the header files are not really in the working directory or so, right?

Tal500 avatar Aug 14 '24 08:08 Tal500

if I guess correctly, the failure is due to that the header files are not really in the working directory or so, right?

yes that sounds probable to me.

I don't know how your changes will affect the output from simplecpp if all filepaths will always be absolute that is also unfortunate. Is the output changed?

danmar avatar Aug 16 '24 13:08 danmar

I don't know how your changes will affect the output from simplecpp if all filepaths will always be absolute that is also unfortunate. Is the output changed?

Yes, except for the system headers that were not found in the scan paths, e.g. STL headers (that simplecpp doesn't open them at all and it's not consider to be an error).

We could have a more complicated solution, that uses two header attributes:

  1. Name: The (first) included path, perhaps normalized
  2. Path: The absolute path that it is loaded from

The first attribute is mandatory, and the second one is optional and exists only when the header is found. That way, the error messages could still be looking the same as for today, but still correct.

An alternative solution is to have a user configuration that toggles the include path absoluteness.

Tal500 avatar Aug 18 '24 09:08 Tal500

We could have a more complicated solution, that uses two header attributes: @danmar what do you think about this solution? Do you have a better idea?

Tal500 avatar Sep 04 '24 14:09 Tal500

We could have a more complicated solution, that uses two header attributes: @danmar what do you think about this solution

I am very sorry for the delay. I was at cppcon last week and had a presentation.

As far as I understand the solution you suggest makes sense to me. We don't need to have these properties for each token right.

danmar avatar Sep 28 '24 09:09 danmar

@Tal500 just making sure you see my comment.

danmar avatar Sep 28 '24 09:09 danmar

I am very sorry for the delay. I was at cppcon last week and had a presentation.

I'd like too see it when it will be published!

As far as I understand the solution you suggest makes sense to me. We don't need to have these properties for each token right.

I'll look into that more

Tal500 avatar Sep 29 '24 10:09 Tal500

I am very sorry for the delay. I was at cppcon last week and had a presentation.

I'd like too see it when it will be published!

:+1:

for your information here is a short description: https://cppcon2024.sched.com/event/1gZdy/building-cppcheck-what-we-learned-from-17-years-of-development

I guess it will take a while until it is published.

danmar avatar Sep 29 '24 12:09 danmar

We could have a more complicated solution, that uses two header attributes: @danmar what do you think about this solution

I am very sorry for the delay. I was at cppcon last week and had a presentation.

As far as I understand the solution you suggest makes sense to me. We don't need to have these properties for each token right.

I went on a different solution - just be conservative and check whether the relative version of the path or the absolute version of the path was already loaded (in case that the header file was found).

If the header is included as a system header, the default expansion is to absolute path. If the header is included as a relative manner (by "") - it just simplifies the path - and do not change its path absoluteness (relative paths keep being relative, absolute paths keep being absolute).

I think that this is the sane solution for this problem. I run the tests on Linux for both simplecpp and cppcheck, and they all passed eventually.

Tal500 avatar Nov 17 '24 10:11 Tal500

I think that this is the sane solution for this problem.

ok I didn't fully dig into your solution but I will trust you for now.

I think a pytest test would be a good idea. How is your pytest skills? I can provide a resource that will write the test if it feels complex for you.

danmar avatar Nov 20 '24 16:11 danmar

I think that this is the sane solution for this problem.

ok I didn't fully dig into your solution but I will trust you for now.

I think a pytest test would be a good idea. How is your pytest skills? I can provide a resource that will write the test if it feels complex for you.

I know python, I think I can learn pytest a little bit if needed (where is it used and in which repo?). Why do you suggest pytest? Don't you rather prefer to add a simplecpp unit test?

In addition, what is your plan for a unittest? I suggest to have a unittest that includes both realtive path and curDir() + "/" + realative_path, and see if it is included twice. Do you have any security concerns about using/leaking the curDir() of the user machine in the test cases?

Tal500 avatar Nov 20 '24 18:11 Tal500

Why do you suggest pytest? Don't you rather prefer to add a simplecpp unit test?

If you want to create a test that creates physical files in different locations and checks how simplecpp behaves with different -I flags then I think a pytest is more flexible and more proper.

I know python, I think I can learn pytest a little bit if needed (where is it used and in which repo?).

Good. We use pytest in Cppcheck. See this for instance: https://github.com/danmar/cppcheck/blob/main/test/cli/other_test.py

There are several tests in that file that creates testfiles using tmpdir..

In addition, what is your plan for a unittest? I suggest to have a unittest that includes both realtive path and curDir() + "/" + realative_path, and see if it is included twice. Do you have any security concerns about using/leaking the curDir() of the user machine in the test cases?

more or less if you read from the actual filesystem I think it's not a unit test. It's quite common that people wants to be able to compile the test binary in a arbitrary path and run it there and if it depends on physical files then that will not work.

danmar avatar Nov 21 '24 15:11 danmar

Why do you suggest pytest? Don't you rather prefer to add a simplecpp unit test?

If you want to create a test that creates physical files in different locations and checks how simplecpp behaves with different -I flags then I think a pytest is more flexible and more proper.

I know python, I think I can learn pytest a little bit if needed (where is it used and in which repo?).

Good. We use pytest in Cppcheck. See this for instance: https://github.com/danmar/cppcheck/blob/main/test/cli/other_test.py

There are several tests in that file that creates testfiles using tmpdir..

In addition, what is your plan for a unittest? I suggest to have a unittest that includes both realtive path and curDir() + "/" + realative_path, and see if it is included twice. Do you have any security concerns about using/leaking the curDir() of the user machine in the test cases?

more or less if you read from the actual filesystem I think it's not a unit test. It's quite common that people wants to be able to compile the test binary in a arbitrary path and run it there and if it depends on physical files then that will not work.

Tested locally, here are my added tests (with helper functions), append them to https://github.com/danmar/cppcheck/blob/main/test/cli/other_test.py to see the results:

def __test_relative_header_create_header(dir, with_pragma_once=True):
    header_file = os.path.join(dir, 'test.h')
    with open(header_file, 'wt') as f:
        f.write(f"""
                {"#pragma once" if with_pragma_once else ""}
                #ifndef TEST_H_INCLUDED
                #define TEST_H_INCLUDED
                #else
                #error header_was_already_included
                #endif
                """)
    return header_file, "error: #error header_was_already_included"

def __test_relative_header_format_include_path_arg(include_path):
    return f"-I{json.dumps(str(include_path))[1:-1]}"

def __test_relative_header_format_include(include, is_sys_header=False):
    if is_sys_header:
        return f"<{json.dumps(include)[1:-1]}>"
    else:
        return json.dumps(include)

def __test_relative_header_create_source(dir, include1, include2, is_include1_sys=False, is_include2_sys=False):
    src_file = os.path.join(dir, 'test.c')
    with open(src_file, 'wt') as f:
        f.write(f"""
                #undef TEST_H_INCLUDED
                #include {__test_relative_header_format_include(include1, is_include1_sys)}
                #include {__test_relative_header_format_include(include2, is_include2_sys)}
                """)
    return src_file


def test_relative_header_0_rel(tmpdir):
    _, double_include_error = __test_relative_header_create_header(tmpdir, with_pragma_once=False)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h")

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert double_include_error in stderr

def test_relative_header_0_sys(tmpdir):
    _, double_include_error = __test_relative_header_create_header(tmpdir, with_pragma_once=False)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h", is_include1_sys=True, is_include2_sys=True)

    args = [__test_relative_header_format_include_path_arg(tmpdir), test_file]

    _, _, stderr = cppcheck(args)
    assert double_include_error in stderr

def test_relative_header_1_rel(tmpdir):
    __test_relative_header_create_header(tmpdir)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h")

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''

def test_relative_header_1_sys(tmpdir):
    __test_relative_header_create_header(tmpdir)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h", is_include1_sys=True, is_include2_sys=True)

    args = [__test_relative_header_format_include_path_arg(tmpdir), test_file]

    _, _, stderr = cppcheck(args)
    assert stderr == ''

## TODO: the following tests should pass after applying simplecpp#362

def test_relative_header_2(tmpdir):
    header_file, _ = __test_relative_header_create_header(tmpdir)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", header_file)

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''

def test_relative_header_3(tmpdir):
    test_subdir = os.path.join(tmpdir, "test_subdir")
    os.mkdir(test_subdir)
    header_file, _ = __test_relative_header_create_header(test_subdir)

    test_file = __test_relative_header_create_source(tmpdir, "test_subdir/test.h", header_file)

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''

def test_relative_header_3_inv(tmpdir):
    test_subdir = os.path.join(tmpdir, "test_subdir")
    os.mkdir(test_subdir)
    header_file, _ = __test_relative_header_create_header(test_subdir)

    test_file = __test_relative_header_create_source(tmpdir, header_file, "test_subdir/test.h")

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''


def test_relative_header_4(tmpdir):
    test_subdir = os.path.join(tmpdir, "test_subdir")
    os.mkdir(test_subdir)
    header_file, _ = __test_relative_header_create_header(test_subdir)

    test_file = __test_relative_header_create_source(tmpdir, header_file, "test2.h", is_include2_sys=True)

    args = [__test_relative_header_format_include_path_arg(test_subdir), test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''



def test_relative_header_5(tmpdir):
    test_subdir = os.path.join(tmpdir, "test_subdir")
    os.mkdir(test_subdir)
    __test_relative_header_create_header(test_subdir)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test_subdir/test.h", is_include1_sys=True)

    args = [__test_relative_header_format_include_path_arg(test_subdir), test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''


Tal500 avatar Dec 01 '24 18:12 Tal500

I prefer that you add some pytest test file to simplecpp repo. We want to test this directly here.

I don't have strong opinion. But you could add integration_test.py or simplecpp_test.py maybe in the root folder. The test would use the simplecpp binary rather than cppcheck.

def __test_relative_header_format_include_path_arg(include_path): return f"-I{json.dumps(str(include_path))[1:-1]}"

I don't understand this. The -I does not expect json.

danmar avatar Dec 05 '24 14:12 danmar

I prefer that you add some pytest test file to simplecpp repo. We want to test this directly here.

Will do that, I need also to copy some verification of infra functions that I had in cppcheck pytest files.

def __test_relative_header_format_include_path_arg(include_path): return f"-I{json.dumps(str(include_path))[1:-1]}"

I don't understand this. The -I does not expect json.

It's a common trick to extract escaped string to a string in many language to use JSON serialization for a string, and in Python it is just json.dumps(s). It's equivalent to the C++ std::quoted(). Notice that JSON serialization is more general than just quoting a string (i.e. also numbers booleans arrays and dictionaries), but this is the natural function for that. Notice also that I'm also trimming to the Python range [1:-1] in order to eliminate the trailing " from the begin and the end of the quoted string.

BTW: I'm not sure that I should have escaped the strings with the "-I..." flag, since this argument is passed directly to the process arguments, and I'm not sure if the string escapes characters (i.e. /) are getting processed.

Tal500 avatar Dec 05 '24 15:12 Tal500

ok I was also confused by the {} but oops now I understand those are because it's a format string. I spontanously don't think you'll need to escape but feel free to try..

danmar avatar Dec 06 '24 15:12 danmar

I prefer that you add some pytest test file to simplecpp repo. We want to test this directly here.

I don't have strong opinion. But you could add integration_test.py or simplecpp_test.py maybe in the root folder. The test would use the simplecpp binary rather than cppcheck.

Done. Note the needed gitignore that for some reason you didn't notice

Tal500 avatar Dec 10 '24 16:12 Tal500

Thanks.. test looks good imho. I have two more small comments:

  • I would like that there is an issue for this.
  • please run the test in a github action

danmar avatar Dec 12 '24 10:12 danmar

Thanks.. test looks good imho. I have two more small comments:

* I would like that there is an issue for this.

* please run the test in a github action

Done

Tal500 avatar Dec 15 '24 17:12 Tal500

sorry for delayed review. please refactor testing.

Done

Tal500 avatar Dec 22 '24 10:12 Tal500

@Tal500 Thank you!

danmar avatar Jan 01 '25 21:01 danmar