
Standardise Project Detection in Shell Scripts

Open tom-daubney-arm opened this issue 1 year ago • 1 comment

Description

Fixes #8524

This PR introduces a more robust way for shell scripts to establish which repository they are being called from, rather than relying on project directory structure as was the case before.
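The approach can be sketched roughly as follows. The file name scripts/project_name.txt, the function name in_mbedtls_repo and the project names "Mbed TLS" / "TF-PSA-Crypto" are taken from the discussion below; the function bodies are a simplified illustration, not the PR's actual code:

```shell
#!/bin/sh
# Simplified sketch of marker-file project detection (illustrative only).
# Assumes each repository carries a scripts/project_name.txt containing
# its name, e.g. "Mbed TLS" or "TF-PSA-Crypto".

PROJECT_NAME_FILE='./scripts/project_name.txt'

read_project_name () {
    if [ ! -f "$PROJECT_NAME_FILE" ]; then
        echo "$PROJECT_NAME_FILE does not exist. Exiting..." >&2
        exit 1
    fi
    # Read the single line of the marker file into PROJECT_NAME.
    read -r PROJECT_NAME < "$PROJECT_NAME_FILE"
}

in_mbedtls_repo () {
    read_project_name
    test "$PROJECT_NAME" = "Mbed TLS"
}
```

The key difference from the old scheme is that scripts test the marker file's contents rather than probing for well-known directories such as library/.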

PR checklist

Please tick as appropriate and edit the reasons (e.g.: "backport: not needed because this is a new feature")

  • [x] changelog not required - Not user visible
  • [ ] 3.6 backport ???
  • [x] 2.28 backport not required - Enhancement
  • [x] tests not required - Sufficient tests already in place

Notes for the submitter

Please refer to the contributing guidelines, especially the checklist for PR contributors.

tom-daubney-arm avatar Dec 05 '23 12:12 tom-daubney-arm

There's a merge conflict now unfortunately.

gilles-peskine-arm avatar Mar 14 '24 15:03 gilles-peskine-arm

For the reviewers: see also the comment in #9394.

ronald-cron-arm avatar Jul 23 '24 08:07 ronald-cron-arm

What's there is looking good to me, I like the approach and I'm happy with its execution.

I'm just wondering about the intended scope, as I can see two scripts adapted to the new way, but grep -- '-d.*library' **/*.sh finds other scripts that seem to test for "well-known" directories. I didn't check if they did that for the purpose of detecting Mbed TLS vs TF-PSA-Crypto, but I'd like to confirm the intended scope here so that I can check the PR covers everything it's supposed to.

(Sorry if this should be obvious from context, I'm a bit new to this.)

It makes sense to me to generalize, in a follow-up PR, the way of detecting the project introduced in this PR. As mentioned here, it would probably be beneficial to have a small .sh file containing the detection logic and have test scripts just include it.

ronald-cron-arm avatar Jul 23 '24 13:07 ronald-cron-arm

  • 3.6 backport ???

I'd think we want it: IIUC the point of this is to support scripts that are shared between the two projects, which means they should be moved to the framework repo eventually, which means 3.6 will take them from the framework as well.

So, we might not strictly need the script modifications in 3.6 now, as we can argue they will be picked up from the framework eventually, but it seems to me we'll need the scripts/project_name.txt file in 3.6. And it's probably better to add it at the same time as we add it in development, and while we're at it we might as well backport the script modifications too?

mpg avatar Jul 24 '24 08:07 mpg

Back to scope questions: while at it, shouldn't we put the function definitions in a separate file, say scripts/detect_project.sh and then each script would just source that, instead of having the definitions duplicated all over the place? Especially now that they're becoming less trivial with the added check.

Perhaps the definition of PROJECT_NAME_FILE will need to remain in each script though, as the path will be relative to the current working directory, which may not be the project's root for each script? Or do we have an official rule that scripts must be executed from the project's root?

mpg avatar Jul 29 '24 10:07 mpg

Back to scope questions: while at it, shouldn't we put the function definitions in a separate file, say scripts/detect_project.sh and then each script would just source that, instead of having the definitions duplicated all over the place? Especially now that they're becoming less trivial with the added check.

Perhaps the definition of PROJECT_NAME_FILE will need to remain in each script though, as the path will be relative to the current working directory, which may not be the project's root for each script? Or do we have an official rule that scripts must be executed from the project's root?

I agree and was going to suggest this, but am also wondering if that should be tracked with an issue and done in a follow up PR? Do you have an opinion here @ronald-cron-arm?

I don't know of an official rule that says we must run all scripts from the project root, but I can't think of a time when I have run a script from anywhere other than the root. Thus, I would say it is probably OK to include the definition of PROJECT_NAME_FILE in the script we source from, and then, in any exceptional cases, just redefine the variable after sourcing the script. What do you guys think?

tom-daubney-arm avatar Jul 29 '24 10:07 tom-daubney-arm
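The sourcing scheme being discussed might look like the following. The shared file name scripts/detect_project.sh is mpg's suggestion and the override-after-sourcing pattern is tom-daubney-arm's; the mktemp scaffolding just makes the sketch self-contained and is not part of the proposal:

```shell
#!/bin/sh
# Sketch of the "shared detection file + source it" scheme (illustrative).
set -e
demo_root=$(mktemp -d)
mkdir -p "$demo_root/scripts"

# The shared definitions that each test script would source:
cat > "$demo_root/scripts/detect_project.sh" <<'EOF'
PROJECT_NAME_FILE='./scripts/project_name.txt'
in_mbedtls_repo () {
    read -r PROJECT_NAME < "$PROJECT_NAME_FILE" &&
        test "$PROJECT_NAME" = "Mbed TLS"
}
EOF
printf 'Mbed TLS\n' > "$demo_root/scripts/project_name.txt"

# Normal case: run from the project root and just source the shared file.
cd "$demo_root"
. ./scripts/detect_project.sh
in_mbedtls_repo && echo "detected: $PROJECT_NAME"

# Exceptional case: a script whose working directory is not the root can
# redefine PROJECT_NAME_FILE after sourcing.
mkdir -p "$demo_root/tests"
cd "$demo_root/tests"
PROJECT_NAME_FILE='../scripts/project_name.txt'
in_mbedtls_repo && echo "still detected from tests/"
```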

Thanks for addressing my feedback.

I've been thinking a bit more about this "working directory" issue, and the way the functions are written, the constraint is not really that scripts need to be invoked from the project's root, it's that the working directory must be the project's root when the functions are invoked. This might cause issues when the script (temporarily) changes to another working directory, and then tries to call one of those functions.

I'm tempted to suggest we change strategy and do the check only once at startup:

PROJECT_NAME_FILE='./scripts/project_name.txt'
if read -r PROJECT_NAME; then :; else
    echo "$PROJECT_NAME_FILE does not exist... Exiting..." >&2
    exit 1
fi

in_mbedtls_repo () {
    test "$PROJECT_NAME" = "Mbed TLS"
}

Wdyt?

I agree and was going to suggest this, but am also wondering if that should be tracked with an issue and done in a follow up PR?

I'd be fine with that.

mpg avatar Jul 30 '24 08:07 mpg

I'm tempted to suggest we change strategy and do the check only once at startup:

PROJECT_NAME_FILE='./scripts/project_name.txt'
if read -r PROJECT_NAME; then :; else
    echo "$PROJECT_NAME_FILE does not exist... Exiting..." >&2
    exit 1
fi

in_mbedtls_repo () {
    test "$PROJECT_NAME" = "Mbed TLS"
}

I think that makes sense as a strategy. I am curious as to why that necessitates a change to the way the file is read etc. Is this approach more robust?

Also, do you mind explaining the above snippet a little? For example, it reads to me like the contents of the file are being read into the PROJECT_NAME variable, but I don't really understand how the read command knows to open PROJECT_NAME_FILE?

tom-daubney-arm avatar Jul 30 '24 09:07 tom-daubney-arm

I am curious as to why that necessitates a change to the way the file is read etc. Is this approach more robust?

I don't think it requires changing the way we read the file. But it allows changing it to rely less on external commands (such as grep) and more on built-ins (which read and test are), which I tend to do out of habit even though it really doesn't matter here.

(For a case where it matters for performance, see https://github.com/Mbed-TLS/mbedtls/pull/3615 if you're curious. But here I don't think performance should matter.)

Also, do you mind explaining the above snippet a little? For example it reads to me like the contents of the file is being read into the PROJECT_NAME variable but I am not really understanding how the read command knows to open the PROJECT_NAME_FILE?

That's an excellent question, it doesn't :) I meant to write read -r PROJECT_NAME < "$PROJECT_NAME_FILE", oops ^^

mpg avatar Jul 30 '24 10:07 mpg
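Putting mpg's correction together, the startup check would read as below. This is still mpg's sketch with the missing redirection added, not merged code; the mktemp scaffolding just gives the check a project_name.txt to find so the snippet runs stand-alone:

```shell
#!/bin/sh
# Self-contained demo scaffolding (not part of the sketch itself):
demo_root=$(mktemp -d)
mkdir -p "$demo_root/scripts"
printf 'Mbed TLS\n' > "$demo_root/scripts/project_name.txt"
cd "$demo_root"

# The corrected startup check: note the "< $PROJECT_NAME_FILE" redirection
# that was missing from the original snippet.
PROJECT_NAME_FILE='./scripts/project_name.txt'
if read -r PROJECT_NAME < "$PROJECT_NAME_FILE"; then :; else
    echo "$PROJECT_NAME_FILE does not exist... Exiting..." >&2
    exit 1
fi

# The function then only consults the variable, so it no longer cares
# what the working directory is at call time.
in_mbedtls_repo () {
    test "$PROJECT_NAME" = "Mbed TLS"
}

in_mbedtls_repo && echo "project: $PROJECT_NAME"
```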

I've been thinking a bit more about this "working directory" issue, and the way the functions are written, the constraint is not really that scripts need to be invoked from the project's root, it's that the working directory must be the project's root when the functions are invoked. This might cause issues when the script (temporarily) changes to another working directory, and then tries to call one of those functions.

Changing to another directory can simplify a script, but you have to be careful. If the script takes arguments that are file names, those file names should be interpreted relative to the current directory at the time the script was invoked. Consuming an argument later requires making the file name absolute before changing directories. Making file names absolute is not always a good thing though: if you write the file name somewhere, that locks the location down and then you can't move the build tree anymore.

rely less on external commands (such as grep) and more on built-ins (which read and test are), which I tend to do out of habit even though it really doesn't matter here.

As a rule of thumb, use built-ins when processing a small amount of data (e.g. here reading a single line), but external commands when processing a large amount of data at once (not inside a loop: the external command is the loop).

gilles-peskine-arm avatar Jul 30 '24 10:07 gilles-peskine-arm
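The point about consuming file-name arguments after a cd can be illustrated like this (a generic sketch, not code from the PR; the argument name and the /tmp destination are placeholders):

```shell
#!/bin/sh
# Sketch: make a file-name argument absolute *before* changing directory,
# so it still resolves after the cd.

output_file=${1:-report.txt}

# Resolve relative to the invocation directory now, while we are still there.
case "$output_file" in
    /*) : ;;                                # already absolute, leave it alone
    *)  output_file="$PWD/$output_file" ;;  # anchor it to the current directory
esac

cd /tmp    # e.g. the script moves into a build tree here

# The argument can still be consumed from the new working directory.
echo "would write results to: $output_file"
```

As the comment above notes, the trade-off is that the stored name is now pinned to an absolute location, which matters if it gets written somewhere persistent.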

Changing to another directory can simplify a script, but you have to be careful.

I don't disagree, what I'm saying is that changing to another directory is something our scripts currently do: egrep '\<cd\>' **/*.sh finds 47 matches. Some of them are clearly not a problem, but I haven't investigated if some of them are likely to end up having us call in_xxx_repo while in a directory that's not the project's root.

Now, reading again the PR, clearly the old tests assumed that the working directory was the project's root when the functions were invoked, so presumably it should be OK for the new tests to make the same assumption. I might have been getting ahead of myself, over-anticipating possible issues.

mpg avatar Jul 30 '24 10:07 mpg

@mpg I think we should be there now. PTAL.

tom-daubney-arm avatar Jul 30 '24 14:07 tom-daubney-arm

  • tests not required - Sufficient tests already in place

Changes to all.sh are somewhat tested by the PR job; I've checked that the number of never-executed tests is not up compared to development. But lcov.sh is not tested by the CI at all. Can you give it a run locally just to ensure we haven't completely broken it? (Just as a matter of principle; I don't think we've broken it.)

mpg avatar Jul 31 '24 08:07 mpg

Capturing coverage data from library
Found gcov version: 13.1.0
Using intermediate gcov format
Scanning library for .gcno files ...
Found 107 graph files in library
Processing lmots.gcno
/home/<user_name>/repos/main/standardise_proj_detection/library/lmots.gcno:version 'A94*', prefer 'B31*'
geninfo: ERROR: GCOV failed for /home/<user_name>/repos/main/standardise_proj_detection/library/lmots.gcno!

That is the output from lcov.sh. It has an error, but I don't think that has anything to do with the file checking. What do you think @mpg ?

tom-daubney-arm avatar Jul 31 '24 08:07 tom-daubney-arm

Looks unrelated indeed, probably a wrong version of something. Actually, we can test it on the CI using the release job (you need to be logged in to access that page) - just the basic_build_test.sh part, which calls make lcov, which in turn invokes lcov.sh. Can you start such a job (build with parameters -> adjust repo and branch) and add a link to it in the PR's description?

mpg avatar Jul 31 '24 08:07 mpg

@mpg, the CI tests seem to have run successfully now. Does this mean you are happy with this PR?

tom-daubney-arm avatar Jul 31 '24 10:07 tom-daubney-arm

I am. Let's see if @gabor-mezei-arm agrees now :)

mpg avatar Jul 31 '24 11:07 mpg

@mpg and @gabor-mezei-arm ; the 3.6 backport is live: #9437

tom-daubney-arm avatar Jul 31 '24 15:07 tom-daubney-arm

I agree and was going to suggest this, but am also wondering if that should be tracked with an issue and done in a follow up PR?

I'd be fine with that.

@mpg I have created issue #9452 to track this.

tom-daubney-arm avatar Aug 06 '24 14:08 tom-daubney-arm