ungoogled-chromium Do not perform domain substitutions within comments

Covers C/C++, Java, Javascript and Python file extensions

This change will prevent domain substitutions to happen within comments or comments blocks, thus reducing the amount of unnecessary file changes when applying domain substitution.

This is what I had in mind for issue #849.

May 05 '20 07:05 csagan5

I think there's potential for additional processing in the future.

As a general rule I avoid adding code for the "just in case" situation, but if you have some use-cases to leverage already then sure - why not.

For starters, the regex for the non-(Python and GN) files don't apply to all the selected file types.

What do you mean here? I am genuinely curious about some bug I might not have spotted.

Since domain substitution is an essential component, I'd rather play it safe and implement AST parsing.

That would indeed be a very advanced and complete solution.

May 31 '20 17:05 csagan5

As a general rule I avoid adding code for the "just in case" situation, but if you have some use-cases to leverage already then sure - why not.

I think the AndroidManifest schema problem is another such example. Also, anything in DOMAIN_EXCLUDE_PREFIXES could have a smarter exclusion rule associated with them.

What do you mean here? I am genuinely curious about some bug I might not have spotted.

Actually, I was mistaken about this. I didn't realize all those languages supported both of those commenting types.

However, I also realized your code will break if the comment delimiter appears in strings. I don't know if any substitutions could be affected by this, though.

May 31 '20 22:05 Eloston

However, I also realized your code will break if the comment delimiter appears in strings. I don't know if any substitutions could be affected by this, though.

Yes, I am aware, I decided to go with this risk. I have inspected the produced diff (now much smaller thanks to the code in this PR) a few times (even if not thoroughly) and it seemed correct.

May 31 '20 23:05 csagan5

What issue is tried to be solved by not performing domain substitution on comments? If it's the timestamps, wasn't that taken care of in https://github.com/Eloston/ungoogled-chromium/commit/1a0e163a1130c7b47b8a98fc387628b077320a49?

Sep 19 '20 22:09 Luka-Filipovic

@Luka-Filipovic I create git commits after performing the domain substitutions and there are several MBs of unwanted changes; I also think that changing the timestamp alone won't prevent the file from being rebuilt if one if its comments changed.

Sep 19 '20 22:09 csagan5

@Luka-Filipovic I create git commits after performing the domain substitutions and there are several MBs of unwanted changes;

What percentage of files from lists have links only in comments? Not doing domsub in test files could also help with your problem, and it can be made only by parsing the file name.

When comments and test files are cut out, what could be the next step in improving this process?

I also think that changing the timestamp alone won't prevent the file from being rebuilt if one if its comments changed.

If that's the case (I didn't test it myself yet), would https://github.com/Eloston/ungoogled-chromium/issues/1185 help the problem?

Sep 19 '20 22:09 Luka-Filipovic

Not doing domsub in test files could also help with your problem, and it can be made only by parsing the file name.

I already covered this by excluding all test files/directories I could identify.

What percentage of files from lists have links only in comments?

Can't say, but if you make a git diff you will not be able to easily identify the actual code changes because the comment changes drown them out.

If that's the case (I didn't test it myself yet), would #1185 help the problem?

ninja uses timestamps like make, there's an open ninja issue about using file characteristics.

The problem is when you build Chromium and then run domain substitution: setting all files to a precedent timestamp to trick ninja into not rebuilding those files is wrong, as you cannot tell if only comments or code were changed. Thus what will happen is that there will be no advantage in building Chromium and then your modified version with patches and domain substitution, as most files will be rebuilt because of the changes. ccache mitigates this to some extent, but it still leaves room for improvement; with the patch I proposed here I made this process of re-using unchanged object files more efficient.

As for #1185 I actually think it's better to apply domain substitution after the other patches, so that they can be used individually without dependencies on the big domain substitution patch (which I treat as a patch, not different from any other).

Sep 20 '20 09:09 csagan5

The problem is when you build Chromium and then run domain substitution: setting all files to a precedent timestamp to trick ninja into not rebuilding those files is wrong, as you cannot tell if only comments or code were changed. Thus what will happen is that there will be no advantage in building Chromium and then your modified version with patches and domain substitution, as most files will be rebuilt because of the changes. ccache mitigates this to some extent, but it still leaves room for improvement; with the patch I proposed here I made this process of re-using unchanged object files more efficient.

That's interesting. What use-case do you have where you'd want to build Chromium once first, apply domain substitution, then re-run the build? I envisioned only these two workflows for domain substitution:

Full build: Domain substitution, then build.
Incremental build:
1. Apply domain substitution
2. Build
3. Revert domain substitution
4. Make code changes
5. Repeat steps 1-4 until the the desired build is created.

Of these two workflows, the timestamp manipulation logic is useful only for the incremental build workflow.

Sep 27 '20 22:09 Eloston

That's interesting. What use-case do you have where you'd want to build Chromium once first, apply domain substitution, then re-run the build?

The Bromite build process is a bit like that:

build the vanilla Chromium (only some small patches applied, and the GN args of course): https://www.bromite.org/chromium
then build Bromite with all the other patches, including the domain substitution

The Chromium builds are important because before a bug is submitted for Bromite it needs to be tested against vanilla Chromium first (yes, we often incur in upstream bugs which become visible downstream because of specific feature/GN flags combinations).

In general you want to make the best out of ninja, C++, Java own invalidation systems, and ccache's one; the changes in the comments paddle against all of them as at least the parser must run to find out that nothing changed.

Sep 28 '20 17:09 csagan5

ungoogled-chromium ungoogled-chromium copied to clipboard

Do not perform domain substitutions within comments

ungoogled-chromium
ungoogled-chromium copied to clipboard