openhab-core [Semantics] Refactor generated/sorted files to go under `target/generated-sources`

Forked Issue from Original Thread

Probably they should be. I think that is so for other cases where code-gen is used. I believe the convention is to save such files in the target/generated-sources directory rather than in src/main, the pom is adapted so it can pull in those file dependencies. So I think you would have the original SemanticTags.csv under src/main and the generated/sorted version of it under target/generated-sources.

This is best kept as a separate issue.

Originally posted by @jimtng in https://github.com/openhab/openhab-core/issues/4700#issuecomment-2780710905

Background

This issue is for discussion and feedback from OH Core Maintainers about how to maintain generated source files.

The OH Core semantics module depends on a CSV file called SemanticTags.csv which is the basis for the contents of all semantic tags. The module also contains a groovy script which parses this CSV file and does the following:

Sorts the CSV file semantically by tag type, parent and name. The purpose is that additions can be made in any random place within the file, and the groovy script ensures a once more logical overview
Generates the Java source code files DefaultSemanticTagProvider.java, DefaultSemanticTags.java
Generates the tags.properties file for i18n translations
Generates the thing-description-1.0.0.xsd xml schema validation file

Currently when the groovy script is run it stores the above-mentioned files in the following locations..

CSV file is sorted and overwrites itself
Java files under src/main/java
Properties under src/main/java/resources
XSD file in o.o.c.thing/schema

Maven is already configured to run the groovy script during the code generation phase, and the new versions of the files are stored and used in subsequent phases of the Maven build. However the groovy script can also be run manually locally.

There are a number of problems here..

It is not clear if the newly overwritten files are available on GitHub (or if their prior versions are there). In other words it is not clear if it is necessary to run an offline build and open a new PR to force the new versions of the files onto GitHub.
Therefore, if the script is run in CI Build it might produce different results than if someone downloads from GitHub and builds locally
As a general rule such generated files should probably be saved under target/generated-sources rather than src/main/java

So the question to OH maintainers is if we should

Keep the current architecture; possible disadvantages are a) that it requires a new PR each time to run the groovy script locally and push the modified files, and b) if only the SemanticTags.csv is changed, then CI builds might produce different results than expected.
Change the architecture so that only the SemanticTags.csv needs to be changed, and all the other code generation is automatic; possible disadvantage is that generated files will not be explicitly visible locally unless a local maven build is done.

Question

Question to OH Core maintainers: => WDYT?

Apr 05 '25 13:04 andrewfg

If this were to be done, SemanticTags.csv should remain as is, i.e. don't keep two versions of it (one unsorted, one sorted). We want to just keep it sorted and think of the sorting as a form of "spotless" task.

Apr 05 '25 13:04 jimtng

@jimtng I modified the top post on this thread to explain the background for OH Core Maintainers

Apr 05 '25 14:04 andrewfg

The simplest thing to do is to just include running the groovy script in the normal build process so it happens during user build too. There are warnings within the generated source code so people would know not to edit them manually.

It is currently not done to save from running it unnecessarily when SemanticTags.csv isn't modified. Imagine you're wanting to build the whole oh core, working on unrelated parts. Not running it would save a few extra milliseconds of the overall build time.

What I would suggest is this:

Keep the current workflow as is and not add more target/generated-sources. IMO this type of stuff makes things more complicated and should be avoided if possible.
Add a comment inside SemanticTags.csv:

# Run `mvn -Pgenerate-tag-classes install` after modifying this file, and commit the generated files

Make the groovy script ignore comments This is easy enough and I can do a quick PR to do it.

The only caveat: Github currently doesn't support comments in csv, so we need to cheat a bit:

The benefit of the current workflow is that we can see the resulting changes to SemanticTags.csv directly on the generated files. It is simple enough to follow, unlike the xtend stuff where we do utilise the "generated-sources" system.

Apr 06 '25 02:04 jimtng

It is currently not done to save from running it unnecessarily when SemanticTags.csv isn't modified.

Are you sure of that statement? As mentioned before, there is a an entry in the pom for the semantics model which AFAICT does run the script every time the build is run. Or?

https://github.com/openhab/openhab-core/blob/693943e7c1f7d96ed882ddfaa08083f331452561/bundles/org.openhab.core.semantics/pom.xml#L64

Apr 06 '25 07:04 andrewfg

You can try it yourself:

Check out the latest commit
Make changes to SemanticTags.csv
Run a normal build with mvn clean install -pl :org.openhab.core.semantics

See whether the dependent files got generated (they don't).

Apr 06 '25 07:04 jimtng

openhab-core/bundles/org.openhab.core.semantics/pom.xml

Line 64 in 693943e

This one is executed when you specifically run mvn with -Pgenerate-tag-classes

Apr 06 '25 07:04 jimtng

So we just need to modify the pom so that that profile is part of the default build? (I wouldn't worry about the performance hit which would be trivial..)

Apr 06 '25 09:04 andrewfg

openhab-core openhab-core copied to clipboard

[Semantics] Refactor generated/sorted files to go under `target/generated-sources`

Forked Issue from Original Thread

Background

Question

openhab-core
openhab-core copied to clipboard