github-workflows-kt
github-workflows-kt copied to clipboard
Improve and centralize YAML serialization
Currently the library contains logic for serializing to YAML in multiple places. Besides mixing layers (business logic vs serialization), this is error-prone because e.g. fixing an issue in one place may lead to not fixing it in other relevant contexts.
Here's a list of problems that appeared because of incorrect YAML serialization:
- #36
- #70
- #89 (confusion during development how to make the indent correct)
- #369
- #370
We should consider going back to kotlinx.serialization + kaml or equivalent, for the sake of delegating the problem of YAML synthesis. It's getting more and more tedious to generate YAML by hand. FYI @Vampire
After thinking a little, kaml doesn't seem to be the optimal choice for now. Here's why:
- it includes the
type
field to handle polymorphism which we don't actually need, but GitHub doesn't like it in the YAML - for custom arguments, I don't see a way to have e.g. some
data class
serialized, with a map ofString
toAny
added as extra properties
That's why I'm considering reimplementing the domain-to-YAML layer to use snakeyaml. It's used under the hood in kaml, and it's already used in serializing custom arguments. The idea is to have the whole domain object converted to maps, lists and primitive types, and have it serialized by snakeyaml. Thanks to this, the problem of creating a valid YAML will be delegated to the library, and this DSL library will get much simpler. The side-effect will be that YAML formatting will change a little.
FYI @jmfayard @Vampire - I'm open to having a discussion about the new approach to YAML emitting. It's quite a big effort, so while working on it, I'd like to have one or two acks that the idea makes sense :smile:
it includes the type field to handle polymorphism which we don't actually need, but GitHub doesn't like it in the YAML
Well, that's because polymorphism is not what you need or want here. Instead you want contextual serialization / deserialization. The kotlinx.serialization DAOs for GitHub actions an GitHub workflows I showed you are fully working to parse any possible GitHub action file and any possible GitHub workflow file including any possible documented variation.
And it also fully supports producing the according YAML that GitHub likes without any superfluous keys or values. Including using the minimal form possible, like having a scalar value like
on: push
a list like
on:
- push
- fork
or a map like a list like
on:
push:
branches: main
fork:
Only drawback of my current version is, that they are still for kotlinx.serialization 0.x, but it would probably not be too hard to port them over to the latest versions.
for custom arguments, I don't see a way to have e.g. some data class serialized, with a map of String to Any added as extra properties
Not fully sure what you mean. If you can build the DAO classes for the serialization, the serialization should work. And if the DAO classes support all valid cases, it should probably work just fine.
That's why I'm considering reimplementing the domain-to-YAML layer to use snakeyaml. It's used under the hood in kaml, and it's already used in serializing custom arguments. The idea is to have the whole domain object converted to maps, lists and primitive types, and have it serialized by snakeyaml. Thanks to this, the problem of creating a valid YAML will be delegated to the library, and this DSL library will get much simpler.
Well, if you prefer to reinvent the wheel, feel free to do so. Any solution that does not mean you do manual text concatenation should be better than the current situation. :-) My offer to refactor out my serializers into a stand-alone lib you can depend on and use is still standing.
Thanks! FTR, here are the DAO classes with custom serializers: https://github.com/Vampire/setup-wsl/tree/master/buildSrc/src/main/kotlin/net/kautler/dao I have to finally find time to wrap my head around it :smile:
Regarding supporting custom arguments, I'd like to make sure that once the client calls e. g. workflow
DSL builder with _customArguments = mapOf("new-workflow-argument" to 123)
, the library can emit this argument as any regular workflow's YAML attribute. With simple data classes without custom serializers, I don't see a way to achieve it.
The next step would be to prepare a PoC using kotlinx.serialization approach that covers this case. @Vampire if you would find a moment to prepare such example, it would be a great piece of help to me as it would bring us closer to the decision which way to go.
Of course we can always say that if the DAOs are mature enough, such feature is not needed. However I'd like to check out this case anyway, since I want to avoid deciding on removing library's feature prematurely. Even with most mature DAOs, GitHub can still add new attributes.
I did a quick and dirty version of Vampire's custom serializer in the script-generator. Might make sense to migrate them too, but first the library.
@jmfayard feel free to share it on a branch.
Regarding the idea of using snakeyaml, I prepared such PoC depicting what I mean by being able to define _customArguments
:
package it.krzeminski.githubactions.yaml
import org.yaml.snakeyaml.Yaml
// Domain class
data class Workflow(
val knownAttribute1: String,
val knownAttribute2: Map<String, List<String>>,
val attributeToOverwrite: String,
val _customArguments: Map<String, Any>,
)
fun main() {
val myWorkflow = Workflow(
knownAttribute1 = "hello",
knownAttribute2 = mapOf(
"foo" to listOf("bar", "baz"),
"yet" to listOf("another", "key"),
),
attributeToOverwrite = "overwrite-me",
_customArguments = mapOf(
"unknown-attribute1" to 123,
"unknown-attribute2" to listOf("foo", "bar", "baz"),
"attribute-to-overwrite" to "overwritten!",
),
)
val workflowAsYaml = myWorkflow.toYaml()
println(workflowAsYaml)
// Expected:
//
// known-attribute-1: hello
// known-attribute-2:
// foo: [bar, baz]
// yet: [another, key]
// attribute-to-overwrite: overwritten!
// unknown-attribute1: 123
// unknown-attribute2: [foo, bar, baz]
}
private fun Workflow.toYaml(): String {
val yamlable = mapOf(
"known-attribute-1" to knownAttribute1,
"known-attribute-2" to knownAttribute2,
"attribute-to-overwrite" to attributeToOverwrite,
) + _customArguments
return Yaml().dump(yamlable)
}
It would be cool to see a piece of code using kotlinx.serialization+kaml approach that would let us achieve the same.
https://github.com/sksamuel/hoplite
Hoplite also supports YAML, perhaps it's a viable approach
@LeoColman
Do you have capacity to prepare a short PoC like mine?
I think you can do a Workflow -> JsonElement
conversion, then a mapping from JsonElement -> JsonElement
using the values from _customArgument
, finally a JsonElement -> Kaml
@jmfayard JsonElement - typo? I don't quite get it, PoC please 🙏 😄 I'm also curious how it would differ from the snakeyaml-like approach.
sorry but I'm doing too much things at the moment
I'm gonna go the snakeyaml approach for now since
- it fulfills the current API's needs
- is simple to understand
- requires no extra dependencies
- will address several issues with the lib
- formatting of the output is in line with kotlinx.serialization (kx.s uses snakeyaml under the hood), so if we ever decide to go this way, formatting changes in the output YAML should be none or minimal
- gives fine-grained control on e.g. where to displays nulls explicitly, and where to display no property instead
Later on, we can always iteratively improve on it further.
Working on https://github.com/krzema12/github-actions-kotlin-dsl/issues/87 first, to better catch regressions that may result from changing the YAML emitting method.