ktoml icon indicating copy to clipboard operation
ktoml copied to clipboard

General feedback on the API design

Open Olivki opened this issue 3 years ago • 13 comments

Hello, this isn't regarding one specific issue per se, but rather some general feedback regarding the design of the current API. If any of this comes off as aggressive/mean sounding, I apologize, my intention is solely for constructive criticism.

Most of my opinions will be based on the API design of officially supported format libraries developed by JetBrains themselves, which can be found here, and the Kotlin coding conventions provided by JetBrains, which can be found here. I'm not sure how much stuff you wanna change, but I figured it would be best to provide feedback while the library is still in early development, as a lot of these changes would break backwards compatibility.

There's a decent chunk of stuff that I want to provide feedback on, so I'm sorry if things read like a jumbled mess. I will try to section off the feedback to their own "sections" as best as I can.


If any of the suggestions here are something you like, I can make a pull requests with the fixes if desired. I would rather just explain my reasoning and thoughts before just making a pull requests with all fixes.


The Ktoml class

The class name

First point to address here is the name, if we look at naming rules, it states that Names of classes and objects start with an uppercase letter and use camel case, and with camel case, each new word should be capitalized, and for acronyms, each letter representing the word should normally be treated as a new word. meaning that if we follow these rules, the appropriate name for the class should be KToml rather than Ktoml as the K stands for Kotlin, and toml should be treated as one word. (By following the above rules it should technically be KTOML, but if we look at the officially supported formats like json, and other classes developed by JetBrains, they seem to follow the rules of Dart wherein an acronym that's 3 or more characters long should be treated as a word, so instead of URL it would be Url.)

However, if we look at essentially all other libraries, even those outside of the officially supported formats, like yamlkt and avro4k, they just use the format name as the class name, meaning that rather than Ktoml it would be Toml.

Personally I think the nicest looking option is to just follow the official libraries and name the class Toml, as there is no real point in denoting that it's specifically for Kotlin as far as I can see.

The general design of the config

I will be basing the following suggestion on the json library.

If we look at how the json library handles configuration, we can see that it's using a sealed class hierarchy to achieve this, which can be roughly laid out like this:

  • The Json class is the parent, you can not create new instances of directly, it has the implementations for the format its extending already defined.
  • The companion object Default of the Json class is the default implementation, which uses the default settings for serialization/deserialization.
  • There exists a JsonImpl class which allows custom settings to be set, this is internal and never exposed to the end user.
  • There exists a JsonBuilder class which allows the user to customize the settings, in conjunction with the top-level Json function this provides a nice Kotlin DSL for creating a Json format with custom settings.
  • The top-level Json function acts as the constructor of the Json class, allowing the user to create instances with a custom configuration easily.

The benefit of this structure is that if I just want to use the default settings for the format, I can just write Json.encodeToString, or Json.Default.encodeToString if I want to be more explicit. And when I want to change the settings I can just go Json { // stuff }. It also allows the user to easily copy the settings from any already created Json instance, while keeping the settings of the instance immutable.

If we apply the same design layout to ktoml it would roughly look like this:

public sealed class Toml(
    override val serializersModule: SerializersModule,
    public val ignoreUnknownNames: Boolean,
) : StringFormat {
    public companion object Default : Toml(EmptySerializersModule, false)

    public final override fun <T> encodeToString(serializer: SerializationStrategy<T>, value: T): String = // default implementation

    public final override fun <T> decodeFromString(deserializer: DeserializationStrategy<T>, string: String): T = // default implementation
}

public fun Toml(from: Toml = Toml.Default, builderAction: TomlBuilder.() -> Unit): Toml {
    val builder = TomlBuilder(from).apply(builderAction)
    return TomlImpl(builder.serializersModule, builder.ignoreUnknownNames)
}

public class TomlBuilder internal constructor(toml: Toml) {
    public var ignoreUnknownNames: Boolean = toml.ignoreUnknownNames

    public var serializersModule: SerializersModule = toml.serializersModule
}

private class TomlImpl(module: SerializersModule, ignoreUnknownNames: Boolean) : Toml(module, ignoreUnknownNames)

The deserialize and serialize top-level functions

This is partly down to personal preference, but the absence of anything similar from most libraries should also be a tell-tale sign.

I do not think having these top-level functions actually add anything of value, I can see that the thought behind being that it might be easier to just call the top-level function rather than having to create a new Ktoml instance and call the relevant function. However, I can only see that this would bring readability issues and ambiguity going down the line.

Here are some of the issues I can see would pop up from these functions:

  • The names of them are very ambiguous. Sure, it's obvious that they're deserializing/serializing something, but it's not obvious what format they're being converted to, deserializeToml would be better, but it still feels like a code smell due to the other reasons defined below.
  • From my own experience, it's much better to store/cache a kotlinx.serialization format as a constant value somewhere, as essentially all implementations are immutable and do not modify anything within itself, they can be used from multiple threads, so thread safety is not a concern. Therefore, creating a new instance every time you just wanna write/read something is a code smell, and should generally be avoided. Due to how these functions work, they all create a new instance just for this purpose.
  • Building on the first point, if I have multiple formats in one project, it's very ambiguous what format the function deserialize would actually deserialize into.
  • Unless something has changed, using the implicit serializer(). function like what is done in these functions is way slower than explicitly passing in a serializer, as it requires reflection rather than just a direct function call. So encouraging the use of that function by making these functions so easily accessible is not good design imo.

The dependency on okio

I personally think dragging in a whole dependency just for inbuilt support for reading from a file is rather excessive, and I know a lot of other people also would like the dependency graph of the libraries they use to be as minimal as possible.

The inbuilt functions for reading from a file aren't that much of a time-saver either: Ktoml.decodeFromFile(Thing.serializer(), "/foo/bar.toml") vs Ktoml.decodeFromString(Thing.serializer(), Path("/foo/bar.toml").readText()) (The above example is of course if you're on the JVM, but it's still relevant due to the argument below.)

There's also the fact that okio is not the only mulitplatform kotlin library that supports files, and while kotlinx.io is currently postponed, it will still be developed at one point, and there will certainly be more multiplatform file libraries developed. If this library then forces a dependency on okio this could be annoying for users who would rather use another library.

Therefore I think it would be better to not have explicit support for a specific file library, and rather just leave that up to the user. (Just quickly reading text from a file is more verbose in okio than in the Java path api with Kotlin extensions, but regardless, I don't think the minimal amount of boilerplate saved is worth explicitly forcing this library onto the user.)


These suggestions are mainly only for the public facing API, as I haven't looked too deeply into the more internal API.

I hope no offense was taken from this, this is only meant as constructive criticism for a library I'm looking forward to use once it gets more stable.

Olivki avatar Jun 22 '21 12:06 Olivki

Hi, this is an awesome feedback, wow! As I am trying to make everything as fast as possible - there are some issues that you have mentioned. I guess I will fix them all after I will finish the parser.

I would like to mention that some of our remarks are related to coding conventions and it will follow https://github.com/cqfn/diKTat/blob/master/info/guide/diktat-coding-convention.md convention after diktat will be integrated here

orchestr7 avatar Jun 25 '21 22:06 orchestr7

The only problem is with okio - I would really like people to have a simple method for reading Toml from file (as Toml is mostly used as a config), so I decided to use this one.

As I tested - it is the only stable library for reading files in Kotlin right now :)

orchestr7 avatar Jun 25 '21 22:06 orchestr7

Hi, this is an awesome feedback, wow! As I am trying to make everything as fast as possible - there are some issues that you have mentioned. I guess I will fix them all after I will finish the parser.

That's understandable, I figured that it was probably best to leave some feedback this early on in the project before it's gone stable, because changing large things could be very cumbersome or even impossible when a library becomes more stable.

I would like to mention that some of our remarks are related to coding conventions and it will follow https://github.com/cqfn/diKTat/blob/master/info/guide/diktat-coding-convention.md convention after diktat will be integrated here

The conventions defined by diktat are very sensible, and especially the naming conventions they define seem to be largely following what I expressed/tried to explain regarding the Ktoml name, so that seems good!


The only problem is with okio - I would really like people to have a simple method for reading Toml from file (as Toml is mostly used as a config), so I decided to use this one.

As I tested - it is the only stable library for reading files in Kotlin right now :)

I personally really like okio, and with kotlinx-io essentially being postponed indefinitely for now, it's probably the best choice for an io library.

Personally, I have no issue with an io library being a dependency, and having an easier way to read a toml file is a nice convenience, especially seeing as unlike formats like json, toml is made first and foremost to be a configuration language, so most of the time one would most likely be reading toml from a file.

The reason I brought it up as a concern is because I know there are a lot of people who dislike extra dependency without an extremely solid reasoning (aka; this is required or this thing won't work), hence why a lot of libraries proudly boost about having "zero external dependencies". So I thought it'd be apt to at least comment about it, as I know there may be users why will raise issues regarding it later down the line.

Olivki avatar Jul 04 '21 12:07 Olivki

✅ I have added diktat and detekt to the project and fixed all style-related issues. ✅ Updated names of API methods to deserializeToml, etc. Made these methods as String extension functions. ❌ Speaking about Ktoml name: as you can obviously understand, it is a game of words used all over the Kotlin world - same as in detekt, diktat and ktlint. I renamed Ktoml serialization class to Toml, but would not like to change anything else. ❌ I have not used the hierarchy of Json, as it looks very heavy for me right now :grin: We will have Toml().deserialize(string) and extension method string.deserialize() at least by for now. People who read Toml config file once - will be able to use string.deserialize() as it won't affect the performance. In case of some big number of Toml serialization - the will need to use the first way (to create a Toml() instance and call deserialization on it in Java-like way) :mushroom: Speaking about okio - I am still thinking about it and discussing with friends, everyone has it's own opinion, so let's keep it by for now. I just have mentioned that in the Readme

And I am still in progress with other notes you have mentioned :sweat_smile:

orchestr7 avatar Jul 10 '21 23:07 orchestr7

Sounds good! Just to clarify, I only meant one should use Toml for the class name, the actual project name I have nothing against, as I have myself used similar names for projects.

Olivki avatar Jul 12 '21 19:07 Olivki

@Olivki yes, I renamed that class :)

orchestr7 avatar Jul 12 '21 19:07 orchestr7

Hey @akuleshov7, nice work!

I checked some parts of the code and have a suggestion to improve the public API. I agree with @Olivki that top-level serialize and deserialize functions could bring some issues, but I'd suggest not declaring extension functions on String as it can pollute its namespace (i.e. autocomplete would always show these functions and String is an extremely common type). A nice alternative would be just Toml.deserialize("..."). It can be achived with an interface + companion object:

interface Toml {
    fun deserialize(value: String): TomlObject
    
    companion object DefaultInstance : Toml by TomlImplementation() {
        operator fun invoke(
            ignoreUnknownNames: Boolean
        ): Toml = TomlImplementation(
            ignoreUnknownNames = ignoreUnknownNames
        )
    }
}

internal class TomlImplementation(
    val ignoreUnknownNames: Boolean = false
) : Toml {
    override fun deserialize(value: String) = TODO()
}

// It's then possible to call both Toml.* functions directly for 
// using default options or constructing custom Toml instances 
// through "smart constructors" (= invoke operator fun)
fun main() {
    Toml.deserialize("Hello")
    Toml(ignoreUnknownNames = true).deserialize("Hello")
}

edrd-f avatar Jul 15 '21 22:07 edrd-f

@edrd-f @Olivki I can make the code look more close to kotlinx, what do you think about such API:

  • I will move file reading utils to a separate module
  • I will remove completely top-level functions and extentions
@ExperimentalSerializationApi
open class Toml(
    private val config: KtomlConf = KtomlConf(),
    override val serializersModule: SerializersModule = EmptySerializersModule
) : StringFormat {
    // parser is created once, to reduce the number of created classes for each toml
    val tomlParser = TomlParser(config)

    /**
     * The default instance of [Toml] with the default configuration.
     * see [KtomlConf] for the list of the default options
     *
     * ThreadLocal annotation is used for caching
     */
    @ThreadLocal
    public companion object Default : Toml(KtomlConf())

    // ================== basic overrides ===============
    /**
     * simple deserializer of a string in a toml format (separated by newlines)
     *
     * @param string - request-string in toml format with '\n' or '\r\n' separation
     * @return deserialized object of type T
     */
    override fun <T> decodeFromString(deserializer: DeserializationStrategy<T>, string: String): T {
        val parsedToml = tomlParser.parseString(string)
        return TomlDecoder.decode(deserializer, parsedToml, config)
    }

    override fun <T> encodeToString(serializer: SerializationStrategy<T>, value: T): String {
        TODO("Not yet implemented")
    }

    // ================== custom decoding methods ===============
    /**
     * partial deserializer of a string in a toml format (separated by newlines).
     * Will deserialize only the part presented under the tomlTableName table.
     * If such table is missing in he input - will throw an exception
     *
     * (!) Useful when you would like to deserialize only ONE table
     * and you do not want to reproduce whole object structure in the code
     *
     * @param deserializer deserialization strategy
     * @param toml request-string in toml format with '\n' or '\r\n' separation
     * @param tomlTableName fully qualified name of the toml table (it should be the full name -  a.b.c.d)
     * @return deserialized object of type T
     */
    fun <T> partiallyDecodeFromString(
        deserializer: DeserializationStrategy<T>,
        toml: String,
        tomlTableName: String
    ): T {
        val fakeFileNode = generateFakeTomlStructureForPartialParsing(toml, tomlTableName, TomlParser::parseString)
        return TomlDecoder.decode(deserializer, fakeFileNode, config)
    }
}

You would be able to make the following calls:

Toml.decodeFromString<MyClass>(myString)

(this is an extention method from kotlinx) THIS ONE OR without kotlinx:

Toml.decodeFromString<MyClass>(serializaer(), myString)

orchestr7 avatar Jul 16 '21 10:07 orchestr7

@akuleshov7 Looks great!

edrd-f avatar Jul 16 '21 12:07 edrd-f

I am also adding explicitApi() as a flag to the compiler. So we will need, for example, explicitly put public modifier to methods.

orchestr7 avatar Jul 16 '21 13:07 orchestr7

In https://github.com/akuleshov7/ktoml/pull/64 I had moved okio to a separate module. All the changes in API that had been discussed above were released in v0.2.7 (still not 0.3.0 as it is unstable)

orchestr7 avatar Aug 18 '21 11:08 orchestr7

Regarding the okio dependency, would it not have been better to keep it in the same module but have okio as a compileOnly dependency?

NightEule5 avatar Jan 02 '22 02:01 NightEule5

I'd prefer to use Toml instead of Ktoml in the actual place because other formats(Json, Yaml, NBT) behave like this.

Resolved.

Peanuuutz avatar Jan 11 '22 01:01 Peanuuutz