Generation of Data Schemas from JSON could be better

Open Jolanrensen opened this issue 2 years ago • 1 comments

Let's say I have the following sample data from the Magical Location Clock app:

{
  "0": {
    "name": "Kwijt 🤷",
    "people": [
      "Ronald",
      "Jolan"
    ],
    "peopleIds": [
      1544198659362550,
      1556728145307124
    ]
  },
  "164468523846839": {
    "name": "Joyce ❤️",
    "people": [],
    "peopleIds": [],
    "gps": {
      "latitude": 59.5391564418419,
      "longitude": 8.09191055862616
    }
  },
  "1644685267012957": {
    "name": "Joyce ❤️",
    "people": [],
    "peopleIds": [],
    "gps": {
      "latitude": 52.19508702169525,
      "longitude": 1.414078822875235
    }
  },
  "1659257255454626": {
    "name": "Camping 🏕️",
    "people": [
      "Pascale"
    ],
    "peopleIds": [
      1545419932562464
    ],
    "gps": {
      "latitude": 43.1857803898036,
      "longitude": 2.1369805421961723
    }
  }
}

Then the Gradle plugin generates:

@DataSchema(isOpen = false)
interface Mlc1 {
    val name: String
    val people: kotlin.collections.List<kotlin.String>
    val peopleIds: kotlin.collections.List<kotlin.Long>
}

@DataSchema(isOpen = false)
interface Mlc3 {
    val latitude: Double
    val longitude: Double
}

@DataSchema(isOpen = false)
interface Mlc4

@DataSchema(isOpen = false)
interface Mlc2 {
    val gps: DataRow<Mlc3>
    val name: String
    val people: DataFrame<Mlc4>
    val peopleIds: DataFrame<Mlc4>
}

@DataSchema(isOpen = false)
interface Mlc5 : Mlc1 {
    val gps: DataRow<Mlc3>
}

@DataSchema
interface Mlc {
    @ColumnName("0")
    val `0`: DataRow<Mlc1>
    @ColumnName("164468523846839")
    val `164468523846839`: DataRow<Mlc2>
    @ColumnName("1644685267012957")
    val `1644685267012957`: DataRow<Mlc2>
    @ColumnName("1659257255454626")
    val `1659257255454626`: DataRow<Mlc5>
    public companion object {
      public const val defaultPath: kotlin.String = "src/main/resources/mlcTest.json"

      public fun readJson(path: kotlin.String = defaultPath, verify: kotlin.Boolean? = null): org.jetbrains.kotlinx.dataframe.DataFrame<Mlc> {
        val df = DataFrame.readJson(path, )
        return if (verify != null) df.cast(verify = verify) else df.cast()
      }
    }

}

Some things are curious here:

  • Mlc5 could be the same as Mlc1.
  • people is generated either as an empty DataFrame or as a List, depending on the entry (not sure why it isn't just a List<Nothing> or Any?).
  • The lack of an attribute such as gps could be handled by making it nullable.

Alternatively, supporting something like OpenAPI (https://github.com/Kotlin/dataframe/issues/142) could help prevent these issues.
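
To make the three points above concrete, here is a minimal sketch of how per-entry schemas could be merged into one. It uses a simplified type model in plain Kotlin. All names in it (`FieldType`, `Scalar`, `ListOf`, `Group`, `Field`, `merge`, `mergeGroups`, `sampleSchema`) are hypothetical and are not the DataFrame library's internals or the approach taken in any PR. The assumptions are that an empty array is a list with an unknown element type, and that a key missing from one entry makes the field nullable.

```kotlin
// Simplified, hypothetical type model; NOT the DataFrame library's representation.
sealed interface FieldType
data class Scalar(val name: String) : FieldType           // e.g. "String", "Long", "Double"
data class ListOf(val element: FieldType?) : FieldType    // element == null: empty array, type unknown
data class Group(val fields: Map<String, Field>) : FieldType

data class Field(val type: FieldType, val nullable: Boolean = false)

// Merge two inferred types; returns null when they cannot be reconciled.
fun merge(a: FieldType, b: FieldType): FieldType? = when {
    a == b -> a
    a is ListOf && b is ListOf -> {
        val ea = a.element
        val eb = b.element
        when {
            ea == null -> b            // empty list adopts the other side's element type
            eb == null -> a
            else -> merge(ea, eb)?.let { ListOf(it) }
        }
    }
    a is Group && b is Group -> mergeGroups(a, b)
    else -> null
}

// Merge two groups key by key; a key present on only one side becomes nullable.
fun mergeGroups(a: Group, b: Group): Group? {
    val merged = mutableMapOf<String, Field>()
    for (key in a.fields.keys + b.fields.keys) {
        val fa = a.fields[key]
        val fb = b.fields[key]
        merged[key] = when {
            fa == null -> fb!!.copy(nullable = true)   // key came from b only
            fb == null -> fa.copy(nullable = true)     // key came from a only
            else -> Field(merge(fa.type, fb.type) ?: return null, fa.nullable || fb.nullable)
        }
    }
    return Group(merged)
}

// Shapes of entry "0" and of a "Joyce" entry from the sample JSON.
fun sampleSchema(): FieldType? {
    val gps = Group(
        mapOf(
            "latitude" to Field(Scalar("Double")),
            "longitude" to Field(Scalar("Double")),
        )
    )
    val entry0 = Group(
        mapOf(
            "name" to Field(Scalar("String")),
            "people" to Field(ListOf(Scalar("String"))),
            "peopleIds" to Field(ListOf(Scalar("Long"))),
        )
    )
    val entryJoyce = Group(
        mapOf(
            "name" to Field(Scalar("String")),
            "people" to Field(ListOf(null)),       // "people": []
            "peopleIds" to Field(ListOf(null)),    // "peopleIds": []
            "gps" to Field(gps),
        )
    )
    return merge(entry0, entryJoyce)
}
```

Under these assumptions, `sampleSchema()` yields a single group in which `people` is a list of `String`, `peopleIds` is a list of `Long`, and `gps` is marked nullable, so one schema could cover all four entries instead of Mlc1, Mlc2 and Mlc5.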

Jolanrensen, Aug 08 '22 14:08

So as it turns out, an empty JSON array is turned into an empty dataframe, while a non-empty JSON array is turned into a normal list. Kinda odd, isn't it?

Jolanrensen, Aug 12 '22 12:08

Tackled in https://github.com/Kotlin/dataframe/pull/173

Jolanrensen, Oct 28 '22 11:10