[Feature] OpenAPI/Swagger JSON type schema support
As described by https://github.com/Kotlin/dataframe/issues/142
This PR contains a custom OpenAPI -> DF Marker conversion and its implementation in the Gradle and KSP plugins. There are also a lot of tests, but I'm not yet 100% confident I've caught every edge case (there are many...). Docs are updated too. Let me know if there are any major things (or minor ones, all are welcome) that need changing or if you have some more testing ideas.
okay, I looked over additionalProperties... https://swagger.io/docs/specification/data-models/dictionaries/ Let's see how I can add that
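For reference, the dictionary form described on that page is an object whose keys are arbitrary and whose values all share one schema. A rough Kotlin sketch of flattening such an object into key/value rows, which is one way to represent it in a table, could look like this. `KeyValueRow` and `toKeyValueRows` are hypothetical names made up for illustration, not part of the library:

```kotlin
// Hypothetical stand-in for a key/value row; NOT the DataFrame library's own type.
data class KeyValueRow<T>(val key: String, val value: T)

// Flattens a dictionary-like object (arbitrary keys, uniform value type)
// into one row per entry, following the map's iteration order.
fun <T> toKeyValueRows(dictionary: Map<String, T>): List<KeyValueRow<T>> =
    dictionary.map { (key, value) -> KeyValueRow(key, value) }
```

With this shape the set of keys does not have to be known when types are generated; only the value type does.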
Since I changed so much I'm gonna provide a small overview of the changes I made. Might make it easier to review. I'll go through the changed files:
- Linting changes (sorry, but improves readability)
- Docs for OpenAPI and jsonoptions
- replaced all occurrences of "splitted" with "split" because English
- updated kotlinDatetime and made it api() for jupyter etc.
- ImportDataSchema gained jsonOptions with typeClashTactic and keyValuePaths
- starting with kdoc in some functions (more will come later)
- convertTo
  - `convertIf` in dsl: way more powerful than `convert(KType, KType)`, allows you to specify via a condition function whether you want to do a certain conversion and provides fromType and toSchema in ConverterScope. `convert` still has priority over `convertIf`
  - convertTo can now happen for any empty df (both 0 rows or 0 columns)
  - manual converters can now happen for any type of target column, not just value columns. user conversion happens before df conversion making some conversions that were previously impossible possible.
  - Value columns of datarows can become column groups, same as value columns with all nulls.
  - value columns of dataframes can become frame columns
  - absent columns can be created iff it's a nullable typed value column, DataRow<Something?> for a group column or frame column.
- containsNoData() helper function (created that once but don't use it anymore, might still be useful? opinion needed)
- enums can implement `DataSchemaEnum` to control how they are (en/de)coded from/to datasets instead of using just their name
- code generation can now generate enums and type aliases too
- if a generated interface contains a reference to another `@DataSchema`, the type will no longer be wrapped in a `DataRow` since that is unnecessary.
- `DefaultReadDfMethod` now provides actual `Marker` instead of just a name
- nullability helpers for FieldType
- bugfix for `MarkersExtractor` regarding `nullableProperties`
- `contains` in `BaseColumn` is now `operator fun`
- Bugfix for concat of DataFrames without columns (rows must still count up, else schema conversions will break)
- String+Number can become Comparable, this was always the case but order dependent. That now always gives the same result thanks to a bugfix in `commonParents()` and `commonType()`
- ISO_DATE_TIME support in parser from string
- `CodeGenerator.Companion.urlReader` is now split into `urlDfReader` and `urlCodeGenReader` where the former is used to generate a dataframe from the data and types from that, while the latter directly generates types (for openapi)
- Similar split for `SupportedFormat` into `SupportedDataFrameFormat` and `SupportedCodeGenerationFormat`
- `createColumn()`
  - bugfix for empty iterables always making column groups. Now uses guesstype or creates value column
  - bugfix for iterables with just nulls always making a frame column, now also uses guesstype or creates value column
  - can now create Column group with iterable of datarows too
- small fix in printing newlines of data schemas
- `createEmptyColumn` and `createEmptyDataFrame` variants with `numberOfRows`
- `ColumnSchema`, aside from `type` now also has `contentType` for extra type info for Group- and Frame columns (instead of just getting `DataRow<*>` or `DataFrame<*>`). Very useful for new `convertIf` method. `intersectSchema` also tries to merge these `contentType`s if possible. `extractSchema` will make them `Any?`.
- `toSnakeCase()` helper function
- json reader:
  - distinction in guess.kt between normal json and openapi json
  - typeClashTactics ANY_COLUMNS and ARRAY_AND_VALUE_COLUMNS (original and default)
  - keyValuePaths where using given `JsonPath`, the objects will be read as `DataFrame<KeyValueProperty>`
- openapi:
  - can decode json/yaml openapi 3.0 type specifications and produce `CodeWithConverter` with all `DataSchema` interfaces, type aliases and enums, as well as `readJson` functions which automatically fill in keyValuePaths and other conversions necessary.
  - objects with just `additionalProperties` will become key/value dataframes.
  - objects with `properties` & `additionalProperties` will ignore `additionalProperties`
- `importDataSchema()` function for Jupyter for `SupportedCodeGenerationFormat`s like openapi
- DSL for `JupyterConfiguration`
- bugfix for resolution overload ambiguity in `DataFrame.get(vararg IntRange)`
- examples for openapi and keyvalue, both in jupyter and normal kt files
- JsonOptions in KSP and Gradle plugin
- tests

Edit:
- Values in columns are now only "listified" if `suggestedType` is given as list, not automatically anymore. Also adds optional `listifyValues` argument to `guessValueType()` and `baseType()`
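
To make the `DataSchemaEnum` item in the overview more concrete, here is a small self-contained sketch of the idea: an enum carries an explicit value that is used when reading from or writing to a dataset, instead of the constant's name. The interface below is a local stand-in whose shape (a single `value` property) is an assumption, and `encode` / `decodeStatus` are hypothetical helpers written only for this illustration:

```kotlin
// Local stand-in for the interface named in this PR; its exact shape is an assumption.
interface DataSchemaEnum {
    val value: Any
}

// The stored value ("sold-out") differs from the Kotlin constant name (SOLD_OUT).
enum class Status(override val value: Any) : DataSchemaEnum {
    AVAILABLE("available"),
    SOLD_OUT("sold-out")
}

// Hypothetical helper: the value that would be written to the dataset.
fun encode(status: Status): Any = status.value

// Hypothetical helper: look up the constant by its stored value, or null if none matches.
fun decodeStatus(raw: Any): Status? = Status.values().firstOrNull { it.value == raw }
```

This matters for OpenAPI enums in particular, since their values are often not valid Kotlin identifiers.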