
Auto-magically infer types?

Open Quafadas opened this issue 11 months ago • 4 comments

A few prior steps:

  • summary reporting on type conversions
  • then code generation
  • generate case class code
  • finally a macro

Quafadas avatar Mar 04 '25 15:03 Quafadas

  1. Summary reporting on type conversions. First, add functionality to analyze CSV data and report which types each column could be converted to. This would help users understand their data.
  2. Code generation. Generate code that handles the appropriate type conversions.
  3. Generate case class code. Automatically create Scala case classes that match the inferred structure. This would allow users to work with properly typed data objects instead of tuples.
  4. Finally, a macro. Implement this functionality as a Scala macro for compile-time processing.
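As a rough illustration of step 1, the sketch below infers the narrowest type that fits every cell in a column. It uses only the Scala standard library; `ColType`, `cellType`, `widen`, and `inferColumnTypes` are hypothetical names for this example and are not part of scautable's API.

```scala
// Hypothetical names for illustration only; not scautable's API.
enum ColType:
  case IntT, LongT, DoubleT, BooleanT, StringT

// Narrowest type that can represent a single cell.
def cellType(cell: String): ColType =
  val s = cell.trim
  if s.toIntOption.isDefined then ColType.IntT
  else if s.toLongOption.isDefined then ColType.LongT
  else if s.toDoubleOption.isDefined then ColType.DoubleT
  else if s.toBooleanOption.isDefined then ColType.BooleanT
  else ColType.StringT

// Widen two candidate types to one that can hold both.
def widen(a: ColType, b: ColType): ColType =
  import ColType.*
  (a, b) match
    case (x, y) if x == y                                 => x
    case (IntT, LongT) | (LongT, IntT)                    => LongT
    case (IntT | LongT, DoubleT) | (DoubleT, IntT | LongT) => DoubleT
    case _                                                => StringT

// Report one inferred type per column; an empty data set falls back to StringT.
def inferColumnTypes(headers: Seq[String], rows: Seq[Seq[String]]): Seq[(String, ColType)] =
  headers.zipWithIndex.map: (name, i) =>
    val tpe = rows.map(r => cellType(r(i))).reduceOption(widen).getOrElse(ColType.StringT)
    name -> tpe
```

Calling `inferColumnTypes(Seq("col1", "col2"), Seq(Seq("1", "x"), Seq("2", "y")))` would report `col1` as `IntT` and `col2` as `StringT`. The sketch assumes every row has as many cells as there are headers, and it does not treat missing values specially.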

Current approach (every column is a String):

val row = csvData.next() // (col1: String, col2: String, col3: String)
val sum = row.col1.toInt + row.col2.toInt

Proposed approach (a pandas-like experience with inferred types):

val row = csvData.next() // (col1: Int, col2: Int, col3: String)
val sum = row.col1 + row.col2 // no conversion needed
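Before a macro exists, steps 2 and 3 could be approximated by emitting source text from an inferred schema. The sketch below is one possible shape, not an existing scautable feature: `renderCaseClass` and `renderDecoder` are hypothetical helpers, the schema is assumed to be a list of (column name, Scala type name) pairs, and column names are assumed to be valid Scala identifiers.

```scala
// Hypothetical helpers for illustration; the schema is (column name, Scala type name).
def renderCaseClass(className: String, schema: Seq[(String, String)]): String =
  val fields = schema.map((name, tpe) => s"$name: $tpe").mkString(", ")
  s"case class $className($fields)"

// Render a decoder that converts a row of strings into the generated case class.
// String columns are passed through; other columns use the matching `toXxx` method,
// so only type names such as Int, Long, Double, or Boolean make sense here.
def renderDecoder(className: String, schema: Seq[(String, String)]): String =
  val args = schema.zipWithIndex.map:
    case ((_, "String"), i) => s"cells($i)"
    case ((_, tpe), i)      => s"cells($i).to$tpe"
  s"def decode(cells: Seq[String]): $className = $className(${args.mkString(", ")})"
```

For the three-column example above, `renderCaseClass("Row", ...)` would yield `case class Row(col1: Int, col2: Int, col3: String)`. Generated text like this has to be written to a file and compiled separately, which is the gap the macro in step 4 is meant to close.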

LSUDOKO avatar Mar 16 '25 21:03 LSUDOKO

I would like to work on this. Could you assign it to me?

LSUDOKO avatar Mar 16 '25 21:03 LSUDOKO

Feel free to submit a PR - there's no-one really looking at it. I should warn you that I don't think this is particularly easy. I would doubt it's vibecode-able.

Quafadas avatar Mar 17 '25 06:03 Quafadas

You can ping me on Discord if you have questions. I'm usually found hanging around the Scala forums, under the same handle as on GitHub.

Quafadas avatar Mar 17 '25 06:03 Quafadas

I believe this post on Discord offers some inspiration for a potential first cut of how this could be implemented:

https://discord.com/channels/632150470000902164/875868146949554207/1379794983795818518

Quafadas avatar Jun 04 '25 16:06 Quafadas

More ideas: https://discord.com/channels/632150470000902164/875868146949554207/1385296719255830760

import scala.quoted.*

case class FieldDescription(name: String, typeName: String)

case class NamedTupleElement(labelName: String, valueType: Type[?])
object NamedTupleElement:
    def fromFieldDescription(fieldDescription: Expr[FieldDescription])(using Quotes) =
        import quotes.reflect.*
        fieldDescription match
            case '{ FieldDescription(${ Expr(name) }, ${ Expr(typeName) }) } =>
                NamedTupleElement(name, mapFieldType(typeName))
            case _ => report.errorAndAbort("Fields must be known at compile time.")

private def mapFieldType(typeName: String)(using Quotes) =
    import quotes.reflect.*
    typeName match
        case "String" => Type.of[String]
        case "Int" => Type.of[Int]
        case _ => report.errorAndAbort(s"Unsupported type for field type: $typeName")

private def typeReprFromType(tpe: Type[?])(using Quotes) =
    import quotes.reflect.*
    tpe match
        case '[t] => TypeRepr.of[t]

transparent inline def makeNamedTupleBuilderVarArg(inline fields: FieldDescription*) =
    ${ makeNamedTupleBuilderVarArgImpl('fields) }

private def makeNamedTupleBuilderVarArgImpl(fieldsExpr: Expr[Seq[FieldDescription]])(using Quotes) =
    import quotes.reflect.*

    // Extract compile-time known values from each FieldDescription
    // to get a sequence of NamedTupleElements each holding a label name (String) and value type (Type[?]).
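    val namedTupleElements = fieldsExpr match
        case Varargs(fieldDescriptionExprSeq) =>
            fieldDescriptionExprSeq.map:
                NamedTupleElement.fromFieldDescription
        // Without this case, a non-literal argument list fails with a MatchError inside the macro.
        case _ =>
            report.errorAndAbort("Fields must be passed as individual arguments known at compile time.")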

    makeNamedTupleBuilderImpl(namedTupleElements)

private def makeNamedTupleBuilderImpl(namedTupleElements: Seq[NamedTupleElement])(using Quotes) =
    import quotes.reflect.*

    // Get the TypeRepr of `scala.*:` which is the cons type for Tuples / HList in Scala3
    val tupleConsTypeRepr = TypeRepr.of[scala.*:]

    // Construct a TypeRepr representing the named-tuple labels.
    // Fold list of label names of type `String`, `List(l0, l1, ...)`
    // to a tuple of the corresponding `ConstantType` elements `Tuple[lt0, lt1, ...]`
    // using the `*:` type constructor:
    //   lt0 *: lt1 *: ... *: EmptyTuple
    val labelsTypeRepr = namedTupleElements.view
        .map(_.labelName)
        .foldRight(TypeRepr.of[EmptyTuple]): (labelName, acc) =>
            val labelType = ConstantType(StringConstant(labelName))
            tupleConsTypeRepr.appliedTo(List(labelType, acc))

    // Construct a List[TypeRepr] representing the types of the values in the named tuple.
    val fieldTypeReprs = namedTupleElements.view
        .map(namedTupleElement => typeReprFromType(namedTupleElement.valueType))
        .toList

    // Fold list of TypeRepr List(t0, t1, ..., tn) to Tuple[t0, t1, ..., tn] type using the `*:` type constructor:
    //   t0 *: t1 *: ... *: EmptyTuple
    val valuesTupleTypeRepr = fieldTypeReprs.foldRight(TypeRepr.of[EmptyTuple]): (tpe, acc) =>
        tupleConsTypeRepr.appliedTo(List(tpe, acc))

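    // Create parameter names v0, v1, ... for the lambda, exactly one per field.
    // (`0 to length` would yield one name too many and no longer match fieldTypeReprs.)
    val paramNames = (0 until fieldTypeReprs.length).map(i => s"v$i").toList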

    // Construct type of named tuple we want to return including the types of the values.
    // (label0: t0, label1: t1, ..., labeln: tn)
    val namedTupleTypeRepr = TypeRepr.of[NamedTuple.NamedTuple].appliedTo(List(labelsTypeRepr, valuesTupleTypeRepr))


    // Construct the type of the lambda:
    //   (v0: t0, v1: t1, ..., vn: tn) => (label0: t0, label1: t1, ..., labeln: tn)
    val funcType = MethodType(paramNames)(
        _ => fieldTypeReprs,
        _ => namedTupleTypeRepr
    )

    (labelsTypeRepr.asType, namedTupleTypeRepr.asType) match
        case ('[labelsType], '[namedTupleType]) =>
            val lambda = Lambda(
                Symbol.spliceOwner,
                funcType,
                (owner, params) => {
                    // Convert argument list into a tuple.
                    val valuesAsTuple = Expr.ofTupleFromSeq(params.map(_.asExpr))

                    // Build the named tuple.
                    // Cast the result to the correct type or we get (label0: Any, ...).
                    val namedTupleExpr = '{
                        NamedTuple.build[labelsType & Tuple]()(${ valuesAsTuple })
                            .asInstanceOf[namedTupleType]
                    }

                    // Return the body of the lambda
                    namedTupleExpr.asTerm
                }
            )

            // Emit the lambda as the result of our macro.
            lambda.asExpr

        case _ =>
            report.errorAndAbort("Unexpected error matching on types.")

Quafadas avatar Jun 20 '25 07:06 Quafadas

#69

Quafadas avatar Aug 15 '25 11:08 Quafadas