kantan.csv
kantan.csv copied to clipboard
Case insensitive headers
We have a requirement that sees us needing to support customers "creative" decisions regarding casing for headers. Seems like I can't do this out of the box, but I should be able to hack together a new version of a HeaderDecoder
that does this correctly. I think this could actually be generalized within the determineRowMappings
function by simply adding an equality function String => String => Boolean
. Then when doing the index check simply replace val index = csvHeader.indexOf(header)
with:
val index = csvHeader.indexWhere(fn(header))
Could go even further potentially and define an equality typeclass that can be derived from an Ordering
(i.e. a <> b).
I'd like to submit a PR to this effect, but wanted to check on the preferred approach (assuming you'd accept the change). I can either update the existing generated decoders to support another parameter def decoder[A1: CellDecoder, R](f1: String)(f: (A1) => R): HeaderDecoder[R]
becomes def decoder[A1: CellDecoder, R](f1: String, eq: (String, String) => Boolean = defaultEquality)(f: (A1) => R): HeaderDecoder[R]
, or I can create another set of generated methods that are called def decoderWith
, and using the mixin method GeneratedHeaderDecoders1
to avoid bin-compat issues.
That sounds great!
I think the Ordering
type class would be overkill:
- painful if it's "just" to specify a different way of comparing strings
- requires a fair amount of rewiring to support any type as header label
Not that it wouldn't be an interesting feature to have, but it's only vaguely correlated with what you're trying to achieve - it would work roughly the same way and re-use most of the same code, but what you're trying to achieve is case-insensitivity, not arbitrary types. I wouldn't mind such a PR, but I'd argue that we should focus on solving your actual problem first.
A potential second PR would almost be trivial, once the groundwork is laid: replace String
with A
, replace the eq
function with A => A => Boolean
, write a new Eq
type class (for which default implementations could be provided from types that have an instance of Ordering
), and have new generated methods that rely on it to provide an eq
function to the "core" implementations.
I don't particularly try to maintain bin compat, and will always gleefully break it if supporting it would mean writing "worse" code than I otherwise would. What I'll insist on is source compat though - what compiled before must compile now, unless a very strong argument is made.
Ok so for reference this is what I ended up with for our case:
private def determineRowMappings(requiredHeader: Seq[String], csvHeader: Seq[String])(
eq: (String, String) => Boolean
): DecodeResult[Seq[Int]] = {
@tailrec
def loop(missing: List[String], found: List[Int], required: List[String]): DecodeResult[Seq[Int]] = required match {
case head :: rest =>
val index = csvHeader.indexWhere(eq(head, _))
if (index < 0) loop(head :: missing, found, rest)
else loop(missing, index :: found, rest)
case Nil if missing.nonEmpty =>
DecodeResult.typeError(s"Missing header(s): ${missing.reverse.mkString(", ")}")
case Nil =>
DecodeResult.success(found.reverse)
}
loop(List.empty, List.empty, requiredHeader.toList)
}
implicit class EnrichedHeaderDecoder(val value: HeaderDecoder.type) extends AnyVal {
def decoderWith[A1: CellDecoder, A2: CellDecoder, A3: CellDecoder, R](f1: String, f2: String, f3: String)(
equal: (String, String) => Boolean
)(
f: (A1, A2, A3) => R
): HeaderDecoder[R] = new HeaderDecoder[R] {
def fromHeader(header: Seq[String]): DecodeResult[RowDecoder[R]] =
determineRowMappings(List(f1, f2, f3), header)(equal).map(mapping =>
RowDecoder.decoder(mapping(0), mapping(1), mapping(2))(f)
)
def noHeader: RowDecoder[R] = RowDecoder.ordered(f)
}
}