RFC: a type to attempt parsing
In my (internal) codebase, I've just written this type:
-- | Try parsing a value. If it fails, record the error message.
newtype Try a = Try { tryValue :: Either String a }
  deriving (Eq, Ord, Show, Read, Functor)

-- | Always succeeds.
instance (FromField a) => FromField (Try a) where
  parseField = return . Try . runParser . parseField
The intent is for fields that may not match a specified format; e.g. the documentation for FromField has an example of a Color type. In practice, if such a field sits in the middle of a row and fails to parse, then parsing that row fails, and thus parsing an entire CSV file can fail.
Whilst the type can be wrapped with a Maybe, this has two unfortunate side-effects:
- Cannot distinguish between "Empty cell" and "Failed to parse cell"
- No error message available to report back as to why a row may be bad
As such, this wrapper type allows you to successfully parse a file, and then discard rows which did not actually succeed (logging errors, etc.).
If we also store the attempted field in the Left case, we can then use that for ToField and thus have the round-trip property. We can similarly apply this to the FromRow/ToRow case to be able to discard rows that cannot succeed (especially useful for streaming scenarios).
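A minimal sketch of that round-trip variant, assuming cassava's `Data.Csv` exports `Field`, `FromField`, `ToField` and `runParser`. The name `TryField` and its shape are hypothetical and not part of any released API; this only illustrates keeping the raw field next to the error message.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Csv (Field, FromField (..), ToField (..), runParser)

-- | Hypothetical variant of 'Try': on failure, keep both the raw field
-- and the parser's error message.
newtype TryField a = TryField { tryFieldValue :: Either (Field, String) a }
  deriving (Eq, Show)

-- | Always succeeds; a failed inner parse is recorded in the 'Left' case.
instance FromField a => FromField (TryField a) where
  parseField fld = pure . TryField $
    case runParser (parseField fld) of
      Left err -> Left (fld, err)
      Right v  -> Right v

-- | Failed fields are written back unchanged, so bad cells survive a
-- decode/encode round trip.
instance ToField a => ToField (TryField a) where
  toField (TryField (Left (raw, _))) = raw
  toField (TryField (Right v))       = toField v

main :: IO ()
main = do
  let good = runParser (parseField "42")  :: Either String (TryField Int)
      bad  = runParser (parseField "n/a") :: Either String (TryField Int)
  print good
  print bad
  -- Re-encode both values; the unparseable cell comes back as it went in.
  print (fmap toField good)
  print (fmap toField bad)
```

Whether the `Left` case should carry a pair or a small record is a design choice left open here.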
Do you think this is a useful enough type for me to send a PR? What should the name be?
How is this different from just using Either Text (or Either String)? The default instances are already written, and I think they are identical to yours.
There is an Either Field a instance which keeps the field value, but not the error message. When I wrote that type I needed the error message to be able to report why a row was skipped.
I'm a bit confused. Your Try is at the column level, not the row level, so I'm not sure which error message you get at the row level.
You can make a row level instance of it as well.
What I mean is that I have similar problems, and my way of handling them is to have the row type parametrized by a functor:
data Product f = Product { name :: f String, price :: f Double }
What I parse (and declare as a FromRecord instance) is Product (Either Text). Then I have a validate function Product (Either Text) -> Either (Product (Either Text)) (Product Identity) which returns either a valid product or an invalid one. I can then display all the rows if needed (valid or not), with the original value and error message. I'm not sure how your approach is different?
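The validation step described above could look roughly like the following sketch. It uses only `base`, the error type is left polymorphic (so `Either Text` fits as well as the `Either String` used in `main`), and the bodies of `validate` and `describe` are assumptions, since the thread gives only the type of `validate`.

```haskell
import Data.Functor.Identity (Identity (..))

-- | Row type parametrized by a functor, as in the comment above.
data Product f = Product { name :: f String, price :: f Double }

-- | Hypothetical validation: succeed only if every field parsed,
-- otherwise hand back the original row with its per-field errors.
validate :: Product (Either e) -> Either (Product (Either e)) (Product Identity)
validate p = case (name p, price p) of
  (Right n, Right c) -> Right (Product (Identity n) (Identity c))
  _                  -> Left p

-- | Hypothetical helper to render either outcome for display.
describe :: Either (Product (Either String)) (Product Identity) -> String
describe (Right p) =
  "valid: " ++ runIdentity (name p) ++ " @ " ++ show (runIdentity (price p))
describe (Left p) =
  "invalid: " ++ show (name p) ++ ", " ++ show (price p)

main :: IO ()
main = do
  let ok  = Product (Right "widget") (Right 9.99) :: Product (Either String)
      bad = Product (Right "gadget") (Left "not a number: abc") :: Product (Either String)
  putStrLn (describe (validate ok))
  putStrLn (describe (validate bad))
```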
My use-case was for having fields that were meant to have specific textual values (hence why I referenced the Color example) but were at times incorrect; alternatively, they would need to be in a specific format (such as a date encoded as an 8-digit integer) that I need to parse. However, I wanted to distinguish between a Nothing because the column was empty and a Nothing because the value was invalid.
I wasn't aware of the Either Field a instance or I might have used that; but what I actually wanted was to get the actual error message and display that as I processed the CSV file (whilst continuing on with the rest of the file rather than erroring), so if I used Either I'd have to re-parse the (invalid) Field again just to get that message.
So in practice I have a bunch of Try (Maybe Foo) in my datatype that parses the CSV file, then a function to convert from that to a datatype that contains what I actually want which just has Maybe Foo; that function is of type CSVType -> Either ErrorType UsedType where ErrorType is a richer consistent error type that I use throughout my program that contains the error message, unique key from the record, etc.
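As a rough sketch of that conversion step, using only `base`: the field names, the record shapes of `CSVType`, `UsedType` and `ErrorType`, and the `convert` function are all hypothetical stand-ins, since the thread only names the types. `Try` is redefined locally so the example does not depend on cassava.

```haskell
import Data.Bifunctor (first)
import Data.Either (partitionEithers)

-- | Local copy of the 'Try' wrapper from the top of the thread.
newtype Try a = Try { tryValue :: Either String a }

-- Hypothetical record shapes; only the type names come from the thread.
data CSVType   = CSVType   { csvKey  :: String, csvDate    :: Try (Maybe Int) }
data UsedType  = UsedType  { usedKey :: String, usedDate   :: Maybe Int }
  deriving (Eq, Show)
data ErrorType = ErrorType { errKey  :: String, errMessage :: String }
  deriving (Eq, Show)

-- | Unwrap every 'Try' field, attaching the record's key to any error.
convert :: CSVType -> Either ErrorType UsedType
convert row = do
  date <- first (ErrorType (csvKey row)) (tryValue (csvDate row))
  pure (UsedType (csvKey row) date)

main :: IO ()
main = do
  let rows =
        [ CSVType "a1" (Try (Right (Just 20170705)))       -- valid date
        , CSVType "a2" (Try (Right Nothing))                -- empty cell
        , CSVType "a3" (Try (Left "expected 8-digit date")) -- failed parse
        ]
      (errors, usable) = partitionEithers (map convert rows)
  mapM_ print errors   -- report the bad rows, keep processing the rest
  mapM_ print usable
```

The empty cell and the failed parse end up in different places: the former is a usable row with `Nothing`, the latter an `ErrorType` carrying the message.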
TL;DR: I want error messages; to use Either Field a I would have to re-parse them, and there is no corresponding Either String a for that (that could be used instead of wrapping it up, but it means that e.g. #116 couldn't be implemented).