go-ipld-prime icon indicating copy to clipboard operation
go-ipld-prime copied to clipboard

Review of traversal and "not found" errors

Open warpfork opened this issue 4 years ago • 1 comments

There should be a consistent and well-documented approach to errors around "not found" values, both when encountered by the traversal package, and where they arise from single-step moves across nodes (e.g. via the LookupBy{String|Index|Node|etc}() methods).

(Currently, this is not the case: the traversal package has untyped errors formed by string concatenation (yuck), and in individual nodes, I've done what "seems right" on a case-by-case basis, hoping that a single coherent picture would emerge after doing this for a while. (It has not.) Direct effort is now needed to clean this up and coherentify it.)

Part of the process of solving this involves deciding how many discrete "not found" concepts there are, and which of these are worth discrete error types (or if we should make a single error type with an enum field describing which kind of "not found" this one is, as a detail; or etc).

This issue is just going to enumerate scenarios I've discovered as relevant. Not all of them were obvious until I started writing code that encountered them, so this list is useful to gather in one place. (It's a long list -- sorry, reader. This is "exploration report"-style content moreso than problem-solving, so far, so it's inclusive, rather than simplifying.)

In rough order of increasing complexity of scenario:

  • there are "regular" "not found" errors from maps: looking up a key that simply isn't in them.
  • there are "regular" "not found" errors from lists: looking up an index that is beyond their size.
  • there may be "not found" errors from lists which are from negative indexes: does this deserve its own type?
  • there may be "not found" errors from lists that aren't purely from size: someone may be able to implement a "sparse list" via an ADL.
  • there are "will never be found" errors when using schemas:
    • struct fields that don't exist
      • there's currently a schema.ErrNoSuchField type for this, but it's possible that was a mistake and should be removed.
      • there's also an ipld.ErrInvalidKey type used for this. Inconsistently, vs the above, apparently :(
    • union members that don't exist (both at the type level, for all unions; and at the representation level, for keyed unions)
      • there's currently an ipld.ErrInvalidKey type which is used here.
      • there's comments which consider adding a schema.ErrInvalidUnionDiscriminant type for this, but it's possible that wouldn't help as much as it confuses.
      • another option could be to consider this an instance of schema.ErrEnumNotMatched? (Such an error type is also not yet present.)
  • there are absent values, which are "found" but are... well, absent.
    • only appears when schemas are in use, and at the type-level view
      • struct fields which are tagged optional can be in this state.
      • note that unions do not use absent for their non-present members. (that would imply that for their iterators to be logically consistent, the iterators would yield more than one entry... which does not generally feel helpful.)
    • for the most part this is settled now:
      • if a value is absent, it must not return a "not found" error; if a "not found" error is returned, the value must be nil and must not be 'absent'.
      • still worth keeping in your pool of considerations when thinking about (and documenting) the rest.
  • there are "not found" errors when using LookupBySegment on a map when the PathSegment is an int, or in the other direction, when a string segment is used on a list node.
    • ... or are these "wrong kind" errors? (This would be interesting, because it's the only time a "wrong kind" error would appear depending on an argument of the function rather than on the kind of the method receiver.)
  • what should be the expected behavior with LookupByNode? Should it attempt to coerce things? Or return "not found" errors? Or return a different type of error which points out the need to convert things manually?
    • several variations of this question:
      • on a plain map with string keys: should we try to coerce any argument to a string?
      • on a struct: should that do the same as a map?
      • on a typed map with complex keys: if we're given a string, should we try to process it into the complex key?
      • on a typed map with complex keys: what if we're given something that is neither the key type directly nor a string kind, has a string representation?
      • on a typed map with complex keys: what if we're given something that's not the key type, but happens to be structurally isomorphic (i.e. is also acts as a map at type-level, and has the same keys as the typed map's key type has fields)?
    • there's a review-needed comment in codegen about this (compare that behavior in maps to the current behavior in structs to see a concrete example of the question).
    • may be interesting to compare this to the "wrong kind" question about LookupBySegment, too: except "wrong kind" definitely wouldn't suffice here, since things can be right kind but wrong type with LookupByNode.
  • there are the "not found" errors returned by the traversal package: which are caused by one of the above, but also should report the path where the error was encountered
    • presumably these wrap the above errors
      • these may also need to include "wrong kind" errors as their cause -- how else would we describe the reason the traversal halted if the user asked for a long set of path segments, and at some point midway through we landed on a string instead of a map?
    • possible that these should also include other info from traversal.Progress like LastLink
    • ...or, should we just make room for this kind of info as optional fields in the core "not found" error type, and avoid needing a new type for traversal errors?

... and that should be about exhaustive.

We can try to break this down in the comments. In general I suspect this should probably collapse down to as few error types as possible, but haven't determined how exactly that should best be done.

warpfork avatar Sep 05 '20 13:09 warpfork

To try to answer some of the questions posed by this, my current thinking is:

  • Should the "will never be found" errors have a distinct type from "regular" "not found" errors?
    • :arrow_right: No. Cases where distinct types are used for this right now have been a mistake.
      • The increase in error handling complexity vs the payoff in explanation-to-user has not been cost-effective.
      • The theory that "will never be found" errors should be statically unreachable in some instances of sufficiently well-written code... doesn't really seem to come to a lot of useful impact. (If you're writing generic code that is not certain about the data content, these errors just increase your error handling complexity; and if you're writing code that is certain about the content, you A) might be using codegen, in which case you can cast to concretes and use methods that can talk to the compiler about the lack of these errors in the first place, or B) even the most basic unit tests should probably uncover the kinds of typos this would help against... and you'd need that much exercise to hit these anyway, regardless of whether the errors have distinctive types, so the utility of distinct types that demand distinctive handling is dubious at best.)
  • Should there be an enum for the various "will never be found" reasons, or should more types be used for this?
    • :arrow_right: We already voted against "more types", in the previous point, so "no" to that idea here as well.
    • :arrow_right: Yes, an enum sounds like it might work nicely for this. Making that enum a field of a single unified ErrNotFound type sounds like a reasonable way forward. (This would probably do better for "cost-effective explanatory" as mentioned in previous point.)
  • Should the ErrNotFound type have an optional spot for Path info, so that we don't need a separate type for traversal to use?
    • :arrow_right: Not sure. Very receptive more opinions on this.
      • There's no strong incentive for the traversal package not to introduce its own type that (consistently) wraps all this.
      • On the other hand, it also seems kinda parsimonious. If we manage to turn all the other various "not found" reasons into a single ErrNotFound, why not keep rolling?
      • This is a thought I just had while writing the above, so I reserve the right to change opinion in the future :) Brain-dumping here at the moment. Will cook on it for a while.

I don't know what to do with LookupByNode and the coercion questions, but also think those might turn out to be an ignorable tangent for figuring out the overall strategy here.

warpfork avatar Sep 05 '20 13:09 warpfork