Idris2 icon indicating copy to clipboard operation
Idris2 copied to clipboard

Support fallback to a non-foreign function if not implemented for the current codegen

Open gwerbin opened this issue 2 years ago • 4 comments

  • [x] I have read CONTRIBUTING.md.
  • [x] I have checked that there is no existing PR/issue about my proposal.

Summary

I propose that there should be a mechanism for the compiler to fall back to a non-foreign function if no foreign function is defined for the current backend.

For example:

module MyApp.Utils

import Data.Vect

%foreign "scheme,chez:idris-prim-listToVect"
prim__listToVect : List t -> (Nat, Vect n t)

listToVect : List t -> (n ** Vect n t)
listToVect ls =
  let (n, vs) = prim__listToVect {n = 0} ls
  in  MkDPair n (believe_me vs)

(Thank you to HTNW on Stackexchange for helping me with this believe_me incantation: https://stackoverflow.com/a/72075133/2954547)

Compilation of this module will fail if using any backend other than Chez. I propose that we should have a mechanism to specify a pure Idris function to fall back to, if using an unsupported backend.

Motivation

This is useful in cases where a native function is not required, but can or should be used, e.g. for performance reasons or for numerical stability of floating-point operations.

In the above example, you can implement a pure Idris version, at the cost of a linear scan over the data, which could be expensive if not optimized away by the backend:

listToVect : (xxs : List t) -> (n ** Vect n t)
listToVect [] = (0 ** [])
listToVect (x :: xs) = let (_ ** ys) = listToVect xs in (_ ** x :: ys)

0 prf_listToVect_sameLength : (xxs : List t) -> length xxs = fst (listToVect xxs)
prf_listToVect_sameLength [] = Refl
prf_listToVect_sameLength (x :: xs)
  with (prf_listToVect_sameLength xs) | 0 (listToVect xs)
  _ | inductionHypothesis | (k **ys) = cong S inductionHypothesis

(Many many thanks to Russoul and Ohad for their help with the proof!)

The above case is especially interesting because it demonstrates that you might want to be able to choose between "as fast and efficient as the backend allows" and "slower but proven correct".

The proposal

I can think of a few ways to approach this:

  1. Add a separate "pseudo-backend" called idris, which just contains the name of a function to fall back to.
  2. Add a separate %foreign-fallback directive, which specifies the name of a function to fall back to.
  3. If the function annotated with %foreign has a body/implementation, use that.

The 3rd option might be elegant, but it would maybe be weird to figure out how to use it in proofs.

The 1st option seems like it's the easiest to work with.

The 2nd option seems like a bit of a compromise, but then the user wouldn't need to write a string literal in the directive, e.g. %foreign-fallback Foo.Bar.Quux vs. %foreign "Foo.Bar.Quux".

Examples

module MyApp.Utils

import Data.Vect

listToVect : (xxs : List t) -> (n ** Vect n t)
listToVect [] = (0 ** [])
listToVect (x :: xs) = let (_ ** ys) = listToVect xs in (_ ** x :: ys)

0 prf_listToVect_sameLength : (xxs : List t) -> length xxs = fst (listToVect xxs)
prf_listToVect_sameLength [] = Refl
prf_listToVect_sameLength (x :: xs)
  with (prf_listToVect_sameLength xs) | 0 (listToVect xs)
  _ | inductionHypothesis | (k **ys) = cong S inductionHypothesis

%foreign "scheme,chez:idris-prim-listToVect"
         "idris:listToVect"
prim__listToVect : List t -> (Nat, Vect n t)

fastListToVect : List t -> (n ** Vect n t)
fastListToVect ls =
  let (n, vs) = prim__listToVect {n = 0} ls
  in  MkDPair n (believe_me vs)

%transform "listToVect" listToVect = fastListToVect

It would of course be helpful to use Hedgehog or similar to test that fastListToVect xs == listToVect xs over a wide variety of inputs.

Technical implementation

I'm not sure how difficult this would be. There might have to be some limitations imposed in order to prevent cyclical references, etc.

Alternatives considered

I couldn't come up with a workaround for this issue. I listed 3 possible solutions, but there might be better ideas out there.

Conclusion

I think this would help write performant Idris programs, especially when paired with %transform, while also easing the developer burden with respect to supporting different backends.

This could be useful when using e.g. the Node backend for debugging, but the Scheme backend for final compilation. There might also be times when someone wants to use the Racket backend (Windows?), but no Racket foreign function implementation exists.

It could also imagine use cases in distributing "scientific" Idris 2 libraries.

gwerbin avatar May 17 '22 03:05 gwerbin

prim__listToVect : List t -> (Nat, Vect n t)
fastListToVect : List t -> (n ** Vect n t)

These two type signatures are fundamentally different: in the first one the caller can pick the length n that the returned vector should have whereas in the second one the function is returning a length n together with a vector of that length.

gallais avatar May 17 '22 10:05 gallais

prim__listToVect : List t -> (Nat, Vect n t)
fastListToVect : List t -> (n ** Vect n t)

These two type signatures are fundamentally different: in the first one the caller can pick the length n that the returned vector should have whereas in the second one the function is returning a length n together with a vector of that length.

That was why I had to use believe_me. I think we discussed this in Discord, and I am probably being stubborn in using this particular example. As always, I like to learn better ways to do things!

Do you feel like that invalidates the general use case being proposed? It just happened to be the example I had in mind when I wrote the issue.

gwerbin avatar May 18 '22 02:05 gwerbin

As I said on Discord, I think chasing that particular use of the FFI is a massive premature optimisation. In the scheme backend, both List and Vect are probably represented using scheme lists and so fromList is essentially an identity function (and may even be entirely optimised away by Idris2). So the length-based solution should be a single traversal anyway.

So, yeah, I think this example is badly designed because:

  1. it suggests a bad understanding of the distinction between Pair and DPair, caller vs. callee
  2. it's probably not needed at all

As for the feature request, I have the impression it's a feature in search of use case rather than a use case in search of a feature. We can already replace compile-time definitions with different runtime FFI calls, just not conditionally on the backend.

gallais avatar May 18 '22 07:05 gallais

This is known as "polyfill" in browser.

One way to achieve this is to import different modules depending on backend. You may have

foo = if __backend__ == "scheme" then prim__foo else fallback__foo

__backend__ is just the backend string.

Where __backend__ == "scheme" is optimized away at compile time.

This is like builtin in Zig. It's a typed term with values defined at compile time. This is cleaner than %foreign, since you can use any language feature at compile-time.

For example, you have change the FFI definition depending on platform. You can use @extern in any scope to declare foreign function.

iacore avatar Jun 10 '22 18:06 iacore