Teg Confusing result

Love this paper :-) first time trying the code.

Here's my problem set up.

E = ½ ∫₀¹ ([x<a] - [x<t])² dx

∂E/∂t = t>a ? +1 : -1 = [0<t<1] - 2 [t<a]

I claim that ∂E/∂t = [0<t<1] - 2 [t<a]: if t is smaller (bigger) than a, then I grow my energy by making t even smaller (bigger). This is confirmed with finite differences (F.D.).

When I tried this in Teg, I wrote:

from teg import TegVar, Var, Teg, IfElse
from teg.derivs import FwdDeriv
from teg.eval.numpy_eval import evaluate
x, a, t = TegVar('x'), Var('a', 0.5), Var('t', 0.25)
expr = 0.5*Teg(0, 
               1,
               (IfElse(x<t,1,0) - IfElse(x<a,1,0))**2,
               x)
deriv_expr = FwdDeriv(expr, [(t, 1)])
print(evaluate(deriv_expr))

x, a, t = TegVar('x'), Var('a', 0.5), Var('t', 0.75)
expr = 0.5*Teg(0, 
               1,
               (IfElse(x<t,1,0) - IfElse(x<a,1,0))**2,
               x)
deriv_expr = FwdDeriv(expr, [(t, 1)])
print(evaluate(deriv_expr))

Which prints

-1.0
0

I'm confused by the 0 which I expected to be 1.0

cc @squidrice21

Mar 27 '24 03:03 alecjacobson

Hi Alec,

I don’t think I’m deep enough in the code to fix it up; maybe Jesse, Sai or Kevin have better insight. We expect that there are various practical bugs and probably also conceptual bugs in the project. For instance, there is at least one significant error in the paper w.r.t. diffeomorphisms and the ability to perform repeated derivatives. (I don’t think that’s involved here, but just to give an example)

— Gilbert

Mar 27 '24 04:03 gilbo

(Previous comment was a blanket disclaimer)

If I look specifically at the code you wrote, this expression ([x<a] - [x<t])² may lead to a serious issue because it leads to products of degenerate conditionals, e.g. [x<a][x<a] or [x<t][x<t]. While in principle these might be detected and simplified to a single copy of said condition, in general it is not possible to detect all geometric degeneracies between discontinuities of functions. In the theory bit of the paper, the claim of correctness is made conditional on the absence of such degeneracies (stated as a transversal intersection condition between manifolds of discontinuity).
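To make the degenerate products explicit, the square can be expanded term by term. This is only a worked expansion of the expression from the original post, relying on the fact that an indicator takes values in {0, 1} away from its jump:

```latex
\begin{aligned}
\big([x<a] - [x<t]\big)^2
  &= [x<a]\,[x<a] \;-\; 2\,[x<a]\,[x<t] \;+\; [x<t]\,[x<t] \\
  &= [x<a] \;-\; 2\,[x<a]\,[x<t] \;+\; [x<t]
  \qquad \text{(almost everywhere, since } [c]^2 = [c] \text{)}
\end{aligned}
```

The first line contains the self-products [x<a][x<a] and [x<t][x<t]. The second line has none, only a product of two different conditionals whose jump locations are distinct whenever a ≠ t.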

If you have thoughts on ways to address the degeneracy problem, that would definitely be a worthwhile path forward from the Teg paper. When we looked into the Schwartz distribution theory around this, we realized it was a pretty deep problem. Multiplying distributions is not well-defined in general (https://en.wikipedia.org/wiki/Distribution_(mathematics)#Problem_of_multiplying_distributions). We spent some time looking at Colombeau’s approach to this issue, but didn’t manage to find a way to apply those ideas. (I’m not sure we found them ultimately satisfying either.)

— Gilbert

Mar 27 '24 04:03 gilbo

Thanks, Gilbert. What's a good rule I can follow to be sure that I provide valid input to Teg? No nested IfElse?

Mar 27 '24 13:03 alecjacobson

Just to verify the issue with products of degenerate conditionals, I manually expanded the square and simplified each self-product (e.g. [x<t][x<t] to [x<t]), which gives:

from teg import TegVar, Var, Teg, IfElse
from teg.derivs import FwdDeriv
from teg.eval.numpy_eval import evaluate

x, a, t = TegVar('x'), Var('a', 0.5), Var('t', 0.25)
expr = 0.5*Teg(0, 
               1,
               IfElse(x<t,1,0) - 2 * IfElse(x<t,1,0) * IfElse(x<a,1,0) + IfElse(x<a,1,0),
               x)
deriv_expr = FwdDeriv(expr, [(t, 1)])
print(evaluate(deriv_expr))

x, a, t = TegVar('x'), Var('a', 0.5), Var('t', 0.75)
expr = 0.5*Teg(0, 
               1,
               IfElse(x<t,1,0) - 2 * IfElse(x<t,1,0) * IfElse(x<a,1,0) + IfElse(x<a,1,0),
               x)
deriv_expr = FwdDeriv(expr, [(t, 1)])
print(evaluate(deriv_expr))

The script now gives the correct results:

-0.5
0.5

So a solution within the current Teg scope is to manually simplify such conditionals.
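As an independent sanity check that does not use Teg at all, the values -0.5 and 0.5 can be compared against a plain finite-difference estimate. The following is a minimal sketch: the function names, the midpoint quadrature, and the step size `eps` are all choices made here for illustration, not part of Teg.

```python
def energy(t, a, n=100_000):
    """Midpoint-rule estimate of E = 0.5 * integral_0^1 ([x<a] - [x<t])^2 dx."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        d = (1.0 if x < a else 0.0) - (1.0 if x < t else 0.0)
        total += d * d
    return 0.5 * total * h


def energy_closed_form(t, a):
    # The squared difference equals 1 exactly between a and t,
    # so E = 0.5 * |t - a| (valid for a and t inside [0, 1]).
    return 0.5 * abs(t - a)


def central_difference(t, a, eps=1e-2):
    # Finite-difference estimate of dE/dt; eps is an arbitrary choice.
    return (energy(t + eps, a) - energy(t - eps, a)) / (2.0 * eps)


for t in (0.25, 0.75):
    print(t, round(central_difference(t, 0.5), 3))
```

The closed form E = ½|t - a| has slope -½ for t < a and +½ for t > a, which is what the finite-difference estimate approximates.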

Mar 29 '24 16:03 squidrice21

Hi @alecjacobson and @squidrice21,

Thank you for the wonderful question.

Why is the answer wrong and what's a way to check this?

The derivative violates the transversality condition and therefore there is no guarantee of correct results.

In particular, when there's a multiplication of the same condition in the derivative, the answer may be wrong. So the problem is that when you differentiate f([t > x]) you get f'([t > x])delta(t - x), which will evaluate the condition at exactly the location of the jump. If you're theoretically inclined, the reason is that Leibniz's product rule only holds for distributions that satisfy the transversality condition.

In general, that's how I think about the problem. If Teg evaluated at a jump, then the derivative might be wrong.
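As a worked instance of this, take f(u) = u² and the single conditional [x < t] with 0 < t < 1 (a simplified version of the example in this thread, chosen here for illustration). Since [x<t]² = [x<t] almost everywhere, the integral is just t and the true derivative is 1, whereas the naive chain rule has to evaluate the conditional at x = t:

```latex
\begin{aligned}
\frac{\partial}{\partial t}\int_0^1 [x<t]^2 \, dx
  &= \frac{\partial}{\partial t}\int_0^1 [x<t] \, dx
   = \frac{\partial}{\partial t}\, t = 1
  && \text{(true value)} \\
\int_0^1 f'([x<t])\,\delta(t-x)\,dx
  &= \int_0^1 2\,[x<t]\,\delta(t-x)\,dx
   = 2\,[t<t] = 0
  && \text{(naive chain rule, strict inequality)}
\end{aligned}
```

With the non-strict convention [t ≤ t] = 1 the naive rule would give 2 instead, so neither convention recovers the true value 1.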

How can a user/compiler systematically resolve this problem?

The key idea is to pull the conditional outside of the composition prior to differentiation: f([t > x]) = [t > x]f(1) + [t <= x]f(0). The derivative is then delta(t - x)f(1) - delta(x - t)f(0), which I believe is correct.

I'd call this hoisting conditionals. Currently, it's on the programmer to do this, but this process could be automated. Some care would need to be taken to avoid exponential explosion when hoisting terms like f([t > x] + [t + 1 > x]).

I'm happy to think about it more/help out. Feel free to lmk if you have more questions
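The hoisting identity can be checked on the example from this thread without Teg. The following is a minimal pure-Python sketch: all function names are made up for illustration, and the delta function is handled by hand rather than by any library. It assumes t ≠ a so that [x<a] is continuous at the jump x = t.

```python
def indicator(condition):
    """Iverson bracket: 1.0 if the condition holds, else 0.0."""
    return 1.0 if condition else 0.0


def f(u):
    """The outer function from the thread's example (squaring)."""
    return u * u


def original_integrand(x, t, a):
    # f applied directly to a parametric discontinuity: f([x<t] - [x<a])
    return f(indicator(x < t) - indicator(x < a))


def hoisted_integrand(x, t, a):
    # Hoist [x<t] out of f: f([x<t] - s) = [x<t] f(1 - s) + [x>=t] f(0 - s)
    s = indicator(x < a)
    return indicator(x < t) * f(1.0 - s) + indicator(x >= t) * f(0.0 - s)


def hoisted_derivative(t, a):
    # d/dt of 0.5 * integral over [0, 1] of the hoisted integrand.
    # d/dt [x<t] = delta(t - x) and d/dt [x>=t] = -delta(t - x), so the
    # integral collapses to (f(1 - s) - f(0 - s)) evaluated at x = t.
    if not 0.0 < t < 1.0:
        return 0.0  # the jump lies outside the integration domain
    s = indicator(t < a)  # assumes t != a
    return 0.5 * (f(1.0 - s) - f(0.0 - s))


for t in (0.25, 0.75):
    print(t, hoisted_derivative(t, 0.5))
```

For a = 0.5 this yields -0.5 at t = 0.25 and 0.5 at t = 0.75, matching the values reported above for the manually simplified Teg program. Note that f is only ever evaluated at constants on either side of the jump, never at the jump itself.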

Mar 29 '24 17:03 martinjm97

Please also note that determining whether or not two conditionals are degenerate is undecidable in the general case. So “can be automated” should be taken with a grain of salt here.

Mar 29 '24 17:03 gilbo

@gilbo I'm talking explicitly about the case of f([x < t]), where the degeneracy arises from the chain rule. I'm not talking about taking a pair of arbitrary conditionals and checking if they're transverse.

I believe the former case can be automated while the latter case is not computable.

This is not a complete solution to all possible degeneracies, but it is a resolution for a class of degeneracies that seem to show up.

Mar 29 '24 17:03 martinjm97

@alecjacobson,

To answer your question directly:

What's a good rule I can follow to be sure that I provide valid input to Teg? No nested IfElse?

For the current implementation, don't put parametric discontinuities as an input to a function (e.g. the squaring function in your example). Luckily, there's a manual rewrite for this case that can be automated (hoisting conditionals).

Mar 29 '24 17:03 martinjm97
