ChainRulesCore.jl icon indicating copy to clipboard operation
ChainRulesCore.jl copied to clipboard

Don't allow Tangent{T} where T is an Array or a primative type?

Open oxinabox opened this issue 4 years ago • 8 comments

Similar in spirit to #495 . Tangent is for composite objects. So maybe we shouldn't let people try and put objects that are not make up of fields inside it.

Potentially we could detect this with feildcount or isprimativetype + isa Array

julia> fieldcount(typeof([1]))
0

julia> fieldcount(typeof(1))
0

julia> isprimitivetype(typeof([1]))
false

julia> isprimitivetype(typeof(1))
true

oxinabox avatar Oct 19 '21 12:10 oxinabox

Is this what we want? Specifically, I don't see why someone shouldn't make a new type T "primitive" in our system (by doing stuff with getfield or whatever), and use a Tangent{P, T} for some type T for which isprimitivetype(T) == true to represent its tangents.

Tangent is for composite objects.

Phrased differently, why do we require this? I can't immediately see the harm in not requiring it 🤷 .

willtebbutt avatar Oct 19 '21 12:10 willtebbutt

In julia if you have isprimativetype(T) == true then you can't do getfield, primative types (and Arrays (but not other AbstractArrays)) do not have fields. You might be able to do getproperty if you define it.

Tangent is for composite objects.

Phrased differently, why do we require this? I can't immediately see the harm in not requiring it 🤷 .

We kinda can't not. The Tangent{T} structural tangent type, is a mapping from the field-names of the primal to their tangent values. Since primatives (and Arrays ) have no fields, we can't do that.

Something else can be made to work like just making up fieldnames and manipulating them directly, but at that point a lot of the machinery is getting bypassed. Might as well just go full custom tangent type.

Most primatives Bool, Int, UInt have NoTangent(), or are their own tangent which is natural (and arguably also structural in the sense of matching the structure), like Float64 (and Array also is in this category).

oxinabox avatar Oct 19 '21 13:10 oxinabox

In julia if you have isprimativetype(T) == true then you can't do getfield, primative types (and Arrays (but not other AbstractArrays)) do not have fields.

Sorry, I'm talking about two distinct notions of primitives here. There's what Julia calls a primitive type (Base.isprimitivetype tells us about this), and there's what we call a primitive type for the sake of AD. Perhaps we should pick a different name for primitives in AD to avoid this naming overload.

We haven't written down a precise definition of what we call a "primitive" for the sake of AD, but I'm taking it to be anything where we manually specify the tangent type, rather than derive it recursively from its fieldnames. I think these can be divided into exactly two distinct categories:

  1. Anything that must be an AD primitive because we can't derive a tangent type automatically from the primal type. I believe the only types for which this occurs are what Julia calls primitive.
  2. Anything that we choose to be an AD primitive. Dicts are our only example of this, but Complex should probably also be.

Note that this means that

  1. "Base.isprimativetype(P) == true" implies "P is a primative for AD", but
  2. "P is a primative for AD" does not imply that "Base.isprimativetype(P) == true".

I'm ignoring the natural vs structural discussion here, which to my mind is something a bit different, so I don't believe it's relevant to this discussion.

The Tangent{T} structural tangent type, is a mapping from the field-names of the primal to their tangent values.

Well that's how we treat it most of the time. But it's not strictly the case all of the time -- Dicts are a good example here.

Since primatives (and Arrays ) have no fields, we can't do that.

Indeed -- I mean I'm thinking of an Array as being a primitive in the AD sense, as opposed to the Base.isprimitivetype sense. Hopefully the above clarifies.

Something else can be made to work like just making up fieldnames and manipulating them directly, but at that point a lot of the machinery is getting bypassed. Might as well just go full custom tangent type.

I think this is essentially what I was suggesting. Clearly, you can't expect the default implementations to work in a Tangent{P, T} if Base.isprimitivetype(T) == true, but if you implement the interface (+, ::Number * T, etc) everything should work fine I think.

Most primatives Bool, Int, UInt have NoTangent(), or are their own tangent which is natural (and arguably also structural in the sense of matching the structure), like Float64 (and Array also is in this category).

Indeed. I think the most general way to think about this (partly repeating what you're saying) is that

  1. any P for which Base.isprimitivetype(P) == true must manually define a tangent type T,
  2. often it will be NoTangent,
  3. sometimes it will be P,
  4. but there's no reason it couldn't be literally any other type in existance provided that it implements + and * (and any other functions that I've missed that we require tangents to implement).

willtebbutt avatar Oct 19 '21 14:10 willtebbutt

Sorry, I'm talking about two distinct notions of primitives he

We are not talking about AD Primatives here though, there is no relevance.

You seem to have wandered off the topic here.

Well that's how we treat it most of the time. But it's not strictly the case all of the time -- Dicts are a good example here.

Dict is not a primative type (in the julia sense). So we are making no statement in this issue about Dict, and what should or should not be allowed.

This is an issue only about making things like Tangent{Int}(;), and Tangent{Float64}(;) and Tangent{Array{Float64, 1}}(;) throw errors. Rather than giving back an object that is very weird in it's behavior.

This is a small little issue, not a board sweeping discussion about AD primatives and what it means to be a tangent. We have other issues about those, lets not fragement discussion further.

oxinabox avatar Oct 19 '21 14:10 oxinabox

there is no relevance.

I disagree. If you want to answer the question "should we allow Tangent{P} for some Julia primitive P?", you have to know whether or not it's a legitimate thing to do in the first place.

I'm saying that I can see no good reason to prevent someone from declaring "my Julia-primitive type P will have tangents of type Tangent{P}, and I'm going to ensure that everything is defined as it ought to be to make this work out", so I don't think we should disallow it. To establish this, I felt I needed to talk about what we mean by an AD primitive, because to my mind a thing is an AD primitive precisely if its tangent type is manually defined.

I agree it wound up being long winded.

This is an issue only about making things like Tangent{Int}(;), and Tangent{Float64}(;) and Tangent{Array{Float64, 1}}(;) throw errors.

I agree that these examples would ideally throw errors. The question is why. It is because

  1. they are primitive Julia types, or
  2. they're primitive within our AD system and we've defined their permissible tangent types to be something other than Tangent{P}?

To my mind, 2 is the important thing. 1 is an implementation detail that happens to be correlated with 1 in practice.

Dict is not a primative type (in the julia sense). So we are making no statement in this issue about Dict, and what should or should not be allowed.

My reasoning for dragging Dicts into this is:

  1. I think of Dicts as being AD primitives
  2. Anything that is a Julia primitive is also an AD primitive
  3. 1 + 2 => Dicts and Base.isprimitivetype types are the same kind of thing when we're talking about tangent types
  4. Dicts use Tangents in a weird slightly non-standard way to represent their tangents, so it doesn't seem like much of a leap to me that other AD primitives might like to use Tangents for their tangents
  5. Tangent{P} seems like a good name for the type of a tangent of P
  6. 3 + 4 + 5 => if you give me a primitive, I might want to make Tangent{P} its tangent.

If you're not convinced by this line of reasoning, that's of course fine.

Anyway, I agree that Tangent{Float64} should really not be a thing, so I would be fully in favour of explicitly preventing things like this from being constructed for specific types where we know that we don't want this. I would not be in favour of doing it on the basis of Base.isprimitivetype for the above reasons. Fortunately, there aren't many isprimitivetype types, so this won't be a massive effort.

willtebbutt avatar Oct 19 '21 15:10 willtebbutt

This is an issue only about making things like Tangent{Int}(;), and Tangent{Float64}(;) and Tangent{Array{Float64, 1}}(;) throw errors.

I agree that these examples would ideally throw errors. The question is why. It is because

1. they are primitive Julia types, or
2. they're primitive within our AD system and we've defined their permissible tangent types to be something other than Tangent{P}?

I would argue it is 1. It is 1. Because they are not objects with fields, and thus can't be represented with Tangent{T}. Because Tangent{T} operations are not defined for objects without fields.

"my Julia-primitive type P will have tangents of type Tangent{P}, and I'm going to ensure that everything is defined as it ought to be to make this work out"

I mean I can see where you are coming from. I think it is worth remembering declaring a custom primative type is an obscure feature.

To my knowledge there are 4 places in the ecosystem that declare primative types.

  1. Base uses it to define some types (and fakes using it from C for others)
  2. BitIntegers.jl uses it to define larger integer types
  3. InlineStrings.jl uses it to have a blob of heap memory that it keeps a string on.
  4. DexCall uses it to have a blob of memory that it can reinterpret in order to represent a TaggedUnion for purposes of C API compatibility

None of these use case are at all like Dict or like struct or Tuple (and I am not really that sure that something like that could be made that uses primative, I guess in theory someone might build some kind of dynamicly bit-packed struct that holds things of different sizes but I don't thing they would: parametric structs get so close to that anyway.)

Dictionaries are kinda a special snowflake of their own. It is basically a happy coincidence that we can represent them using a Tangent. It works because they basically act like a composite data type that has variable number of fields. Which means making it work as Tangent lets us save a lot of code -- we don't have to write This is not the case for other related data structures like tries for example. It is the case for Dictionaries.jl's types though. But those are not a primative types. So this proposal would not affect them.

They would however run into the requirement that they are backed by NamedTuples since we have done #495 but we can deal with it when/if it comes up. (Similarly, I am pretty sure there are a number of cases where Tangent{<:Dict} doesn't work right, but there are no AD systems, nor rules using it AFAIK)

oxinabox avatar Oct 19 '21 15:10 oxinabox

I would argue it is 1. It is 1. Because they are not objects with fields, and thus can't be represented with Tangent{T}. Because Tangent{T} operations are not defined for objects without fields.

This doesn't make sense to me.

There is nothing stopping me from implementing +(::Tangent{P}, ::Tangent{P}) and *(::Number, ::Tangent{P}) for my primitive type P, over-riding all of the defaults, or am I missing something?

That being said, if you want to outlaw types for which Base.isprimitivetype is true from using Tangent{P}s as their tangents because they don't fit our preferred semantics for Tangents, as opposed to because they couldn't be made to work on a technical level, that's fine by me. As you say, it's an obscure feature, and it's never going to be the case that someone who implements a new primitive Julia type needs to use Tangent, so this wouldn't ever be a show stopper.

willtebbutt avatar Oct 19 '21 16:10 willtebbutt

Also, it would be extremely poor form to do what I'm suggesting someone might want to do, since you really ought not change the semantics of a function defined on a type when adding specialised methods for different type parameters.

So I think you're right. We should ban it.

willtebbutt avatar Oct 19 '21 16:10 willtebbutt