cubical icon indicating copy to clipboard operation
cubical copied to clipboard

Rewrite category theory code to separate structure and properties

Open mortberg opened this issue 2 months ago • 6 comments

I think the category theory in the library should be rewritten to match the way the algebra code is written. In particular, data and properties should be separated so that we have for example:

  • CategoryStr = Category structure
  • isCategory = The properties for when a category structure is a category.
  • Category = Sigma type of CategoryStr and isCategory.

I believe this could both speed things up (e.g. isCategory can pretty much always be made abstract) and we can be more careful when passing things around, e.g., functors should only depend on CategoryStr, not isCategory. This should also make Agda better at inferring things for us (issues like Agda loosing track of categories having hom-sets is less likely to occur if less things depend on it, etc...). It's also generally desired to have category theoretic structures match the way the algebraic structures are implemented.

As part of this we should also be more careful about using no-eta-equality consistently (so finishing https://github.com/agda/cubical/pull/1247 and resolving https://github.com/agda/cubical/issues/1240).

I would also like to clean up the category theoretic notations when doing this. My experience is that Agda often gets confused whether it should display a syntax or the name of the operation, leading to unreadable goals with strange mixes of notations and mixfix operations as well as unpredictable behavior when using various levels of C-u. Hopefully this gets better if we remove some syntax declarations. I also strongly dislike some of the non-standard notation like _⟅_⟆.

What do you think about this @maxsnew @ecavallo @felixwellen and everyone else who is using the CT part of the library?

mortberg avatar Oct 23 '25 07:10 mortberg

If we do this, then wild categories should also share the same CategoryStr (since their only difference, hom-types being sets, is property), so we could reuse things, for example since functors and natural transformations would only depend on CategoryStr we would not need unnecessary duplication of WildFunctor WildNatTrans and WildMonad

anshwad10 avatar Oct 23 '25 07:10 anshwad10

If we do this, then wild categories should also share the same CategoryStr (since their only difference, hom-types being sets, is property), so we could reuse things, for example since functors and natural transformations would only depend on CategoryStr we would not need unnecessary duplication of WildFunctor WildNatTrans and WildMonad

Good point! That's yet another good argument for this refactoring

mortberg avatar Oct 23 '25 07:10 mortberg

I agree, with making it more like the algebraic structures and adding no-eta-equality (which the algebraic structures should also have - I tried, it was tough to change for basic structures). I think that properties should be generally made opaque, which I implemented in some rewrite of CommAlgebras and made good experiences with (some cubical hacks we shouldn't do anyway stopped working though). Concerning notation: Quite a while ago, I made an experiment for decoupling notation from (algebraic) structures. I think this is conceptually the right way, but at the time agda's instance search wasn't ready for that. It might be now - there have been major improvements. I think it would be worth it to repeat this experiment. The basic idea is, that you add an layer of abstraction (this might become an annoyance - I don't know) that has something like things which provide a "+". And then you add instances that turn rings into things that provide "+" for example. Here is the old PR which doesn't contain much, but you can see what I had in mind: https://github.com/agda/cubical/pull/1037 Same tec would allow us to just write for all kinds of morphisms and $ for functors as well as for functions and morphisms of algebraic structures.

felixwellen avatar Oct 23 '25 07:10 felixwellen

I'm not that familiar with the style you are proposing, it sounds nice based on your description, but for concreteness can you point to some of the algebra code that is in this style already? I do think it would be a massive change, especially for me because we have a lot of code downstream in https://github.com/um-catlab/cubical-categorical-logic . So I'm not excited to do it myself but if we agree there are clear benefits then I'm ok with it.

In our cubical-categorical-logic code we are converging to a style where we try to make the properties opaque as much as possible, and so this distinction you are proposing would make that easier, as you can just make the isCategory part a top-level definition and make that opaque whereas now you have to resort to metaprogramming or make each law a separate opaque definition.

Regarding notation, I'm not sure what you are proposing. In cubical-categorical-logic, we have switched to a very module-centric style of just using infix operators defined in a module e.g., for a composition we would use f C.⋆ g and pretty much never f ⋆⟨ C ⟩ g. I checked and we the only custom syntax declarations we define are pretty much not used anymore, so I think I agree with this change as well.

maxsnew avatar Oct 23 '25 11:10 maxsnew

I misremembered how the algebra code is structured, but it's similar to what I described above: https://github.com/agda/cubical/blob/master/Cubical/Algebra/Group/Base.agda#L36

So for algebraic structures we first have the properties in a record parameterized by the (unbundled) operations. Then everything is packaged in a structure and then Sigma'd with the carrier. This way we can at least more easily make the properties opaque/abstract. I am not sure if the algebra way is the best for CT, but I would be happy to experiment with it.

Another question is where the isSet assumption should go? If we want to maximize code reuse a category should be a wild category with hom-sets. Maybe this is the way to go? (In algebra another funny issue comes up: consider vector spaces, if you do it naively as a field acting on an abelian group, then you will have two isSet proofs carried around...)

Regarding notation: something like what @felixwellen suggests for overloaded notations is probably the way to go in the long run, but I'm not experienced enough to do it myself. What you're doing in cubical-categorical-logic seems much better and predictable than what we do in the library now (i.e. use various syntax declarations), so I suggest getting rid of the syntax declarations and handle ambiguous notations using modules instead. It's going to be more verbose than what Felix has in mind, but at least it's very predictable and easy to understand and control.

mortberg avatar Oct 24 '25 07:10 mortberg

FYI the bundling of algebraic structures in the library was inspired by agda-stdlib: https://github.com/agda/agda-stdlib/blob/master/src/Algebra/Bundles.agda#L450

I had a look and the agda-categories library does not do something like this for their categories (interestingly they do have eta-equality explicit though): https://agda.github.io/agda-categories/Categories.Category.Core.html#442

The code in the 1lab is similar (but no eta): https://github.com/the1lab/1lab/blob/main/src/Cat/Base.lagda.md#L32

The agda-unimath code uses nested Sigmas, but they don't seem to separate data and properties either: https://github.com/UniMath/agda-unimath/blob/master/src/category-theory/precategories.lagda.md#L97

The CT library I'm most used to is the Rocq UniMath library where data and properties are separate (but they use nested Sigma instead of records). Note that they have wild categories (precategories) as a separate definition and then add the hom-set assumption separately: https://github.com/UniMath/UniMath/blob/master/UniMath/CategoryTheory/Core/Categories.v#L160

Maybe the Rocq UniMath way is the way to go for this library? At least I found it very nice to use (despite nested Sigmas) and somewhat predictable performance as one could consistently make properties opaque. They have eta enabled.

Maybe we should keep eta and see if this refactoring leads to speedup first? Having C^op^op=C definitionally is really nice for avoiding code duplication for dualization of concepts.

mortberg avatar Oct 24 '25 07:10 mortberg