
Infrastructure for UDF support in column defaults

Open hanefi opened this issue 3 years ago • 1 comments

We do not want to pass column defaults to a worker node if they contain a UDF that may be absent on the worker nodes.

This PR adds the infrastructure that is needed to re-enable setting column defaults on the worker nodes when we extend distributed object creation to support UDFs on metadata-synced nodes.

Related: #3851

TODO:

  • [ ] Fix broken tests
    • [ ] Queries sent to worker nodes no longer contain DEFAULT expressions in many files
    • [ ] The behaviour of MX is slightly different due to the MX nodes not having the column defaults
  • [ ] Fix cache reference leaks

hanefi avatar Jul 30 '21 07:07 hanefi

High-level question: Are there (still) meaningful distinctions between default 3+3, default fn(), and default nextval('...') that would make us want to propagate one but not the others to shards? Or could we treat it as "include defaults" vs. "do not include defaults"?

  • default 3+3 can be executed on any node
  • default fn() may rely on the existence of the function's definition. Some classes of such functions also require metadata to exist on the worker nodes, and hence an extension to the distributed object propagation mechanism is needed. I am working on the design for that one. See https://github.com/citusdata/citus/issues/3064 for a function that reads from a reference table and can be used as a column default.
  • default nextval() depends on the existence of the sequence. We do not rely on metadata here for now.
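
To make the three cases above concrete, here is a rough Python sketch of how a default expression could be sorted into those buckets before deciding whether to ship it to a worker. This is not Citus code: `DefaultKind`, `classify_default`, and `should_propagate` are hypothetical names, and the regex matching is a deliberate simplification (it treats any function call, including built-ins, as a possible UDF and does no real SQL parsing).

```python
import re
from enum import Enum


class DefaultKind(Enum):
    """Hypothetical buckets matching the three cases in the list above."""
    CONSTANT = "constant"   # e.g. 3+3, can run on any node
    FUNCTION = "function"   # e.g. fn(), needs the function on the worker
    SEQUENCE = "sequence"   # e.g. nextval('...'), needs the sequence


# Simplified patterns: assumptions for illustration, not a SQL parser.
_NEXTVAL = re.compile(r"\bnextval\s*\(", re.IGNORECASE)
_CALL = re.compile(r"\b[A-Za-z_][A-Za-z0-9_.]*\s*\(")


def classify_default(expr: str) -> DefaultKind:
    """Classify a column default expression by the object it depends on."""
    if _NEXTVAL.search(expr):
        return DefaultKind.SEQUENCE
    if _CALL.search(expr):
        return DefaultKind.FUNCTION
    return DefaultKind.CONSTANT


def should_propagate(expr: str, dependency_on_worker: bool = False) -> bool:
    """Propagate constants always; others only if the dependency exists."""
    kind = classify_default(expr)
    return kind is DefaultKind.CONSTANT or dependency_on_worker


if __name__ == "__main__":
    for expr in ("3+3", "fn()", "nextval('my_seq')"):
        print(expr, classify_default(expr).value, should_propagate(expr))
```

If the answer to the high-level question turns out to be "no meaningful distinction", the classifier collapses to a single include/exclude flag and only `should_propagate` would remain.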

hanefi avatar Aug 02 '21 08:08 hanefi

This was already completed in a different PR.

hanefi avatar Sep 23 '22 13:09 hanefi