Python wrappers for source creation

Open pyalex opened this issue 5 months ago • 5 comments

Summary

Adds functions to the Python SDK for creating source objects. This is a fully backward-compatible change: users can continue to use the Thrift-based classes alongside the new Python wrappers.

Why / Goal

The primary motivation is to allow extra attributes at the source level. This works the same way as in GroupBy and Join: all extra arguments are stored in the customJson attribute in Thrift. Sources can carry all sorts of metadata, e.g. bootstrap.server for a Kafka source, which can be helpful for a streaming job.
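A minimal sketch of the idea, not the code in this PR: the dataclass below is a stand-in for the Thrift-generated EventSource (assumed to carry a customJson string field), and event_source and bootstrap_server are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ThriftEventSource:
    """Stand-in for the Thrift-generated EventSource (not the real class)."""
    table: Optional[str] = None
    topic: Optional[str] = None
    customJson: Optional[str] = None


def event_source(table: str, topic: Optional[str] = None, **kwargs: Any) -> ThriftEventSource:
    """Hypothetical wrapper: any extra keyword arguments are serialized into customJson."""
    custom_json = json.dumps(kwargs, sort_keys=True) if kwargs else None
    return ThriftEventSource(table=table, topic=topic, customJson=custom_json)


# Extra metadata travels with the source definition.
src = event_source(
    table="namespace.events",
    topic="events_topic",
    bootstrap_server="localhost:9092",  # hypothetical extra attribute
)
print(src.customJson)
```

A streaming job could then read the metadata back with json.loads(src.customJson).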

Additional benefits:

  • Less verbose API. Before:

my_source = ttypes.Source(
    events=ttypes.EventSource(
        table=...
    )
)

after:

my_source = source.EventSource(
    table=...
)
  • Improved API consistency: existing Python wrappers (e.g., GroupBy, Join) use Pythonic snake case for parameter names, whereas code generated from Thrift uses camel case (e.g., snapshotTable in EntitySource)
  • Omitting a required attribute will produce a more meaningful error
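The last two points can be sketched as follows. This is an illustration under assumptions, not the wrapper from this PR: ThriftEntitySource is a stand-in with only the snapshotTable field, and entity_source is a hypothetical function that treats snapshot_table as required.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThriftEntitySource:
    """Stand-in for the Thrift-generated EntitySource; other fields omitted."""
    snapshotTable: Optional[str] = None


def entity_source(*, snapshot_table: str) -> ThriftEntitySource:
    """Hypothetical wrapper: snake_case parameter mapped to the camelCase Thrift field."""
    return ThriftEntitySource(snapshotTable=snapshot_table)


# Thrift-generated constructors default every field to None, so a missing
# field can go unnoticed at construction time. A required keyword argument
# fails immediately and names the missing parameter.
try:
    entity_source()
except TypeError as err:
    print(err)

src = entity_source(snapshot_table="namespace.users")
```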

Test Plan

  • [ ] Added Unit Tests
  • [x] Covered by existing CI
  • [ ] Integration tested

Checklist

  • [ ] Documentation update

Reviewers

pyalex • Jul 18 '25 17:07

Hey @nikhil-zlai, thanks for the review! There are more uses for those extra attributes than just the Kafka host and port. For example, I want to store the Avro JSON schema next to the source definition and attach it to the source, or specify all kinds of Kafka consumer properties.

TopicInfo is of limited use here because it treats / and = as special characters: if I were to add anything base64-encoded to the topic string, it would simply break.
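To illustrate the collision, here is a simplified, hypothetical parser for a "name/key=value" topic string. It is not Chronon's actual TopicInfo implementation, only a sketch of why the standard base64 alphabet (which includes / and uses = for padding) does not survive such a format.

```python
import base64
from typing import Dict, Tuple


def parse_topic(topic: str) -> Tuple[str, Dict[str, str]]:
    """Hypothetical parser: '/' separates segments, '=' separates key from value."""
    name, *pairs = topic.split("/")
    params: Dict[str, str] = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        params[key] = value
    return name, params


# Two 0xff bytes encode to a string containing both '/' and '='.
payload = base64.b64encode(b"\xff\xff").decode("ascii")

name, params = parse_topic(f"events/schema={payload}")
print(payload)
print(params)  # the 'schema' value no longer matches the payload
```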

pyalex • Jul 18 '25 18:07

Updated docs

pyalex • Jul 18 '25 20:07

I want to store the Avro JSON schema near the source definition and attach it to the source. Or specify all kinds of Kafka consumer properties.

I see. That definitely justifies the change.

nikhil-zlai • Jul 18 '25 22:07

@hzding621, please take another look

pyalex • Jul 23 '25 15:07

Ping @hzding621

pyalex • Aug 04 '25 19:08