ksql
docs: klip-35 dynamic processing log levels
KLIP for dynamically setting the processing log level
NOTE: given some feedback to implement these APIs as a virtual table rather than a REST API, I've rewritten this KLIP assuming an implementation of system tables in ksqlDB, which doesn't actually exist yet. Pushing that out to a separate KLIP (KLIP-37).
+1 to Almog's suggestion from a conceptual standpoint, though I have a couple of concerns:
- The table won't be materialized, which means pull queries such as `SELECT level FROM processing_log_configs WHERE logger = 'root';` won't work. Requiring the user to add `EMIT CHANGES;` on the end doesn't feel right since it should really be a pull query. We could make this config topic a special snowflake by materializing it, but that seems like it would confuse users when other tables they create aren't materialized.
- The user should have a way to find all keys / see all rows from the table. ksqlDB doesn't support this today, since pull queries only support single-key lookups. Again, we could require users to add `EMIT CHANGES;` but this doesn't feel right.
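A sketch of both concerns as queries, assuming the `processing_log_configs` name and the `logger`/`level` columns used above:

```sql
-- Concern 1: the natural point-in-time lookup is a pull query,
-- which fails because the table isn't materialized:
SELECT level FROM processing_log_configs WHERE logger = 'root';

-- The workaround is a push query, which streams changes rather than
-- returning a single answer:
SELECT level FROM processing_log_configs WHERE logger = 'root' EMIT CHANGES;

-- Concern 2: pull queries only support single-key lookups, so listing
-- every logger's level would also need EMIT CHANGES:
SELECT * FROM processing_log_configs EMIT CHANGES;
```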
@vcrfxia your concerns are totally valid; I think, though, we'll have to address them in either scenario (namely, reading from a topic and materializing/serving bulk lookups against it).
The question is how much special casing we want, and whether we're eventually planning on adding the ability to force materialization/perform table scans. That would mean we could go down one of two paths if we wanted to model it as a table: (1) require `EMIT CHANGES` in the meantime, providing a "good enough" experience (I don't know how often people want to get the log level for all of the loggers at once), or (2) special-case it so that just this table is handled by a different executor and works OOTB, but doesn't have parity with other tables.
I think we can consider the external interface and internal implementation separately. From an external interface POV I like the idea of representing this as a table. Internally, it doesn't make sense to me to implement this like we do a regular table: its state is mostly in log4j, with some overrides in the Kafka topic (though reads and writes go through the log4j APIs). So it can just have its own executor (@agavra's path (2)) that supports a different subset of the syntax.
I'll update the proposal and re-request review.
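To make "a different subset of the syntax" concrete, here is a hypothetical sketch of what such an executor might accept and reject (the statement shapes are illustrative, not taken from the KLIP):

```sql
-- Accepted: a point lookup, answered from log4j state plus the
-- overrides stored in the Kafka topic:
SELECT level FROM processing_log_configs WHERE logger = 'root';

-- Rejected: anything requiring real table semantics, e.g. deriving a
-- new table from the system table:
CREATE TABLE noisy_loggers AS
  SELECT logger, level FROM processing_log_configs WHERE level = 'DEBUG';
```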
This makes sense, though I'm curious to see what the proposed implementation looks like. It sounds like the `processing_log_configs` "table" won't actually be a table, in that it won't appear in the result of `list tables;`, users won't be able to derive other tables from it, and the only queries that may be issued against it come from a pre-defined set. We'll need to reserve the name of the "table" so users can't create a stream or table with the same name, and query requests against this "table" will need to be routed to a different executor. Some of these details might be too low-level for the KLIP, though.
Nah I think those are all things we should hash out at the KLIP level. Was planning to address them in the next update.
Our first system table... yah! I've been wanting to do this for some time, e.g. `show tables` should really be `SELECT * FROM KSQL_USER_TABLES;` or similar.
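For example (using the hypothetical `KSQL_USER_TABLES` name from the sentence above):

```sql
-- Today's special-cased command:
SHOW TABLES;

-- What it could become once system tables exist:
SELECT * FROM KSQL_USER_TABLES;
```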
We may want to reserve stream/table names that start with `KSQL_` for future use.
It should be possible to use the same pull query executor, combined with a different table implementation, to handle queries against these tables.
@vcrfxia, you're right that these tables won't be able to be combined with other tables (or other system tables) initially. But it's still a better API than a RESTful one for a SQL system. And we can always extend this in the future.