greptimedb icon indicating copy to clipboard operation
greptimedb copied to clipboard

Be able to invoke Python UDF directly in SQL query

Open waynexia opened this issue 3 years ago • 2 comments

What problem does the new feature solve?

GreptimeDB supports users to write post-process logic in Python (official document) like

@coprocessor(args=["host", "idc", "cpu_util", "memory_util", "disk_util"],
             returns = ["host", "idc", "status"],
             sql = "SELECT * FROM system_metrics")
def calc_status(hosts, idcs, cpus, memories, disks):
    statuses = []
    for host, cpu, memory, disk in zip(hosts, cpus, memories, disks):
        if cpu > 80 or memory > 80 or disk > 80:
            statuses.append("red")
            continue
        status = cpu * 0.4 + memory * 0.4 + disk * 0.2

    return hosts, idcs, statuses

In this example script it runs on the result of SQL SELECT * FROM system_metrics.

And we can make it more amazing if the customized process logic can be integrated directly into the SQL itself, like

SELECT
  host,
  idc,
  calc_status(cpu_util, memory_util, disk_util)
FROM
  system_metrics;

What does the feature do?

Give the ability to embed pre-defined UDF scripts into SQL query

Implementation challenges

No response

waynexia avatar Nov 30 '22 13:11 waynexia

We can make udf and script different objects in database.

Stored procedure in snowflake: https://docs.snowflake.com/en/sql-reference/stored-procedures.html UDFs in snowflake: https://docs.snowflake.com/en/sql-reference/user-defined-functions.html

sunng87 avatar Dec 01 '22 02:12 sunng87

We can make udf and script different objects in database.

Stored procedure in snowflake: https://docs.snowflake.com/en/sql-reference/stored-procedures.html UDFs in snowflake: https://docs.snowflake.com/en/sql-reference/user-defined-functions.html

:+1: This looks great!

CREATE OR REPLACE PROCEDURE MYPROC(from_table STRING, to_table STRING, count INT)
  RETURNS STRING
  LANGUAGE PYTHON
  RUNTIME_VERSION = '3.8'
  PACKAGES = ('snowflake-snowpark-python')
  HANDLER = 'run'
AS
$$
def run(session, from_table, to_table, count):
  session.table(from_table).limit(count).write.save_as_table(to_table)
  return "SUCCESS"
$$;

waynexia avatar Dec 01 '22 02:12 waynexia