feat(query): support pep723 scripts in python udf scripts
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
Support PEP 723 inline script metadata in Python UDF scripts.
From https://x.com/charliermarsh/status/1934433431431139342
We can run uv add --script /path/to/script.py to add inline dependencies to a Python file. If the script header doesn't exist already, uv will generate it for you.
For example, we can modify the gcd function to pull in the numpy and pandas packages and do some throwaway work with them:
CREATE OR REPLACE FUNCTION gcd_py (INT, INT) RETURNS BIGINT LANGUAGE python HANDLER = 'gcd' AS $$
# /// script
# requires-python = ">=3.12"
# dependencies = ["numpy", "pandas"]
# ///
import numpy as np
import pandas as pd
def gcd(a: int, b: int) -> int:
    x = int(pd.DataFrame(np.random.rand(3, 3)).sum().sum())
    a += x
    b -= x
    a -= x
    b += x
    while b:
        a, b = b, a % b
    return a
$$;
The Python executor runs the script without problems:
🐳 root@default:) select gcd_py(40, 12);
╭─────────────────╮
│  gcd_py(40, 12) │
│ Nullable(Int64) │
├─────────────────┤
│               4 │
╰─────────────────╯
Tests
- [ ] Unit Test
- [x] Logic Test
- [ ] Benchmark Test
- [ ] No Test - Explain why
Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Breaking Change (fix or feature that could cause existing functionality not to work as expected)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
After this PR, the Docker image must contain the uv tool if it is built with the python-udf feature. @everpcpc
Maybe we could just use python -m venv venv?