[AGE-2919] Support more libraries in custom evaluators in Python
Discussed in https://github.com/Agenta-AI/agenta/discussions/2691
Originally posted by mmabrouk July 21, 2025
Problem
You can currently create custom evaluators using Python code. However, the libraries they can import are heavily limited because the code runs under RestrictedPython. It would be great if custom code could use any library.
@badcandy From Slack
I really like that Agenta allows defining custom evaluators in Python and registering them so they can be used within the UI during evaluation. However, I noticed that due to the restrictions of RestrictedPython, the libraries available for use are limited. I have two questions regarding this: Is there a way to use external libraries like scikit-learn in a custom evaluator, or is it currently not possible?
Right now, custom evaluators in Agenta are limited to a small set of whitelisted libraries: math, random, datetime, json, requests, and typing. This is enforced by the use of RestrictedPython, which runs evaluator code in a sandbox and only allows imports from that hardcoded list. External libraries like scikit-learn can't be used in custom evaluators because they're not included in the allowed modules, and there isn't a way to extend this list or configure additional imports in the current design. All evaluator code is routed through these sandboxing controls for security reasons, and there are no hidden workarounds or exceptions for these restrictions. You can see the implementation details in the sandbox code here.
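To make the restriction concrete, here is a minimal sketch of how an import allowlist of this kind behaves. It is not Agenta's implementation and does not use RestrictedPython; it only mimics the idea with the standard library. The names `guarded_import` and `run_evaluator`, and the `evaluate(output, expected)` signature, are hypothetical and chosen for illustration. Plain `exec` with reduced builtins is not a real security sandbox.

```python
import builtins

# Allowlist copied from the description above; anything else is rejected.
ALLOWED_MODULES = frozenset(
    {"math", "random", "datetime", "json", "requests", "typing"}
)


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Import hook that only lets allowlisted top-level modules through."""
    if level != 0:
        raise ImportError("relative imports are not allowed")
    top_level = name.split(".")[0]
    if top_level not in ALLOWED_MODULES:
        raise ImportError(f"import of '{name}' is not allowed")
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_evaluator(source, *args):
    """Run evaluator source with reduced builtins, then call its `evaluate`.

    Illustration only: this is NOT a secure sandbox.
    """
    safe_builtins = {
        "__import__": guarded_import,
        "abs": abs,
        "float": float,
        "len": len,
        "max": max,
        "min": min,
    }
    namespace = {"__builtins__": safe_builtins}
    exec(source, namespace)
    return namespace["evaluate"](*args)


# An evaluator that stays within the allowlist runs normally.
ALLOWED_SOURCE = """
import math

def evaluate(output, expected):
    return 1.0 if math.isclose(float(output), float(expected)) else 0.0
"""

# An evaluator that imports scikit-learn is rejected at import time,
# whether or not the package is installed.
BLOCKED_SOURCE = """
import sklearn

def evaluate(output, expected):
    return 0.0
"""

if __name__ == "__main__":
    print(run_evaluator(ALLOWED_SOURCE, "1.0", "1"))
    try:
        run_evaluator(BLOCKED_SOURCE, "a", "b")
    except ImportError as exc:
        print(f"rejected: {exc}")
```

Because the check happens inside the import hook, supporting more libraries in a design like this would mean either extending the allowlist or replacing the in-process sandbox with a different isolation mechanism.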