pandas-ai
`exec` is risky
https://github.com/gventuri/pandas-ai/blob/95667b94361ec8101ab0ae08183e4d49930fce25/pandasai/__init__.py#L130-L165
We take the code generated by the LLM and run it via `exec`. This is extremely risky: it can lead to data leakage or code execution at the operating system level.
If the LLM returns this:

```python
import os
os.environ
```

all environment variables are exposed, even the LLM token.
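A minimal demonstration of the leak (the code string here is hypothetical LLM output, not something the library emits):

```python
# Hypothetical LLM output, executed the same way the library runs generated code
llm_code = "import os\nprint(os.environ)"
exec(llm_code)  # dumps every environment variable, API tokens included
```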
Solution? Limit the execution of modules.
The problem is that the code generated by the LLM is executed directly via the exec() function, which can be dangerous, since it may lead to data leakage or code execution at the operating system level. One possible solution would be to evaluate the code generated by the LLM in an isolated execution environment, such as a sandbox or a virtual machine. That way, it is possible to limit the generated code's access to sensitive system resources, such as environment variables, and ensure that the code runs safely. Another solution would be to carefully analyze the code generated by the LLM before executing it. This can be done by checking that the generated code does not contain dangerous or malicious instructions and ensuring that it complies with the application's security policy. It is also worth remembering that, instead of using exec(), you can use other, more controlled code-execution functions, such as Python's eval().
Valid suggestion @VictorGSoutoXP, although Python's `eval()` can still be dangerous. More details here:
- https://stackoverflow.com/questions/661084/security-of-pythons-eval-on-untrusted-strings
- http://tav.espians.com/a-challenge-to-break-python-security.html
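For the record, a couple of one-liners that illustrate why input validation alone doesn't make `eval()` safe on untrusted strings (attacker-style inputs, purely illustrative):

```python
# eval() evaluates arbitrary expressions, and an "expression" can still do anything:
eval("__import__('os').environ")            # reads every environment variable
eval("__import__('subprocess').run(['id'])")  # runs an arbitrary OS command
```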
Here are the docs for `ast.literal_eval`, another option (not really safe either, though): https://docs.python.org/3/library/ast.html#ast.literal_eval
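A quick sketch of what that buys you: `ast.literal_eval` only accepts literal syntax and refuses anything that would actually execute code:

```python
import ast

ast.literal_eval("{'a': 1, 'nums': [2, 3]}")   # fine: plain literals only
ast.literal_eval("__import__('os').environ")   # raises ValueError, no code is run
```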
Got a bit confused about this part:
> This function had been documented as “safe” in the past without defining what that meant. That was misleading. This is specifically designed not to execute Python code, unlike the more general eval().
But this is serious, we should really think of a way to limit the execution of modules. @avelino do you suggest doing this before the `exec()` call?
@avelino totally, but it's gonna be harder than expected. Couldn't find any naive approach that fixes all the use cases.
@VictorGSoutoXP's idea of running it in an isolated environment makes a lot of sense.
Maybe a more naive solution would be to actually moderate the code, raising an error if any potentially malicious code is found. This is not a perfect solution, but it's a beginning. Same thing for `eval`, which we also use within the code.
How would you find malicious code? Just checking the code string before executing `eval`?
Some interesting discussions here https://stackoverflow.com/questions/3068139/how-can-i-sandbox-python-in-pure-python
@Lorenzobattistela that's the idea. As of now I'm already kicking out imports, so it's more a matter of finding a list of possible malicious patterns and, if any is detected in the code string, raising an exception.
The long-term fix is something closer to running it in an isolated environment imo!
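A minimal sketch of that moderation step, assuming a hand-maintained deny-list (the pattern list and the `moderate` helper are made up for illustration, not what the library ships):

```python
import re

# Illustrative deny-list; the real list still needs research
BLACKLIST = [
    r"\bimport\s+os\b", r"\b__import__\b", r"\bopen\s*\(",
    r"\bsubprocess\b", r"\bos\.environ\b", r"\beval\s*\(", r"\bexec\s*\(",
]

def moderate(code: str) -> None:
    """Raise before execution if the generated code matches a known-bad pattern."""
    for pattern in BLACKLIST:
        if re.search(pattern, code):
            raise RuntimeError(f"Potentially malicious code detected: {pattern}")

moderate("result = df['sales'].sum()")    # passes
moderate("import os\nprint(os.environ)")  # raises RuntimeError
```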
Agreed, I think for now we can focus on finding this list of possible malicious code, since sandboxing seems to be pretty problematic to do. I'll keep researching
A possible solution is to limit the execution of certain modules and functions that may pose a security risk. You can use Python's built-in sys module to achieve this. Specifically, you can use the sys.modules dictionary to restrict access to certain modules or functions that may be unsafe. For example, you could define a whitelist of modules and functions that are considered safe to execute, and then only allow those modules and functions to be imported or executed within the generated code. You could also restrict access to certain global variables, like os.environ, which may contain sensitive information.

In the example below, we define a whitelist of safe modules and functions and override the built-in `__import__` and `getattr` functions to restrict access to anything outside of that whitelist. We then execute the generated code with a custom global namespace that only allows access to the whitelisted modules and functions. Here's how you could implement this in your code:
```python
import builtins

# Define a whitelist of safe modules and functions
SAFE_MODULES = ['math', 'numpy', 'pandas']
SAFE_FUNCTIONS = ['sum', 'mean', 'median']

# Define a custom dictionary for the generated code's global namespace
global_dict = {}

# Restrict access to unsafe modules and functions
def custom_import(name, globals=None, locals=None, fromlist=(), level=0):
    if name in SAFE_MODULES:
        return orig_import(name, globals, locals, fromlist, level)
    raise ImportError(f"Module '{name}' is not allowed")

def custom_getattr(obj, name):
    if name in SAFE_FUNCTIONS:
        return orig_getattr(obj, name)
    raise AttributeError(f"Function '{name}' is not allowed")

# Override the built-in __import__ and getattr functions
orig_import = builtins.__import__
orig_getattr = builtins.getattr
builtins.__import__ = custom_import
builtins.getattr = custom_getattr

# Execute the generated code with the custom global namespace
exec(llm_generated_code, global_dict)
```
Yes, you're right that eval() can still be dangerous even with input validation, and that ast.literal_eval() is a safer alternative. However, it should still be used with caution, especially when dealing with untrusted input. Regarding the confusion about ast.literal_eval(), the documentation is saying that it is safe in the sense that it only evaluates literals (e.g., strings, numbers, tuples, lists, dicts) and does not execute arbitrary code, unlike eval(). However, it still has limitations and potential risks, such as the fact that it cannot evaluate expressions that involve variables or function calls.

As for limiting module execution, it would be a good idea to do this before the exec() call, as this would provide an additional layer of security. One approach would be to use the sys.modules dictionary to restrict access to unsafe modules and functions, as I mentioned earlier. Another approach would be to use a sandboxing library, such as RestrictedPython, which provides more fine-grained control over the code that can be executed.

One approach to finding malicious code could be to use a code analysis tool or library, such as ast or pylint, to identify potential security vulnerabilities before executing the code. This could involve checking the code for known patterns of malicious behavior, such as accessing sensitive data, executing system commands, or importing unsafe modules. Another approach could be to use regular expressions or string matching to scan the code for known strings or patterns associated with malicious behavior, such as shell commands, file I/O operations, or network communication. However, these approaches may not catch all possible security risks, especially if the malicious code is obfuscated or hidden within a larger code block. In such cases, a more sophisticated approach, such as using a sandboxing library, may be necessary to ensure that the code is executed safely.

The discussions on the StackOverflow link you provided offer some interesting ideas and solutions for sandboxing Python code, including using the RestrictedPython library, running the code in a separate process or container, or using operating-system-level restrictions to limit access to system resources.

I'm a big fan of Python. I'm a software engineering student, so I still have a lot to learn, and I'm always looking for challenges. If there's any work I can help with, like a summer internship or something (the kind that doesn't involve money, I just want experience), I would be very happy to contribute and learn from everyone.
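To make the ast-based analysis mentioned above concrete, here is a rough sketch of an import audit (the module set and the `audit` helper are illustrative assumptions, not an existing API in the library):

```python
import ast

FORBIDDEN_MODULES = {"os", "sys", "subprocess", "socket", "shutil"}  # illustrative

def audit(code: str) -> None:
    """Parse the generated code and reject imports of modules we consider unsafe."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            roots = {alias.name.split(".")[0] for alias in node.names}
        elif isinstance(node, ast.ImportFrom):
            roots = {(node.module or "").split(".")[0]}
        else:
            continue
        bad = roots & FORBIDDEN_MODULES
        if bad:
            raise ValueError(f"Unsafe import detected: {sorted(bad)}")

audit("import pandas as pd")           # passes
audit("import os\nprint(os.environ)")  # raises ValueError
```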
Yes @VictorGSoutoXP, I love your approach, I think we should try to integrate it this way. Do you want to work on this yourself?
I am eager to provide my assistance. Currently an intern at Bosch, I specialize in engineering market intelligence, and have previously worked as a software engineering intern at StoneRidge. Although I am currently a student at PUC, a reputable university in Brazil, I understand that it may not hold the same level of prestige as distinguished institutions such as Oxford or MIT. Nevertheless, I am enthusiastic about offering my assistance in any way that I can.
> I am enthusiastic about offering my assistance in any way that I can
@VictorGSoutoXP open a PR with your proposal (it seems to meet pandasai's needs) so we can review its implementation.
I created a branch, implemented the changes, and uploaded them. As an intern, I acknowledge that I still have significant knowledge gaps, but I am working to improve them every day. I would greatly appreciate any assistance with my professional development.
Correct me if I'm wrong, but would using `exec` and passing in global variables not solve this security risk? For example, if we were to do something like the following:
```python
global_vars = {
    "df": df,
    "pd": pd,
    "__builtins__": {
        "print": print
    }
}
```
and then, in the lines in which we call exec, simply call `exec(code, global_vars)`. I understand we would need more builtins than simply `print`, but which ones aside from `__import__` should be omitted?
> ... would using `exec` and passing in global variables not solve this security risk?
No, remote code execution is a fundamental security flaw that can only be addressed by sandboxing. Even with a stripped-down `__builtins__`, Python's introspection (for example, walking `object.__subclasses__()`) gives untrusted code ways to claw back dangerous functionality. Sandboxing is often accomplished through containerization: a designated location on the file system bundled up with all the dependencies needed to run some software in a way that appears to the software to be a completely isolated system, yet still has access to system resources. Docker is a popular solution.

How do others feel about starting up a local Docker container to create an isolated system for arbitrary code execution? You could then configure Docker to allow local network connections to communicate queries in and results out. Any malicious or harmful code would simply break the Docker container, which could be immediately restarted from the exact same image after any error information is collected.
- Define Docker Image for arbitrary code execution
- Start docker client/container through python library: https://pypi.org/project/docker/#description
- Any code to be executed is sent to the container; wait for response
- Deliver the result as you would if the input had been passed to `exec` (rough sketch of this flow below)
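Roughly what that flow could look like with the `docker` library linked above (the image name, resource limits, and the `run_sandboxed` helper are assumptions for this sketch; a real image would need pandas baked in plus a protocol for passing the dataframe in and the result out):

```python
import docker  # pip install docker

client = docker.from_env()

def run_sandboxed(code: str) -> str:
    """Run untrusted code in a throwaway container and return its stdout."""
    output = client.containers.run(
        "python:3.11-slim",        # placeholder image; bake dependencies in for real use
        ["python", "-c", code],
        network_disabled=True,     # no outbound connections from the sandbox
        mem_limit="256m",
        remove=True,               # discard the container after it exits
    )
    return output.decode()

print(run_sandboxed("print(2 + 2)"))  # -> 4
```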
This should fix some of the security problems: https://github.com/gventuri/pandas-ai/commit/e3d7d1dc259918565c0db08d535d8fd28fa7a465
I'll write down some other edge cases that we still need to cover. Feel free to comment about any potential issues with `exec` that haven't been addressed yet.
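A couple of edge-case shapes worth keeping in mind, since they contain none of the obvious substrings a naive filter would look for (illustrative payload strings only, not executed here):

```python
# Neither string contains a literal "import os" or "os.environ":
payloads = [
    "__import__('o' + 's').environ",           # rebuilds the module name at runtime
    "().__class__.__base__.__subclasses__()",  # walks the object graph back to loaded classes
]
```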
Great! I believe that I have provided all the knowledge that is within my scope. From now on, it's up to you to apply and expand on what you have learned. Best wishes and I hope I was able to assist you in some way.
@VictorGSoutoXP thanks a lot for the contribution and the knowledge sharing. @avelino I'll close the issue as the major concerns have been addressed.
We'll improve this further, as @TSampley reported: https://github.com/gventuri/pandas-ai/issues/73 (security)