Mitigate issue #5872 (Prompt injection -> RCE in PAL chain)

Open boazwasserman opened this issue 1 year ago • 7 comments

Adds some selective security controls to the PAL chain:

  1. Prevent imports
  2. Prevent arbitrary execution commands
  3. Enforce an execution time limit (prevents DoS and long-lived sessions in which the flow is hijacked, such as a remote shell)
  4. Enforce the existence of the solution expression in the code

This is done mostly by static analysis of the code using the ast library.
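For readers who have not opened the diff, checks 1, 2 and 4 could be sketched roughly as below. This is only an illustration, not the code in this PR: the function name `validate_generated_code`, the `FORBIDDEN_CALLS` deny-list, and the assumption that the solution expression is a top-level function called `solution` are all hypothetical.

```python
import ast

# Hypothetical deny-list; the list actually used by the PR is not shown in this thread.
FORBIDDEN_CALLS = {"exec", "eval", "compile", "__import__", "open", "system"}


def validate_generated_code(code: str, solution_name: str = "solution") -> None:
    """Raise ValueError if `code` fails any of the static checks."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        raise ValueError(f"Generated code is not valid Python: {exc}") from exc

    # Check 4: a top-level function named `solution_name` must exist.
    has_solution = any(
        isinstance(node, ast.FunctionDef) and node.name == solution_name
        for node in tree.body
    )
    if not has_solution:
        raise ValueError(f"Generated code must define {solution_name}()")

    for node in ast.walk(tree):
        # Check 1: no import statements of either form.
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("Imports are not allowed")
        # Check 2: no calls to deny-listed names, bare (eval) or as attributes (os.system).
        if isinstance(node, ast.Call):
            func = node.func
            name = getattr(func, "id", None) or getattr(func, "attr", None)
            if name in FORBIDDEN_CALLS:
                raise ValueError(f"Call to {name}() is not allowed")
```

The validator only parses the code and never runs it, so it can be called before the text reaches the REPL. A deny-list like this catches the obvious cases only.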

Also added tests for the PAL chain.

Fixes #5872

@vowelparrot

boazwasserman avatar Jun 11 '23 14:06 boazwasserman

I just got here from a Twitter link that a colleague sent me (https://twitter.com/llm_sec/status/1668711587287375876?s=20). I'm only a casual observer (not a LangChain user or contributor), but I thought it might be good to drop these links in case you're unaware of the ways that attackers can escape from AST-based Python "sandboxes":

https://hacktricks.boitatech.com.br/misc/basic-python/bypass-python-sandboxes
https://github.com/mahaloz/ctf-wiki-en/blob/master/docs/pwn/linux/sandbox/python-sandbox-escape.md

The strategies in these links aren't exhaustive, but they hopefully illustrate that this style of sandboxing makes attacks more complex without defeating them entirely.
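As a small illustration of that point, here is a sketch of a deliberately simple checker and one widely documented pattern it does not flag. `naive_check_passes` and its `banned` set are hypothetical and are not the checks in this PR; the sample strings are only parsed, never executed.

```python
import ast


def naive_check_passes(code: str) -> bool:
    """Return True if a simple import and deny-list check finds nothing to flag."""
    banned = {"exec", "eval", "__import__", "open"}  # hypothetical deny-list
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        # Only bare-name calls such as eval(...) are inspected here.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in banned:
                return False
    return True


# Reaches every loaded class through an empty tuple, with no import
# and no banned name anywhere in the source text.
INDIRECT = "classes = ().__class__.__base__.__subclasses__()"
```

With this checker, `naive_check_passes("import os")` is false while `naive_check_passes(INDIRECT)` is true, even though the second string gives an attacker a starting point for reaching modules that were never imported by name.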

qxcv avatar Jun 13 '23 23:06 qxcv

Thanks for the PR, @boazwasserman! The PAL chain is indeed unsafe. It seems you've got enough experience to be aware of the points that @qxcv (thanks for the links btw!) is making. I don't think we could really get to enterprise-level security purely via AST validations, even if that were our main focus.

My inclination is still to add these checks in to make it a bit harder to succeed in a naive prompt injection attack.

If someone were to want to use this chain in production, it ought to be isolated further as well.

To counter a false sense of security, we could log a warning in the PythonREPL:

import logging
import functools

logger = logging.getLogger(__name__)


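@functools.lru_cache(maxsize=None)
def warn_once() -> None:
    # Warn that the PythonREPL can execute arbitrary code.
    # The cache makes every call after the first a no-op, so the warning is logged once per process.
    logger.warning("Python REPL can execute arbitrary code. Use with caution.")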

(called in run) cc @hwchase17
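A minimal sketch of what "called in run" could look like. `ReplSketch` is a hypothetical stand-in rather than LangChain's PythonREPL class, and its `run` method executes nothing. The decorator uses `functools.lru_cache(maxsize=None)`, which caches the single no-argument call so that later calls skip the function body.

```python
import functools
import logging

logger = logging.getLogger(__name__)


@functools.lru_cache(maxsize=None)
def warn_once() -> None:
    # The first call logs; later calls hit the cache and return immediately.
    logger.warning("Python REPL can execute arbitrary code. Use with caution.")


class ReplSketch:
    """Hypothetical stand-in for a REPL wrapper; it does not execute anything."""

    def run(self, command: str) -> str:
        warn_once()
        return f"(would execute) {command}"
```

Calling `run` many times on any number of instances should produce a single warning per process, because the cache lives on the module-level function.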

vowelparrot avatar Jun 13 '23 23:06 vowelparrot

Thanks for the input! I completely agree that these AST validations and timeout limits are not 100% bulletproof, but I still think they are better to have than not. I will add the logged warning as suggested.
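For context on the timeout part, one way to enforce a time limit is to run the generated code in a child interpreter and kill it when the limit expires. The sketch below is an assumption about how this could be done, not the mechanism used in this PR; `run_with_timeout` and its parameters are hypothetical names. A subprocess alone is not a sandbox, since the child still has the parent's file and network access.

```python
import subprocess
import sys


def run_with_timeout(
    code: str, answer_expr: str = "solution()", timeout: float = 5.0
) -> str:
    """Run `code` in a child interpreter and return the printed answer."""
    program = f"{code}\nprint({answer_expr})"
    try:
        completed = subprocess.run(
            [sys.executable, "-c", program],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired as exc:
        raise TimeoutError(f"Generated code exceeded {timeout} s") from exc
    if completed.returncode != 0:
        raise RuntimeError(completed.stderr.strip())
    return completed.stdout.strip()
```

An infinite loop in the generated code then surfaces as a `TimeoutError` in the caller instead of hanging the chain.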

boazwasserman avatar Jun 14 '23 06:06 boazwasserman

Are there any updates?

image

As you can see, the vulnerability is not yet closed.

L0Z1K avatar Jul 10 '23 06:07 L0Z1K

@L0Z1K good catch! I was missing an edge case. It is fixed now.

boazwasserman avatar Jul 10 '23 13:07 boazwasserman