langchain icon indicating copy to clipboard operation
langchain copied to clipboard

Replace exec with wasm_exec

Open Jflick58 opened this issue 1 year ago • 6 comments

This replaces the usage of the insecure exec function with a more secure library called wasm_exec TL;DR: Use chroot jails, wasmtime and a standalone Python3.11 WASM interpreter to safely execute arbitrary code

Some questions before I officially submit this pr:

  • How critical is it that we support the execution of code containing 3rd-party dependencies? The team behind the wasm interpreter I vendor with the library is working on porting numpy along with some of the c libs needed to enable broader package support.

    • If it is critical, then how critical is it to support Python versions below 3.11? Right now the interpreter does support the use of an existing venv for accessing installed 3rd party deps, but the interpreter is only distributed in 3.11 which means I would either a) have to only support 3.11 b) have a 3rd split in the PythonRepl code base (base, AST, Wasm) or b) try to write some hacky and unpythonic code to try to dynamically create a new 3.11-based venv off of the currently active Python environment.

    • If it is not critical then is it okay if I refactor the test cases to avoid the tests containing numpy and pandas code? Should we also update the docs to reflect that limitation?

I also strongly welcome feedback on the wasm_exec library if there are implementation ideas that make it more functional.

Fixes #1026 Fixes #5294 Fixes #5388 Fixes https://nvd.nist.gov/vuln/detail/CVE-2023-29374

Who can review?

  • @hwchase17
  • @vowelparrot

Jflick58 avatar Jun 02 '23 23:06 Jflick58

Just my opinions here. I've not commit bit & am relatively new to langchain.

This replaces the usage of the insecure exec function with a more secure library called wasm_exec TL;DR: Use chroot jails, wasmtime and a standalone Python3.11 WASM interpreter to safely execute arbitrary code

100% needed, and thanks!

Some questions before I officially submit this pr:

  • How critical is it that we support the execution of code containing 3rd-party dependencies? The team behind the wasm interpreter I vendor with the library is working on porting numpy along with some of the c libs needed to enable broader package support.

It's better to get this in place first, and then worry about 3rd party deps later. The security concern massively outweighs feature depth.

  • If it is critical, then how critical is it to support Python versions below 3.11? Right now the interpreter does support the use of an existing venv for accessing installed 3rd party deps, but the interpreter is only distributed in 3.11 which means I would either a) have to only support 3.11 b) have a 3rd split in the PythonRepl code base (base, AST, Wasm) or b) try to write some hacky and unpythonic code to try to dynamically create a new 3.11-based venv off of the currently active Python environment.

I also think it's not worth a lot of effort to support power Python versions. (b) & (c) feel like maintenance burdens, and simplicity is important in security-minded components.

uogbuji avatar Jun 03 '23 01:06 uogbuji

@hwchase17 @vowelparrot would love some feedback on this PR. I recognize some of the tests are still failing, but I think that resolution is dependent upon answers to the questions I posed above. Thanks!

Jflick58 avatar Jun 06 '23 16:06 Jflick58

@hwchase17 @vowelparrot bumping again, hoping to get your thoughts...

Jflick58 avatar Jun 17 '23 19:06 Jflick58

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Nov 1, 2023 11:18pm

vercel[bot] avatar Jul 03 '23 05:07 vercel[bot]

@hwchase17 can you review the responses to the comments and let me know how you want to proceed?

Jflick58 avatar Jul 03 '23 05:07 Jflick58

any update on this feature/PR? Thanks.

timxieICN avatar Jul 11 '23 17:07 timxieICN

sorry for the delay here. high level thoughts:

  1. We almost certainly don't want to introduce this as a dependency for the main project
  2. We want to support as flexible of python running for prototyping purposes as possible
  3. We do not want people to be running this in production

I would suggest the following changes:

  1. We will move the current python tools and anything that depends on them into langchain.experimental and make that a separate package. This should remove CVEs from core langchain, while also hopefullying making it clear its experimental, and also allowing for prototyping
  2. I would encourage the creation of a separate langchain_python or the like respository where that uses wasm by default. we'd be happy to support this in a variety of ways (help setting up, help publicizing, help maintaining), but i think we'd prefer it to be a separate repository

thoughts?

hwchase17 avatar Jul 18 '23 06:07 hwchase17

Responded more in #8092

As far as this goes, I'll probably take these changes and implement as a separate package. How would that jive with the story of langchain.experimental though? Seems confusing to me to have an unsafe and safe way of doing things, both as external packages and possibly publicized by LangChain.

Jflick58 avatar Jul 22 '23 17:07 Jflick58

@Jflick58 Hi , could you, please, resolve the merging issues? After that ping me and I push this PR for the review. Thanks!

leo-gan avatar Sep 15 '23 01:09 leo-gan

@leo-gan I thought @hwchase17 decided to move the repl functionality to langchain-experimental to avoid including this security issue in the main langchain install? Happy to work on this PR to include the wasm-exec functionality in the main package, but wanted to confirm. Thanks!

Jflick58 avatar Sep 16 '23 20:09 Jflick58

@Jflick58 If this change suits experimental better, is it possible to start this change again as a new PR in experimental? If you want to do this, please, close this PR.

leo-gan avatar Sep 16 '23 23:09 leo-gan

Would it have to be a new PR? Given that langchain and langchain-experimental are both from this repository, could this PR be modified accordingly?

EliahKagan avatar Sep 16 '23 23:09 EliahKagan

It is up to you. This PR could be OK :+1:

leo-gan avatar Sep 17 '23 00:09 leo-gan

Hey guys, any updates for this PR?

RuanAzevedo avatar Oct 03 '23 15:10 RuanAzevedo

@RuanAzevedo This is still on my to-do list. I'm traveling for work this week but hoping to pick it back up next week.

Jflick58 avatar Oct 03 '23 16:10 Jflick58

hi @Jflick58, any update on this PR (to resolve this CVE)?

tabdunabi avatar Oct 10 '23 18:10 tabdunabi

Is there any update on this?

kenziewritescode avatar Feb 27 '24 22:02 kenziewritescode