[Question]: Tool argument handling philosophy
What is your question?
I noticed that all tool arguments are automatically quoted via shlex.quote() (in body.j2), which I assume is a security measure to prevent command injection from unpredictable LLM output.
However, I'm curious about your thoughts on CLI tools that expect multiple separate arguments rather than single quoted strings. For example:
Works great:
- adb shell '{{ command }}' ✓
- ssh host '{{ command }}' ✓
- curl '{{ url }}' ✓
Challenging:
- httpx {{ args }} where args = "-u target.com -silent -json"
- Becomes: httpx '-u target.com -silent -json' (single argument, breaks)
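To make the failure concrete, here is a minimal sketch using only Python's standard `shlex` module. It does not reproduce the actual body.j2 template; it only shows what `shlex.quote()` does to a multi-flag string and how a POSIX shell would then tokenize the result.

```python
import shlex

# The argument string the LLM produced for the {{ args }} placeholder.
args = "-u target.com -silent -json"

# Quoting the whole string turns it into a single shell word.
quoted = shlex.quote(args)
command = f"httpx {quoted}"
print(command)

# shlex.split() tokenizes the way a POSIX shell would:
# httpx receives one argument instead of four.
print(shlex.split(command))
```

The second `print` shows a two-element list: the program name and one long argument containing spaces, which is why httpx rejects it.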
The key context: We're trying to let the LLM itself intelligently generate the appropriate command arguments based on the task, rather than hardcoding them ourselves. A future orchestrator might give the same httpx tool completely different tasks requiring different argument patterns.
Current workarounds:
- Split parameters: httpx -u {{ target }} {{ flags }} (works but limits LLM flexibility)
- Use shell namespace (loses structured tool interface)
- Write Python tools (defeats YAML simplicity)
I totally understand prioritizing security over flexibility. But I'm curious: do you see this as an acceptable limitation for dynamic LLM command generation, or have you considered patterns for tools that genuinely need argument separation? Maybe something like a raw_args: true flag for trusted scenarios?
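One possible middle ground, sketched here purely as an assumption and not as an existing nerve feature: split the LLM-supplied string into tokens first, then quote each token individually. The helper name `quote_each` is hypothetical. Argument separation survives, while shell metacharacters inside any single token stay inert.

```python
import shlex


def quote_each(raw: str) -> str:
    """Split a raw argument string into tokens, then quote every token.

    Hypothetical helper, not part of nerve. shlex.split() raises
    ValueError on unbalanced quotes, so callers should handle that.
    """
    return " ".join(shlex.quote(token) for token in shlex.split(raw))


# Plain flags pass through unchanged and remain separate arguments.
print(quote_each("-u target.com -silent -json"))

# A ';' no longer terminates the command: it is quoted as part of a token,
# and the trailing 'id' becomes just another argument to the tool.
print(quote_each("-u target.com; id"))
```

This only addresses shell injection. The model can still pass any flag the tool accepts, so it is not a substitute for restricting which options are allowed.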
Just thinking out loud on this.
I think you just need the shell namespace if you want the model to figure out the arguments on its own.
Any counter argument or can I close this? :)
No, you're right. That is the best way currently.
For example, check this out: https://github.com/evilsocket/nerve/blob/main/examples/android-agent/agent.yml#L28
I could have implemented 20000000 specialized tools for everything adb can do (and it's a lot), but then I figured I could just give adb shell to the model and the training data would do the rest - and it did! :D I've learned things about adb shell that I didn't know, just from watching how the various models used the tool through this agent :D bash/zsh/etc. are less specialized than adb, so I assume they have a much stronger presence in the training data and that the same property transfers even better. And it does. There are agents that I wrote privately and can't disclose that perform penetration testing very well if you just give them a simple shell tool in a Kali container, instead of wrapping every single thing you could possibly think of :D
Alternatively, if you really, really want to do that, you can use the robopages integration and define tools like this one: https://github.com/dreadnode/robopages/blob/main/cybersecurity/offensive/information-gathering/amass.yml
I hope this adds more context to my perspective. Just a lot of playing and experience :D cheers!