maestro feat: Add support for running shell commands within Maestro workflows

Proposed Changes

This pull request introduces support for running shell commands within Maestro workflows. Key updates include:

New action enabling the execution of arbitrary shell commands in flows.
Updated documentation and examples illustrating shell command usage.
Safeguards for handling command output, errors, and exit codes.
Tests ensuring robust and secure shell command integration.

Testing

Added and executed unit and integration tests for the new action.

Issues Fixed

N/A (This is a feature addition; no specific issues are referenced.)

Usage Example

appId: com.example.app
name: RunShellCommand Demo
---
- runShellCommand:
    command: echo "Hello, Maestro!"
    label: Echo a greeting
    timeout: 1000
    outputVariable: OUTPUT_VAR

- evalScript: ${console.log("runShellCommand output - " + OUTPUT_VAR)}
- assertTrue:
    condition: ${OUTPUT_VAR == "Hello, Maestro!"}
    label: Assert runShellCommand output is correct

Interface

command (required): The command to execute.
env (optional): Environment variables to add to the execution environment.
timeout (optional): Time to wait for the task to finish. Wait indefinitely, if no timeout is specified
outputVariable (optional): Stores output in the specified variable; defaults to COMMAND_LINE_OUTPUT if not provided.
workingDirectory (optional): Working directory for the command; defaults to the current directory.
Supports common when, optional, and label directives.

Notes

Issues and features can be resolved with the PR: https://github.com/mobile-dev-inc/maestro/issues/1250 https://github.com/mobile-dev-inc/Maestro/issues/1682 https://github.com/mobile-dev-inc/Maestro/issues/2579 https://github.com/mobile-dev-inc/Maestro/issues/2355 https://github.com/mobile-dev-inc/Maestro/issues/2087 https://github.com/mobile-dev-inc/Maestro/pull/2507 And many more.

Jun 27 '25 05:06 dipcore

The PR is ready for review. Guys, please take a look on it.

Jul 03 '25 17:07 dipcore

I'm considering that this could be a risk to Maestro Cloud infrastructure. I'm wondering how we can limit that risk. Got thoughts?

Jul 03 '25 19:07 Fishbowler

I'm considering that this could be a risk to Maestro Cloud infrastructure. I'm wondering how we can limit that risk. Got thoughts?

You’re right, this change could potentially introduce security risks for the Maestro Cloud infrastructure. However, it also enables new capabilities that aren’t possible with just local or alternative cloud environments. To help with these risks, I believe an additional argument or environment variable could be added, ensuring it’s only enabled when explicitly intended. (i.e. MAESTRO_ALLOW_UNSECURE_DIRECTIVES ?)

What do you think about it?

Jul 03 '25 20:07 dipcore

Would users respond well to a thing that works locally but doesn't work in the Cloud execution environment though?

Jul 03 '25 21:07 Fishbowler

Would love your take on this @Leland-Takamine

Jul 03 '25 21:07 Fishbowler

Fixes #1682 ?

Jul 09 '25 13:07 Fishbowler

@dipcore : That would be nice.

To be able to run a command like this:

maestro hierarchy and cat > db.json <<'EOJ'\n${JSON.stringify(output.result, null, 4)}\nEOJ

I had to create a server in nodejs that needs to be run before starting the test.

A bit of a hassle.

Jul 10 '25 13:07 Angelk90

Here is my vision:

PROS:

Integration with External Tools and Scripts: Enables integration of tools like node, adb, xcrun, and any commands/custom scripts or servers (mock providers, proxies, report generators, etc.).
Advanced Debugging and Reporting: fetch logs, run diagnostics, collect artifacts, or trigger external reporting tools as part of the workflow, making debugging and monitoring much easier.
Enhanced Error Handling: Direct shell access allows for custom crash recovery, cleanup routines, or conditional logic that would be complex to express purely in Maestro commands.
Community-Driven Use Cases: Many feature requests and workarounds (see Maestro’s issues and PRs, e.g. https://github.com/mobile-dev-inc/Maestro/pull/2507) can now be addressed instantly by users themselves.
Boosts Framework Attraction: With frameworks like Expo officially promoting Maestro (see Expo’s docs), this level of flexibility will attract even more users and organizations.
Virtually Limitless Extensibility (!)

CONS:

Security Concerns: Executing shell commands poses a risk, especially in shared/cloud environments (Maestro Cloud). This can be resolved by allowing administrators to enable/disable shell access via env variables or Maestro CLI flags.
Platform-Specific Behavior: Shell commands may behave differently on macOS, Linux, or Windows, which could reduce test portability. However, experienced CI/CD users are used to this.
Complexity and Maintenance: Such flexibility also introduces responsibility. Poorly designed scripts can lead to test flow hangs, high resource consumption, or unintended side effects.

In my view, the advantages far outweigh the disadvantages. The biggest concern lies in Maestro Cloud’s security and its built-in cloud functionality. However, as noted, security risks can be resolved by allowing shell access to be enabled or disabled in the cloud environment. Optionally, implementing script whitelisting and/or providing a preinstalled set of safe scripts on Maestro Cloud could eliminate these risks.

Let me know what do you think about it.

Jul 10 '25 16:07 dipcore

Guys, let me know if there is anything I need to add to the PR.

Jul 16 '25 03:07 dipcore

RunShellCommand is exactly what i'm looking for in maestro +1 & ThumbsUp

Jul 23 '25 08:07 tomzeni

I'm not sure we're going to be able to merge this PR in its current state.

I had concerns about security that aren't entirely mitigated by the flag, since anyone using these features would want the flag on, and anyone running the service never would. I think this points at the larger problem. Maestro tests are inherently portable. An Android test that runs on Windows will also run on a Mac or on Linux. The flags are a small barrier to portability, but once we introduce shell commands, that stops becoming true for lots of cases. A shell command that works on one machine won't necessarily run on another without certain dependencies already having been met.

This isn't a bad PR, and I'd definitely have found it useful in circumstances, especially in those cases where I'm using Maestro as a task automation tool rather than a testing tool. I don't know that we can get this to right state of portability to make it compatible with the project.

WDYT?

Jul 23 '25 10:07 Fishbowler

I believe adding shell command support makes Maestro significantly more versatile and capable across a wide range of use cases.

I completely understand the security concerns - but enabling this feature via environment variables for those who explicitly need it is a proven and reasonable approach.

Regarding portability, most teams are already used to maintaining and tweaking their test cases throughout an app’s lifecycle. And more often than not, that involves using shell-based tools like adb, node, and others.

So if you're running tests on Windows, macOS, or cloud-based machines, you're likely already well aware of your environment and its constraints.

In most cases, this level of environment awareness - along with experience managing platform-specific differences - is simply part of the workflow.

IMHO, adding shell command support is exactly the kind of feature that turns Maestro into a real game-changer for many of us.

Jul 24 '25 11:07 tomzeni

@tomzeni : I agree.

Currently, you can prevent shell commands from being used via cloud environment variables.

@dipcore: I was wondering, for example, if I start a NodeJS server running locally: node server.js

Will the server shut down after the next step? Will it stay open?

appId: com.example.app
name: RunShellCommand Demo
---
- runShellCommand:
    command:  node ../server/server.js
    label: Open server
    timeout: 1000
    outputVariable: OUTPUT_VAR
    
    #Server already closed? 
- evalScript: '${console.log(JSON.stringify(OUTPUT_VAR))}'

Jul 24 '25 12:07 Angelk90

I'd expect that in the current implementation for the second step to never start. The script would never complete.

Without changes, this can't be merged. My comments above need addressing. How could complete portability be achieved?

Jul 24 '25 21:07 Fishbowler

@Angelk90

@dipcore: I was wondering, for example, if I start a NodeJS server running locally: node server.js

Will the server shut down after the next step? Will it stay open?

@Fishbowler

I'd expect that in the current implementation for the second step to never start. The script would never complete.

It will work exactly as you would start it from command line. I suppose server.js is a some service and you would like to start and keep it running in the background? so in that case just do node server.js & If you want more advanced usage do nohup node server.js > output.log 2>&1 &

nohup keeps the process running even after shell is terminated
> output.log 2>&1 output and error redirected to output.log
& run process in the background

Jul 25 '25 03:07 dipcore

@Angelk90 Even more advanced use-case. Imagine you want to run a process in the background when flow starts. And you want to kill it when the flow is complete. This can be done like:

appId: com.example.app
name: Flow with a background server
---
- runShellCommand:
    command: nohup node server.js > output.log 2>&1 & echo $!
    label: Start server
    timeout: 1000
    outputVariable: SERVER_PID
    
- evalScript: ${console.log("Server process has been started. PID:  " + SERVER_PID)}

- more_steps_here

- runShellCommand:
    command: kill -9 $SERVER_PID
    label: Stop server
    timeout: 1000

NB you can have it all in bash scripts. like start_service.sh, stop_service.sh. You can also store process pid in a file like service.pid. With command line you could send email (or slack, discord, teams, etc) reports when flow is complete or fails. Actually tons of options here. Command line is so flexible, it is up to you how to implement and use it.

Jul 25 '25 03:07 dipcore

@dipcore : Maybe I'd prefer it that way.

./manage_service.sh start ../server/server.js

manage_service.sh

#!/bin/bash

start_server() {
  if [ -z "$1" ]; then
    echo "Error: No file name provided. Usage: ./manage_service.sh start <file_name>"
    exit 1
  fi

  FILE_NAME=$1

  nohup node "$FILE_NAME" > output.log 2>&1 &
  echo $! > service.pid
  echo "Server process has been started using $FILE_NAME. PID: $(cat service.pid)"
}

stop_server() {
  if [ ! -f service.pid ]; then
    echo "No running server found."
    exit 1
  fi
  SERVER_PID=$(cat service.pid)
  kill -9 $SERVER_PID
  rm -f service.pid
  echo "Server process with PID $SERVER_PID has been stopped."
}

status_server() {
  if [ -f service.pid ]; then
    echo "Server is running with PID: $(cat service.pid)"
  else
    echo "No server is running."
  fi
}

show_help() {
  echo "Usage: $0 {start|stop|restart|status|help} [file_name]"
  echo "Commands:"
  echo "  start    - Start the server and save its PID in service.pid. File name is required."
  echo "  stop     - Stop the server using the PID from service.pid."
  echo "  restart  - Stop and then restart the server. File name is required for restart."
  echo "  status   - Check if the server is running and display its PID."
  echo "  help     - Show this help message."
}

case "$1" in
  start)
    start_server "$2"
    ;;
  stop)
    stop_server
    ;;
  restart)
    stop_server
    start_server "$2"
    ;;
  status)
    status_server
    ;;
  help)
    show_help
    ;;
  *)
    echo "Invalid option. Use '$0 help' for usage information."
    exit 1
    ;;
esac

Being able to do something like this would be great. You could use maestro not just for apps, but also for testing API and more.

@dipcore: What do you think about @Fishbowler's concerns? Do you think you can find a solution to support this new command?

Jul 25 '25 07:07 Angelk90

@Fishbowler

Without changes, this can't be merged. My comments above need addressing. How could complete portability be achieved?

This is a complex question, especially for contributors who are not decision makers, but I will do my best to answer.

The Maestro YAML syntax can be divided into two main parts:

Communicating with the emulator/simulator to execute the test case.
This is the core purpose of the Maestro tool. This part is currently very well implemented: it is portable, the command syntax is clear and intuitive, and it works as expected. This aspect is excellent.
Interacting with the environment (i.e., the operating system) where the emulator/simulator, Maestro, and other tools are running.
This part is mainly for supporting tests, setting up the environment, reporting, running HTTP mock servers, etc. However, this area is not as well implemented. There is only a small set of commands for executing OS-level actions, and most of these require a Maestro Cloud Plan (for example: email notifications). This means there is already some functionality that is not portable.

Most notably, having shell access seems to be a highly requested feature (at least based on my research), but currently people have to use unofficial, hacky solutions such as running a small HTTP server (usually with Node or Python), check this gist. This approach clearly breaks portability by default.

Therefore, why artificially limit the user experience? Why force users to rely on unofficial methods instead of providing an official, well-tested, and supported way to achieve this?

Jul 29 '25 14:07 dipcore

There are some Cloud-specific bits, but those still work equally well on Windows as Mac as Linux.

I don't disagree that folks want the feature, however that doesn't negate that this isn't ready for merge in its current state.

Jul 29 '25 15:07 Fishbowler

@Fishbowler

I don't disagree that folks want the feature, however that doesn't negate that this isn't ready for merge in its current state.

Could you please clarify (and suggest) what specific updates or changes are required to make this ready for review and merging?

Jul 29 '25 15:07 dipcore

I think this definitely needs consideration and conversation before we turn to changing code.

I assume there are a whole bunch of different approaches that could make for a portable solution.

Define an ability to provide equivalent scripts on a per-platform basis.
Instead of direct scripting, do something through the JVM, leaning on it having already solved a lot of the cross-platform problems
Look at what we've already got in GraalJS and work out how host access can be done safely there (then portability becomes a developer concern rather than a Maestro concern).

These are examples, maybe you have other ideas that achieve something similar?

Jul 29 '25 18:07 Fishbowler

@Fishbowler Thank you for your response. To summarize your points, we need to address the following:

The default Maestro test flow syntax should be portable (as it is now).
Ensuring portability should be the responsibility of the developer, not Maestro.

If that’s the requirement, I actually have an idea.

Jul 29 '25 20:07 dipcore

@Fishbowler Thank you for your response. To summarize your points, we need to address the following:

The default Maestro test flow syntax should be portable (as it is now).

Ensuring portability should be the responsibility of the developer, not Maestro.

If that’s the requirement, I actually have an idea.

Hello, thanks for this, any ETA on when this will be ready?

Sep 11 '25 18:09 alexkater

@alexkater

I'll be honest - I wouldn't hold your breath that there is a real way to make this compatible with Maestro's ethos. To create a native scripting solution that's also entirely portable feels somewhat impossible.

Sep 11 '25 18:09 Fishbowler

@alexkater @Fishbowler

Some time ago, I created another PR (https://github.com/mobile-dev-inc/Maestro/pull/2615), which introduces a plugin system to Maestro.

That PR would satisfy all the desired requirements: the Maestro core would remain untouched, stable and cross-platform, etc. basically all the concerns discussed here. This makes the Maestro core extendable by third-party plugins (i.e. community-based). The Maestro team could also control and provide their exclusive 'cloud-only' functionality based on their closed-source plugins.

These plugins could provide everything you need, including the shell-execution plugin. I already have a prototype of this plugin, which has the same interface as this PR. There is no need for the Maestro team to support it or worry about security issues on the cloud.

Unfortunately, I did not receive many comments about the proposed changes. I tried filling in the Slack form to get in touch with the Maestro team and discuss the changes. I also tried contacting Leland via email, but have not received a response yet.

I would really like to contact someone from the team.

Sep 12 '25 00:09 dipcore

Contact someone from the team happens here or in Slack (unless you pay for Enterprise support)

I understand that a small team attempting to make a living might not move as fast as you'd like on the community PRs. If you solve the problem of funding open source in a capitalist world, there's a million people who'd like the answer 😂

Sep 12 '25 06:09 Fishbowler

+1 This is the only reason we're most likely not going to be going with Maestro at this point.

If this could just be locked down to running on local and CI/CD environments this would be 👨‍🍳

Oct 09 '25 01:10 jbarby-gpsw

I'm evaluating Maestro for my React Native team as a personal side project. I don't want to imply any expectations, I just want to throw my skirt in the ring to say that I'd be swayed towards committing to Maestro if we had the possibility of flows supporting shell commands / arbitrary script execution.

For context: I am currently exploring the potential of running 6ish flows using the Maestro CLI in GitHub Actions. We don't currently have an expectation of using Maestro Cloud.

We'd like to reduce the "logic" of our testing / CI pipeline in general so it would be helpful to encapsulate the invocation of a mobile-platform-agnostic setup/reset script in a nested flow used by our mobile-platform-specific tests. It would be ideal to isolate this script's concern to the testing flow in our Maestro code. Without being able to invoke a script in a shared flow, we'd have to orchestrate running this setup/reset script between individually invoking Maestro flows and then synthesize the results. That'd be an ergonomic blocker for us using Maestro.

Having previously been a Windows (+ nix dabbler) dev and now sort of attached to MacOS by iOS necessity, I appreciate the ideal of portability. I think I'm just confused how portability is a blocker for introducing an advanced "escape hatch" api when this seems very much like an opt-in, user-space concern.

I don't know if this is a Cloud marketing concern where you don't want to bifurcate cli/cloud functionality, but I think it's extremely reasonable for Maestro docs to say "The evalScript directive is not available on Maestro Cloud due to security constraints. If you need to execute your own code, consider doing it between executions of Maestro flows."

Thanks @dipcore for implementing and driving this concern (and the plugin pitch!) and the Maestro team for the tool <3

EDIT: I'll consider pinning our Maestro experiment to @dipcore's fork branch here in case an exemplary usage would be helpful to evaluate the suggested API

Nov 08 '25 03:11 FLGMwt

Guys, what can I do to make it merged?

Dec 02 '25 16:12 dipcore