Implement long running job control

Open hulto opened this issue 1 year ago • 9 comments

Is your feature request related to a problem? Please describe. Currently we have no way to kill long-running threads (tasks). We rely on the task exiting on its own, which isn't ideal, especially if a task ties up a resource like a file or network port. To allow users to manually terminate long-running tasks we should implement:

  • task.list() -> List(int)
  • task.kill(id: int)

Describe the solution you'd like We need a way to facilitate communication between the interpreter (Eldritch) and the agent (imix / golem). Currently we communicate task output as a string over an mpsc channel. If we abstract that to a gRPC byte stream we can pass arbitrary typed data, e.g.:

  • TextOutput - For command output
  • TaskList - List threads
  • TaskKill - Kill a thread

This could later be expanded to support typed output from Eldritch or other meta control tasks, like updating the C2 callback URL.
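A minimal sketch of what the typed messages could look like, using the existing mpsc channel rather than gRPC. The `Message` enum, its payload types (a `String` for output, an `i64` task id), and the `describe` helper are assumptions for illustration, not the actual imix types.

```rust
use std::sync::mpsc::channel;

/// Hypothetical typed messages exchanged between Eldritch and the agent.
#[derive(Debug, PartialEq)]
enum Message {
    TextOutput(String), // command output
    TaskList,           // request the list of running task ids
    TaskKill(i64),      // request termination of one task
}

/// Summarise what the agent would do with each message (illustration only).
fn describe(msg: &Message) -> String {
    match msg {
        Message::TextOutput(text) => format!("output: {text}"),
        Message::TaskList => String::from("list tasks"),
        Message::TaskKill(id) => format!("kill task {id}"),
    }
}

fn main() {
    let (tx, rx) = channel::<Message>();
    tx.send(Message::TextOutput(String::from("hello"))).unwrap();
    tx.send(Message::TaskList).unwrap();
    tx.send(Message::TaskKill(7)).unwrap();
    drop(tx); // close the channel so the loop below ends

    for msg in rx {
        println!("{}", describe(&msg));
    }
}
```

Because the receiver matches on the enum, adding a new control message later (for example a callback URL update) shows up as a compile error at every place that handles messages.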

Describe alternatives you've considered

  • Ditching Eldritch and creating a special GraphQL type for job control. I don't like this since it moves away from our core function and would require us to maintain two interfaces for the C2 framework.

Additional context N/a

hulto avatar Jun 13 '23 03:06 hulto

Thinking long term we may wish to re-use this pattern to pass typed return objects like File, Process, FireWallRule back to the C2 server. With that in mind re-using the graphql schema might be lower overhead long term.

The GraphQL objects just need to be serializable using serde_json and shared between imix and Eldritch.

It probably makes sense to add this GraphQL API as a separate project under lib. It would be nice to put it under tavern, but I don't think that would make sense for the TaskList and TaskKill objects, since those won't have server-side meaning.

hulto avatar Jun 18 '23 20:06 hulto

Blocking while we sort out the gRPC migration #331

hulto avatar Aug 17 '23 00:08 hulto

  • New thought - Instead of doing IPC rpc - pass a shared object for the task list.
    • Create a builder "EldritchRuntimeBuilder"
      • An instance of EldritchRuntimeFunctions
      • An instance of GlobalsBuilder (Starlark) - or Globals.
      • build() returns:
        • EldritchRuntime
    • Create a struct called "EldritchRuntime"
      • run function
        • inputs
          • The stuff run takes now minus the printhandler. - Should pull from Self::...
    • Create a trait called "EldritchRuntimeFunctions"
      • functions
        • println
        • GetTasks
        • KillTask
    • Create a struct called "ImixEldritchRuntimeFunctions"
      • Actually implements the required functions for trait of "EldritchRuntimeFunctions"
      • Pass a Mutex to the task list into the struct creation.
        • imix::AsyncTask could move into the runtime
        • all_exec_futures hashmap would get Mutex'd and shared into the Eldritch Runtime functions.
        • This should be abstract and not actually a part of the generic type
          • Think if in the future the task list is synced over a channel instead of shared memory.
        • Start with a Rc<RefCell<T>> then move to Mutex if needed.
          • https://www.reddit.com/r/rust/comments/dihuwf/rc_and_refcell_vs_arc_and_mutex_in_nonshared/
          • Wonder if this will reduce or increase issues with borrowing since the mutex changes how the struct is modified from borrow to mutex locking.
    • Still unsure
      • From imix can we even kill running threads? Thread timeout isn't working right now, so we may need to solve that before we can actually kill those threads.

Example of how struct trait impl could look. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=2fa3c39fc84af849ddaf9e55d91021d4

Example with composing traits. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e6d3a331cbe723a4a161befb3a750b9c

//
// Traits/Abstractions
//

struct Runtime<'a> {
    funcs: &'a dyn RuntimeFunc
}

trait RuntimeFunc: Blorper + Flooper {}

trait Flooper {
    fn floop(&self);
}

trait Blorper {
    fn blorp(&self);
}

//
// Implementations/Concretations
//


struct DopeRuntimeFunc {
    word: String
}

impl RuntimeFunc for DopeRuntimeFunc {}

impl Blorper for DopeRuntimeFunc {
    fn blorp(&self) {
        println!("blorp {}.", self.word);
    }
}

impl Flooper for DopeRuntimeFunc {
    fn floop(&self) {
        println!("floop {}.", self.word)
    }
}

//
// Execution
//

fn test(r: Runtime) {
    r.funcs.blorp();
    r.funcs.floop();
}

fn main() {
    let b = DopeRuntimeFunc {
        word: String::from("hi")
    };
    let r = Runtime {
        funcs: &b
    };
    test(r);
}

Here's what it could look like - ish.

#[derive(Builder, Debug)]
struct EldritchRuntime<'a> {
    globals_builder: starlark::GlobalsBuilder,
    funcs: &'a dyn EldritchRuntimeFunctions,
}

// Signatures here match the impl below; the `...` return types are still undecided.
pub trait EldritchRuntimeFunctions {
    fn println(&self, text: &str) -> anyhow::Result<()>;
    fn get_tasks(&self) -> anyhow::Result<...>;
    fn kill_task(&self, id: TaskID) -> anyhow::Result<...>;
}

struct ImixEldritchRuntimeFunctions {
    task_list: Mutex<TaskList>,
}

impl EldritchRuntimeFunctions for ImixEldritchRuntimeFunctions {
    fn println(&self, text: &str) -> anyhow::Result<()> {
        println!("{}", text);
        Ok(())
    }
    fn get_tasks(&self) -> anyhow::Result<...> {
        // Lock self.task_list and return its contents.
    }
    fn kill_task(&self, id: TaskID) -> anyhow::Result<...> {
        // Left empty until we know how to actually stop a task.
    }
}
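For comparison, a runnable version of the same shape that only uses std. Everything concrete here is an assumption: `TaskList` is a plain `HashMap` from an `i64` id to a task name, the return types are simplified to `Vec<TaskId>` and `bool`, and `kill_task` only removes the bookkeeping entry since actually stopping the work is still unsolved.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type TaskId = i64;
// Assumption: a task is represented by its name only; imix would store a handle.
type TaskList = HashMap<TaskId, String>;

trait EldritchRuntimeFunctions {
    fn println(&self, text: &str);
    fn get_tasks(&self) -> Vec<TaskId>;
    /// Returns true if a task with this id was present and removed.
    fn kill_task(&self, id: TaskId) -> bool;
}

struct ImixEldritchRuntimeFunctions {
    task_list: Arc<Mutex<TaskList>>,
}

impl EldritchRuntimeFunctions for ImixEldritchRuntimeFunctions {
    fn println(&self, text: &str) {
        println!("{text}");
    }
    fn get_tasks(&self) -> Vec<TaskId> {
        let mut ids: Vec<TaskId> = self.task_list.lock().unwrap().keys().copied().collect();
        ids.sort(); // HashMap order is arbitrary, so sort for stable output
        ids
    }
    fn kill_task(&self, id: TaskId) -> bool {
        // Only removes the bookkeeping entry; stopping the work is a separate problem.
        self.task_list.lock().unwrap().remove(&id).is_some()
    }
}

struct EldritchRuntime<'a> {
    funcs: &'a dyn EldritchRuntimeFunctions,
}

fn main() {
    let shared = Arc::new(Mutex::new(TaskList::new()));
    shared.lock().unwrap().insert(1, String::from("port_scan"));
    shared.lock().unwrap().insert(2, String::from("download"));

    let funcs = ImixEldritchRuntimeFunctions { task_list: Arc::clone(&shared) };
    let runtime = EldritchRuntime { funcs: &funcs };

    runtime.funcs.println(&format!("tasks: {:?}", runtime.funcs.get_tasks()));
    runtime.funcs.kill_task(1);
    runtime.funcs.println(&format!("tasks: {:?}", runtime.funcs.get_tasks()));
}
```

The runtime only sees the trait, so the storage (shared memory now, `Rc<RefCell<T>>` or a channel-synced list later) can change without touching `EldritchRuntime`.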

hulto avatar Dec 16 '23 22:12 hulto

cooler looking example:

//
// Traits/Abstractions
//

struct Runtime<'a, T: RuntimeFunc> {
    peep: String,
    funcs: &'a T
}

trait RuntimeFunc: Blorper + Flooper {}

trait Flooper {
    fn floop(&self);
}

trait Blorper {
    fn blorp(&self);
}

impl<T: RuntimeFunc> Runtime<'_, T> {
    fn shmoop(&self) {
        println!("shmoop {}.", self.peep);
    }

    fn run(&self) {
        self.funcs.blorp();
        self.funcs.floop();
        self.shmoop();
    }
}

//
// Implementations/Concretations
//


struct DopeRuntimeFunc {
    word: String
}

impl RuntimeFunc for DopeRuntimeFunc {}

impl Blorper for DopeRuntimeFunc {
    fn blorp(&self) {
        println!("blorp {}.", self.word);
    }
}

impl Flooper for DopeRuntimeFunc {
    fn floop(&self) {
        println!("floop {}.", self.word)
    }
}

//
// Execution
//

fn main() {
    // Instantiate.
    let b = DopeRuntimeFunc {
        word: String::from("foo")
    };
    let r = Runtime {
        peep: String::from("bar"),
        funcs: &b
    };

    // Run!
    r.run();
}

Cictrone avatar Dec 16 '23 23:12 Cictrone

This is blocked until we implement a way to cancel async tasks. It probably won't be perfect: we would have to reimplement contexts and be responsible for checking them, and not all async tasks (like listening on a port) respect those.
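A rough sketch of the "reimplement contexts" idea with plain threads and an atomic flag. `CancelContext` and `run_task` are hypothetical names, and the sleep stands in for real work. It also shows the limitation: the flag is only seen between steps, so a step that blocks (like listening on a port) would never notice it.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical cancellation context shared between the agent and a task.
#[derive(Clone, Default)]
struct CancelContext {
    cancelled: Arc<AtomicBool>,
}

impl CancelContext {
    fn cancel(&self) {
        self.cancelled.store(true, Ordering::SeqCst);
    }
    fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::SeqCst)
    }
}

/// Runs up to `max_steps` units of work, checking the context before each one.
/// Returns the number of steps that completed.
fn run_task(ctx: &CancelContext, max_steps: u32) -> u32 {
    let mut done = 0;
    while done < max_steps {
        if ctx.is_cancelled() {
            break;
        }
        thread::sleep(Duration::from_millis(10)); // stand-in for real work
        done += 1;
    }
    done
}

fn main() {
    let ctx = CancelContext::default();
    let worker_ctx = ctx.clone();
    let handle = thread::spawn(move || run_task(&worker_ctx, 1_000));

    thread::sleep(Duration::from_millis(50));
    ctx.cancel(); // the task stops at its next check
    let steps = handle.join().unwrap();
    println!("task stopped after {steps} steps");
}
```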

hulto avatar Feb 13 '24 01:02 hulto

Currently we use spawn_blocking in implants/lib/eldritch/src/runtime/eval.rs

pub async fn start(id: i64, tome: Tome) -> Runtime {
    let (tx, rx) = channel::<Message>();

    let env = Environment { id, tx };

    let handle = tokio::task::spawn_blocking(move || {
...

Link to docs about why this breaks killing threads: https://dtantsur.github.io/rust-openstack/tokio/task/fn.spawn_blocking.html#:~:text=Closures%20spawned%20using%20spawn_blocking%20cannot,them%20after%20a%20certain%20timeout.

Why do we need to use spawn_blocking?

Could something like crossbeam give us the thread features like try_recv we need?
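For reference, std's `std::sync::mpsc::Receiver` already has `try_recv`, so polling for a kill message does not strictly need crossbeam. A small sketch with plain threads follows; the `Control` enum and `run_until_killed` are made up for illustration. This only helps at the points where the task polls and does nothing for a call that is already blocked.

```rust
use std::sync::mpsc::{channel, Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Hypothetical control message sent from the agent to a running task.
enum Control {
    Kill,
}

/// Does up to `max_steps` units of work, polling the control channel between steps.
/// Returns the number of completed steps.
fn run_until_killed(control: &Receiver<Control>, max_steps: u32) -> u32 {
    let mut done = 0;
    while done < max_steps {
        match control.try_recv() {
            // Stop on an explicit kill, or if the agent side went away.
            Ok(Control::Kill) | Err(TryRecvError::Disconnected) => break,
            // Nothing pending: keep working.
            Err(TryRecvError::Empty) => {}
        }
        thread::sleep(Duration::from_millis(10)); // stand-in for real work
        done += 1;
    }
    done
}

fn main() {
    let (tx, rx) = channel::<Control>();
    let handle = thread::spawn(move || run_until_killed(&rx, 1_000));

    thread::sleep(Duration::from_millis(50));
    tx.send(Control::Kill).unwrap();
    println!("stopped after {} steps", handle.join().unwrap());
}
```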

hulto avatar Jul 10 '24 03:07 hulto

Could we use the underlying system API to force the thread closed during blocking operations?

https://stackoverflow.com/questions/26199926/how-to-terminate-or-suspend-a-rust-thread-from-another-thread#comment135622530_26200583

hulto avatar Jul 10 '24 03:07 hulto

Seems like Sliver closes the connection to terminate the thread.

https://github.com/BishopFox/sliver/blob/master/implant/sliver/forwarder/socks.go#L86
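A sketch of the same trick in Rust with std sockets, not a port of the Sliver code. One thread blocks in `read`, another calls `shutdown` on a clone of the same socket, and the blocked read returns. The loopback listener, the 5 second read timeout (a safety net for the demo), and the function name `read_until_shutdown` are all assumptions. Exactly what the read returns may vary by platform, so the result is just printed.

```rust
use std::io::{self, Read};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

/// Blocks a worker thread on a socket read, then shuts the socket down from
/// the calling thread. Returns whatever the blocked read returned.
fn read_until_shutdown() -> io::Result<usize> {
    // Loopback listener on an OS-assigned port, standing in for the remote peer.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let client = TcpStream::connect(listener.local_addr()?)?;
    let (_peer, _) = listener.accept()?; // kept open; it never sends anything

    // Safety net for the demo: give up after 5 seconds if shutdown does not wake the read.
    client.set_read_timeout(Some(Duration::from_secs(5)))?;

    let mut reader = client.try_clone()?;
    let worker = thread::spawn(move || {
        let mut buf = [0u8; 16];
        reader.read(&mut buf) // blocks: the peer never writes
    });

    thread::sleep(Duration::from_millis(100)); // let the worker reach read()
    client.shutdown(Shutdown::Both)?; // "kill" by closing the connection
    worker.join().expect("worker panicked")
}

fn main() {
    println!("blocked read returned: {:?}", read_until_shutdown());
}
```

This only works for tasks whose blocking call sits on a handle we can close from outside, which is part of why a per-function close hook might be needed.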

hulto avatar Jul 10 '24 04:07 hulto

Could we have every Eldritch function define a close function?

When the function gets called, have it place a function pointer to its close function and any relevant handles on a shared queue.

When a tome is aborted, work backwards through the queue calling close functions.

How would we handle things that have closed out naturally?
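One possible answer, as a sketch: each registration returns a ticket, and a function that finishes on its own hands the ticket back so its entry is removed before any abort. `CloseQueue`, `register`, `finished`, and `abort` are hypothetical names, and boxed closures stand in for "function pointer plus handles".

```rust
use std::sync::{Arc, Mutex};

type CloseFn = Box<dyn FnOnce() + Send>;

/// Hypothetical per-tome registry of cleanup callbacks.
#[derive(Default)]
struct CloseQueue {
    next_id: u64,
    entries: Vec<(u64, CloseFn)>,
}

impl CloseQueue {
    /// Called when an Eldritch function starts; returns a ticket for `finished`.
    fn register(&mut self, close: CloseFn) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.entries.push((id, close));
        id
    }

    /// Called when a function ends on its own, so abort will not close it again.
    fn finished(&mut self, id: u64) {
        self.entries.retain(|(entry_id, _)| *entry_id != id);
    }

    /// Called when the tome is aborted: run remaining close functions, newest first.
    fn abort(&mut self) {
        while let Some((_, close)) = self.entries.pop() {
            close();
        }
    }
}

fn main() {
    let log: Arc<Mutex<Vec<&'static str>>> = Arc::new(Mutex::new(Vec::new()));
    let mut queue = CloseQueue::default();

    let mut tickets = Vec::new();
    for name in ["file", "socket", "process"] {
        let log = Arc::clone(&log);
        tickets.push(queue.register(Box::new(move || {
            log.lock().unwrap().push(name);
        })));
    }

    queue.finished(tickets[1]); // the socket closed on its own
    queue.abort();
    println!("{:?}", log.lock().unwrap()); // prints ["process", "file"]
}
```

The alternative would be to leave finished entries in the queue and require every close function to be safe to call on an already-closed handle.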

hulto avatar Jul 10 '24 04:07 hulto