metaflow
Support running a flow (containing @batch steps) locally instead of on AWS Batch
Hi there, thank you again for the great work on this library :-)
I understand that the @batch decorator allows us to selectively run some steps locally and some on AWS Batch. (docs)
There are times, however, where I want to run my entire flow locally on a small slice of the dataset to get fast feedback that my flow is still working. Currently, I have to manually comment out the @batch decorator in order to do that.
I was wondering if there's a way for me to tell metaflow to ignore the @batch decorator? For example,
# suggestion: this could ignore the `@batch` decorator and run entire flow locally
python myflow.py run --with local
# this will work as it currently does, and run steps with `@batch` on AWS Batch
python myflow.py run
Happy to hear your thoughts. Thanks again!!
Yes, indeed! You can use the @resources decorator and then use --with batch on the CLI, although that will execute the entire flow on AWS Batch. There is an open issue for supporting @local as well. Another alternative is to write a simple Python decorator that can add the @batch decorator to your step depending on the presence of an environment variable. I think I have an example handy somewhere - let me dig that up.
#350
Thanks for your prompt response @savingoyal !
"Another alternative is to write a simple Python decorator that can add @batch decorator to your step depending on the presence of an environment variable. I think I have an example handy somewhere - let me dig that up."
Could I trouble you to share an implementation of this?
I have done something similar to what @savingoyal suggested that you may find useful @davified :
import os
import sys

from metaflow import batch as mf_batch
from metaflow import step

BATCH_LOCAL_MODE_ENV_VAR = 'BATCH_LOCAL_MODE'


def batch(*args, **kwargs):
    # Local mode: skip AWS Batch and hand back the plain @step decorator.
    if os.environ.get(BATCH_LOCAL_MODE_ENV_VAR, None) == '1':
        sys.stderr.write('@batch operating in local development mode\n')
        return step
    else:
        # Remote mode: defer to Metaflow's real @batch decorator.
        sys.stderr.write(
            f'@batch operating in remote mode. Set environment variable {BATCH_LOCAL_MODE_ENV_VAR}=1 to switch to '
            'local development mode\n'
        )
        return mf_batch(*args, **kwargs)
You can put this in a file called local_batch.py and it is a drop-in replacement for @batch from metaflow, e.g. all you should need to do is replace
from metaflow import batch
with something like
from local_batch import batch
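For anyone who wants to gate decorators other than @batch the same way, here is a rough sketch of a more generic variant of the idea. It is not part of Metaflow: the helper name `unless_local` and the stand-in decorator `mark_remote` are made up for illustration, and the stand-in exists only so the snippet runs without Metaflow installed. Unlike the wrapper above, it returns the function unchanged in local mode rather than returning `step`.

```python
import os

# Same environment variable name as in the wrapper above.
LOCAL_MODE_ENV_VAR = "BATCH_LOCAL_MODE"


def unless_local(decorator, env_var=LOCAL_MODE_ENV_VAR):
    """Return `decorator`, or a pass-through decorator when env_var is '1'."""

    def passthrough(func):
        # Local mode: leave the function exactly as it was.
        return func

    if os.environ.get(env_var) == "1":
        return passthrough
    return decorator


# Hypothetical stand-in for a remote-execution decorator such as batch(...).
def mark_remote(func):
    func.runs_remotely = True
    return func


# The environment variable is read when the decorator line is evaluated.
os.environ[LOCAL_MODE_ENV_VAR] = "1"


@unless_local(mark_remote)
def local_step():
    return "ran"


os.environ[LOCAL_MODE_ENV_VAR] = "0"


@unless_local(mark_remote)
def remote_step():
    return "ran"


print(hasattr(local_step, "runs_remotely"))   # False
print(hasattr(remote_step, "runs_remotely"))  # True
```

In a real flow I would expect this to be stacked as `@unless_local(batch(cpu=2))` above `@step`, but I have not tested that against Metaflow itself, so treat it as a starting point.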
another thread for context and +1 https://outerbounds-community.slack.com/archives/C02116BBNTU/p1651716562792139