Distributed-CellProfiler
Run encapsulated Docker containers with CellProfiler on Amazon Web Services (AWS) infrastructure.
Adds WORKSPACE_BUCKET to the DCP configuration, allowing image files to be read from a different bucket than non-image files (e.g. pipeline, load_data.csv, etc.). Closes #160. Also adds another demo project that...
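To illustrate the split described above, here is a minimal sketch of choosing a bucket per file. It assumes images live in `SOURCE_BUCKET` and non-image files in `WORKSPACE_BUCKET`; the helper `bucket_for`, the bucket names, and the extension list are hypothetical and not taken from the DCP code.

```python
from pathlib import PurePosixPath

# Hypothetical config values; the bucket names are placeholders.
CONFIG = {
    "SOURCE_BUCKET": "example-public-images",
    "WORKSPACE_BUCKET": "example-my-workspace",
}

# Assumed set of image extensions, for illustration only.
IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def bucket_for(key, config=CONFIG):
    """Pick the bucket to read `key` from, based on its file extension."""
    suffix = PurePosixPath(key).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        return config["SOURCE_BUCKET"]
    # Non-image files (pipeline, load_data.csv, ...) come from the workspace
    # bucket; fall back to SOURCE_BUCKET when none is configured.
    return config.get("WORKSPACE_BUCKET") or config["SOURCE_BUCKET"]
```

Falling back to `SOURCE_BUCKET` keeps single-bucket setups working unchanged, which is one plausible way to stay backward compatible.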
I think we can do this, and it would be nice for more "experimental" setups
We'll need some clever logic to determine whether it's an old or new structure, and/or some non-clever `try`/`except` logic
Still a bit of a mess handling both roles and users. I believe what is currently pulled to run-worker.sh in `master` handles roles nicely but doesn't play well with user...
This works to read from and write to an external bucket using a profile pointing to an assumed role. Docker image currently available at `erinweisbart/distributed-cellprofiler:readwrite_profile`. Have not yet tested to see if I broke anything...
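For readers unfamiliar with role-backed profiles, the sketch below generates the kind of AWS config section such a profile typically uses. The function name, profile name, and role ARN are hypothetical, and the choice of `credential_source` is an assumption about the environment; none of this is taken from the image named above.

```python
import configparser
import io


def assumed_role_profile(profile_name, role_arn, credential_source="EcsContainer"):
    """Return AWS config text for a profile that assumes `role_arn`."""
    parser = configparser.ConfigParser()
    # In ~/.aws/config, named profiles use a "profile <name>" section header.
    parser[f"profile {profile_name}"] = {
        "role_arn": role_arn,
        # Where the base credentials come from (assumed here: the ECS task).
        "credential_source": credential_source,
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()
```

Text like this would be appended to `~/.aws/config`, after which the profile can be selected with `--profile` in the AWS CLI or `boto3.Session(profile_name=...)`.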
When downloading files (`DOWNLOAD_FILES=True`) listed in .csvs with [`s3.meta.client.download_file(AWS_BUCKET,prefix_on_bucket,new_file_name)`](https://github.com/DistributedScience/Distributed-CellProfiler/blob/7baffda87ae85f3778d513e2d69e549996ebc091/worker/cp-worker.py#L215), if a file isn't in S3, the worker should return a useful error message saying so. Currently it fails silently.
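One possible shape for such a check is sketched below. The wrapper `download_or_explain` is hypothetical and not part of DCP. To stay runnable without AWS libraries it inspects the `response` attribute that botocore's `ClientError` carries rather than importing the exception class; real code would more likely catch `botocore.exceptions.ClientError` directly.

```python
def download_or_explain(client, bucket, key, local_path):
    """Download s3://bucket/key, raising a descriptive error if the key is missing."""
    try:
        client.download_file(bucket, key, local_path)
    except Exception as exc:
        # botocore's ClientError exposes the service error in `exc.response`.
        response = getattr(exc, "response", None) or {}
        code = response.get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey"):
            raise FileNotFoundError(
                f"s3://{bucket}/{key} does not exist; check the paths in your .csv"
            ) from exc
        # Anything else (permissions, network, ...) is re-raised unchanged.
        raise
```

With the call quoted above, usage would look like `download_or_explain(s3.meta.client, AWS_BUCKET, prefix_on_bucket, new_file_name)`.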
Given new plugin organization in CP-plugins repo, confirm that DCP doesn't need any corresponding changes.
Currently, load_data.csv is read from SOURCE_BUCKET. For a use case like the Cell Painting Gallery, where you're reading public images, it would be nice to be able to set the load_data.csv location to...
```
Transfer of CellProfiler logs to S3 initiated
Traceback (most recent call last):
  File "run.py", line 786, in <module>
    monitor()
  File "run.py", line 763, in monitor
    export_logs(logs, loggroupId, starttime, bucketId)
  File...
```