codeflare
Simplifying the definition and execution, scaling and deployment of pipelines on the cloud.
**Describe the bug** Cannot bring up Ray cluster as defined in the OCP tutorial **To Reproduce** Steps to reproduce the behavior: 1. Go to [https://codeflare.readthedocs.io/en/latest/getting_started/starting.html#Openshift-Ray-Cluster-Operator](https://codeflare.readthedocs.io/en/latest/getting_started/starting.html#Openshift-Ray-Cluster-Operator) 2. Run `pip3 install --upgrade...
Support better integration between Ray and Spark in passing ObjectRef without actually moving data
## Overview As a Codeflare user, I want to use Ray and Spark alternately to execute my end-to-end ML jobs. Some steps might be executed more efficiently using Ray, while...
## Overview As a CFP user, I would like to split a dataset (e.g., np array, pandas dataframe) into smaller objects that can then be fed into other nodes/pipeline. This...
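The splitting described in this request could look roughly like the sketch below. It assumes NumPy's `np.array_split` for a row-wise split. The helper name `split_dataset` and the sample array are hypothetical and not part of the CodeFlare API.

```python
import numpy as np


def split_dataset(X, n_chunks):
    """Split an array row-wise into n_chunks pieces of nearly equal size.

    Hypothetical helper for illustration only; not a CodeFlare function.
    """
    # array_split (unlike split) tolerates row counts that do not divide evenly.
    return np.array_split(X, n_chunks, axis=0)


# Illustrative data: 10 rows, 2 columns.
X = np.arange(20).reshape(10, 2)
chunks = split_dataset(X, 3)

# Each chunk could then be passed to a separate downstream node or pipeline.
chunk_sizes = [chunk.shape[0] for chunk in chunks]
```

The same call should also work on a pandas DataFrame, though the type of the returned pieces may vary with the pandas and NumPy versions in use.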
## Overview As a CF pipelines user, I would like support for nested pipelines, where a node of a pipeline can itself be a pipeline. ### Acceptance Criteria - [ ] Nested pipeline...
## Overview As a CF pipelines user, I would like to understand the memory consumption when pipelines are executed. Given that pipelines accept NumPy arrays, will Ray's zero-copy sharing help?...
## Overview As a CF pipelines user, I would like the ability to select the best or k-best pipelines from a parameter grid search output. ### Acceptance Criteria - [...
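As a rough illustration of k-best selection, the sketch below ranks grid search results by score using only the standard library. The `results` list, its `(params, score)` layout, and the `k_best` helper are assumptions made for this example. The actual output format of CodeFlare's grid search is not shown in the issue.

```python
import heapq

# Hypothetical grid search output: (parameter dict, mean CV score) pairs.
results = [
    ({"C": 0.1}, 0.81),
    ({"C": 1.0}, 0.88),
    ({"C": 10.0}, 0.85),
    ({"C": 100.0}, 0.79),
]


def k_best(results, k):
    """Return the k highest-scoring entries, best first."""
    return heapq.nlargest(k, results, key=lambda entry: entry[1])


top2 = k_best(results, 2)
best_params, best_score = top2[0]
```

Selecting the single best pipeline is the `k=1` case of the same helper.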
## Overview AND node semantics compute a full cross product. In grid search CV, an AND node like feature union will require features to be joined in a given input...
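To make the "full cross product" concrete, the sketch below enumerates every combination of outputs from two upstream nodes feeding an AND node. The output labels and the `and_node_inputs` helper are hypothetical. This only illustrates the combinatorics, not how CodeFlare implements the node.

```python
from itertools import product

# Hypothetical outputs produced by two upstream nodes during a grid search.
left_outputs = ["pca_2", "pca_4"]
right_outputs = ["kbest_3", "kbest_5", "kbest_7"]


def and_node_inputs(*upstream_outputs):
    """Pair every output of each upstream node with every output of the others."""
    return list(product(*upstream_outputs))


combos = and_node_inputs(left_outputs, right_outputs)
# 2 left outputs x 3 right outputs give 6 input tuples for the AND node.
```

The number of combinations grows multiplicatively with each upstream node, which is why the join semantics matter for grid search.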
## Overview As a CF pipelines developer, using node as a key as opposed to `node_name` causes a lot of overhead. An intrusive change, but will help keep all the...
## Overview As a CF pipelines user, I would like to see an ADR capturing the design of grid search CV. ### Acceptance Criteria - [ ] ADR for grid search...
## Overview The current implementation does not accept user-specified scoring metric(s). For example, `cross_val_score(pipeline, X_test, y_test, scoring="neg_mean_squared_error", cv=10)` The sklearn model evaluation metrics are listed here https://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter ###...
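For reference, the scikit-learn behaviour this issue asks to match can be sketched as follows. The synthetic dataset and the two-step pipeline are assumptions chosen for illustration. Only the `cross_val_score` call with `scoring="neg_mean_squared_error"` and `cv=10` comes from the issue text.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data, used only to make the example self-contained.
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# An illustrative sklearn pipeline (not a CodeFlare pipeline).
pipeline = Pipeline([("scale", StandardScaler()), ("model", LinearRegression())])

# The scoring string selects the metric; one score is returned per fold.
scores = cross_val_score(pipeline, X, y, scoring="neg_mean_squared_error", cv=10)
mean_score = scores.mean()
```

scikit-learn negates error metrics so that higher is always better, which is why every value in `scores` is zero or below.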