Cannot use add_function_step in a pipeline without the optional function_return parameter
Hi, based on documentation for add_function_step:
function_return (Optional[List[str]]) – Provide a list of names for all the results. If not provided no results will be stored as artifacts.
I wanted to skip this param and not store any artifacts in my task. Unfortunately, when I omitted this param in the step definition, the pipeline failed.
pipe = PipelineController(
    name='UsingFunction',
    project='Test',
    version='0.0.1'
)
pipe.add_function_step(
    name='preprocessing',
    function=preprocessing
)
Error:
Traceback (most recent call last):
  File "pipeline_func.py", line 405, in <module>
    pipe.add_function_step(
  File "/home/vscode/.local/lib/python3.8/site-packages/clearml/automation/controller.py", line 616, in add_function_step
    if step in self._nodes and artifact in self._nodes[step].return_artifacts:
TypeError: argument of type 'NoneType' is not iterable
I tried a quick workaround in controller.py:
if step in self._nodes:
    if self._nodes[step].return_artifacts:
        if artifact in self._nodes[step].return_artifacts:
            function_input_artifacts[k] = "${{{}.id}}.{}".format(step, artifact)
            continue
With that, the pipeline started working, but output models are stored automatically as artifacts anyway.
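For anyone following along, here is a minimal standalone sketch (plain Python, not ClearML internals) of why the membership test in the traceback fails when function_return is omitted, and a defensive pattern equivalent to the workaround above:

```python
# When function_return is not passed, the node's return_artifacts attribute
# is None, and `x in None` raises the TypeError seen in the traceback.
return_artifacts = None

try:
    found = 'data_frame' in return_artifacts
except TypeError as e:
    print(e)  # argument of type 'NoneType' is not iterable

# A defensive equivalent: treat None as "no artifacts".
found = 'data_frame' in (return_artifacts or [])
print(found)  # False
```

The `or []` fallback collapses the three nested ifs from the workaround into a single expression with the same behavior.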
Hey, we would need some more info about your config in order to reproduce the issue. Can you please give me your clearml version? If you have more examples of your pipeline usage, that would also help.
Hi, everything is running on GCP. To create the ClearML Server I used your base image for GCP. The agents run on my own machines:
clearml 1.4.0
clearml-agent 1.2.3
Agent config printed during start:
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key =
sdk.aws.s3.region =
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.api_server = http://10.X.Y.Z:8008
api.web_server = http://10.X.Y.Z:8080
api.files_server = http://10.X.Y.Z:8081
api.credentials.access_key = VERYSECRET
api.host = http://10.X.Y.Z:8008
agent.worker_id = gcp:2
agent.worker_name = gcp:2
agent.force_git_ssh_protocol = true
agent.python_binary =
agent.package_manager.type = pip
agent.package_manager.pip_version = <20.2
agent.package_manager.system_site_packages = true
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = defaults
agent.package_manager.torch_nightly = false
agent.venvs_dir = /home/lukasz/.clearml/venvs-builds.1
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /home/lukasz/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = /home/lukasz/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = /home/lukasz/.clearml/pip-cache
agent.docker_apt_cache = /home/lukasz/.clearml/apt-cache.1
agent.docker_force_pull = false
agent.enable_task_env = false
agent.hide_docker_command_env_vars.enabled = true
agent.hide_docker_command_env_vars.parse_embedded_urls = true
agent.docker_internal_mounts.sdk_cache = /clearml_agent_cache
agent.docker_internal_mounts.apt_cache = /var/cache/apt/archives
agent.docker_internal_mounts.ssh_folder = /root/.ssh
agent.docker_internal_mounts.pip_cache = /root/.cache/pip
agent.docker_internal_mounts.poetry_cache = /root/.cache/pypoetry
agent.docker_internal_mounts.vcs_cache = /root/.clearml/vcs-cache
agent.docker_internal_mounts.venv_build = /root/.clearml/venvs-builds
agent.docker_internal_mounts.pip_download = /root/.clearml/pip-download-cache
agent.apply_environment = true
agent.apply_files = true
agent.custom_build_script =
agent.git_user =
agent.default_python = 3.9
agent.cuda_version = 0
agent.cudnn_version = 0
I believe this one example will reproduce the issue. My pipeline has multiple steps and I use different parameter combinations. Every time, omitting the function_return parameter generated the error.
Hi @lpogo! Just wanted to let you know that we couldn't reproduce the issue as of now. If you find the reason why this doesn't work for you in the meantime, could you please let us know? Thank you!
Hi @eugen-ajechiloae-clearml
I just created my environment locally (Docker Desktop on WSL, Ubuntu 20.04). The environment was started from your docker compose file, so everything is on the newest available version. In my development container I have installed clearml==1.4.1.
My script looks like:
from clearml import PipelineController


def preprocessing(dataset_id: str, dataset_project: str, dataset_name: str, dataset_filename: str):
    print('step preprocessing')
    # import libs
    from clearml import Dataset
    import pandas as pd
    if dataset_id:
        dataset_path = Dataset.get(
            dataset_id=dataset_id
        ).get_local_copy()
    else:
        dataset_path = Dataset.get(
            dataset_project=dataset_project,
            dataset_name=dataset_name
        ).get_local_copy()
    filepath = f'{dataset_path}/{dataset_filename}'
    print(f'dataset_path: {filepath}')
    df = pd.read_csv(filepath, low_memory=False)
    df.drop('Time', axis=1, inplace=True)
    return df


def prepare_config(data_frame, parameters):
    print('step prepare_config')
    # secret stuff here
    pass


if __name__ == '__main__':
    pipe = PipelineController(
        name='UsingFunction',
        project='ClearMLTest',
        version='0.1.1',
        add_pipeline_tags=False,
    )
    pipe.set_default_execution_queue('MyQueue')
    # Dataset parameters
    pipe.add_parameter(name='dataset_id', description='Id of dataset task', default=None)
    pipe.add_parameter(name='dataset_project', description='Dataset project name', default='FraudDetection/Datasets/monthly')
    pipe.add_parameter(name='dataset_name', description='Name of dataset task', default='2022_02')
    pipe.add_parameter(name='dataset_filename', description='Name of the destination file', default='February_2022.csv')
    pipe_parameters = pipe.get_parameters()
    pipe.add_function_step(
        name='preprocessing',
        function=preprocessing,
        function_kwargs=dict(dataset_id='${pipeline.dataset_id}', dataset_project='${pipeline.dataset_project}', dataset_name='${pipeline.dataset_name}', dataset_filename='${pipeline.dataset_filename}'),
        function_return=['data_frame'],
        cache_executed_step=True,
    )
    pipe.add_function_step(
        name='prepare_config',
        parents=['preprocessing'],
        function=prepare_config,
        function_kwargs=dict(data_frame='${preprocessing.data_frame}', parameters=pipe_parameters),
        # function_return=[],
        cache_executed_step=True,
    )
    pipe.start(queue='MyQueue')
    # pipe.start_locally()
    print('process completed')
Then I start script on my development machine:
vscode@9598f9122548:/workspaces/scripts$ python3 pipeline_func.py
ClearML Task: created new task id=55b0e31a2a4a4aadb42c8e6a67815741
ClearML results page: http://clearml-webserver:80/projects/dd42a80e05bf48ec8692680c6a44aeaa/experiments/55b0e31a2a4a4aadb42c8e6a67815741/output/log
2022-06-07 12:34:20,865 - clearml.Task - INFO - No repository found, storing script code instead
ClearML pipeline page: http://clearml-webserver:80/pipelines/dd42a80e05bf48ec8692680c6a44aeaa/experiments/55b0e31a2a4a4aadb42c8e6a67815741
Traceback (most recent call last):
  File "pipeline_func.py", line 421, in <module>
    pipe.add_function_step(
  File "/usr/local/envs/rapids-22.02/lib/python3.8/site-packages/clearml/automation/controller.py", line 616, in add_function_step
    if step in self._nodes and artifact in self._nodes[step].return_artifacts:
TypeError: argument of type 'NoneType' is not iterable
Hi @lpogo, I cannot reproduce your issue, even though I used code similar to what you provided, with the same clearml and agent versions. Can you also send your server version? (You can find it on the settings page in the webapp.)
Hi @DavidNativ,
WebApp: 1.5.0-192 • Server: 1.5.0-192 • API: 2.18
Hi @lpogo,
I modified your code a tiny bit (removed the file processing and just returned an empty DataFrame in preprocessing, and changed the dataset_id default value, which BTW failed for me when it was None). And this seems to work...
from clearml import PipelineController


def preprocessing(dataset_id: str, dataset_project: str, dataset_name: str, dataset_filename: str):
    print(dataset_id, dataset_project, dataset_name, dataset_filename)
    print('step preprocessing')
    # import libs
    from clearml import Dataset
    import pandas as pd
    df = pd.DataFrame()
    return df


def prepare_config(data_frame, parameters):
    print('step prepare_config')
    # secret stuff here
    pass


if __name__ == '__main__':
    pipe = PipelineController(
        name='UsingFunction',
        project='ClearMLTest',
        version='0.1.1',
        add_pipeline_tags=False,
    )
    pipe.set_default_execution_queue('1xGPU')
    # Dataset parameters
    pipe.add_parameter(name='dataset_id', description='Id of dataset task', default='1234')
    pipe.add_parameter(name='dataset_project', description='Dataset project name', default='FraudDetection/Datasets/monthly')
    pipe.add_parameter(name='dataset_name', description='Name of dataset task', default='2022_02')
    pipe.add_parameter(name='dataset_filename', description='Name of the destination file', default='February_2022.csv')
    pipe_parameters = pipe.get_parameters()
    pipe.add_function_step(
        name='preprocessing',
        function=preprocessing,
        function_kwargs=dict(dataset_id='${pipeline.dataset_id}', dataset_project='${pipeline.dataset_project}', dataset_name='${pipeline.dataset_name}',
                             dataset_filename='${pipeline.dataset_filename}'),
        function_return=['data_frame'],
        cache_executed_step=True,
    )
    pipe.add_function_step(
        name='prepare_config',
        parents=['preprocessing'],
        function=prepare_config,
        function_kwargs=dict(data_frame='${preprocessing.data_frame}', parameters=pipe_parameters),
        # function_return=[],
        cache_executed_step=True,
    )
    pipe.start(queue='1xGPU')
    # pipe.start_locally(run_pipeline_steps_locally=True)
    print('process completed')
Can you give it a go and tell me if it works for you?
And just to double-check, can you please try installing 1.4.2rc1 (pip install clearml==1.4.2rc1)? There was a fix in this area which might magically solve the issue :)
Thanks! :)