azure-docs
Documentation doesn't work for consuming V1 tabular datasets on V2 jobs
The SDK v2 example for consuming a V1 tabular dataset in a v2 job does not work; the parameters appear to be out of date. `mode` and `type` can be passed as strings, but passing the dataset as `path` as shown returns an error.
It is also unclear how the .py script that will receive the data should ingest it.
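For context, here is a minimal sketch of how I would expect the receiving script to pick up the input. This is my assumption, not something from the docs: the argument name `--input_data` and the `${{inputs.input_data}}` plumbing in the job command are hypothetical.

```python
# Hypothetical receiving script: the job's command would pass the input as an
# argument, e.g. "python script.py --input_data ${{inputs.input_data}}".
import argparse


def parse_job_args(argv=None):
    parser = argparse.ArgumentParser()
    # With a "direct" mode the value arrives as a URI string; with a mount
    # mode it arrives as a local path on the compute node.
    parser.add_argument("--input_data", type=str, required=True)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_job_args()
    # For an mltable input, the script would then materialize it, e.g.
    # (assumption):
    #   import mltable
    #   df = mltable.load(args.input_data).to_pandas_dataframe()
    print(args.input_data)
```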
Document Details
⚠ Do not edit this section. It is required for learn.microsoft.com ➟ GitHub issue linking.
- ID: c56bf00c-ec47-4a12-9a87-137fb29893e0
- Version Independent ID: 86302e6a-7281-fb13-3702-63ea37622606
- Content: Access data in a job - Azure Machine Learning
- Content Source: articles/machine-learning/how-to-read-write-data-v2.md
- Service: machine-learning
- Sub-service: mldata
- GitHub Login: @ynpandey
- Microsoft Alias: yogipandey
@thegitofdaniel
Thanks for your feedback! We will investigate and update as appropriate.
@thegitofdaniel I have assigned this to content author @ynpandey to check and share his valuable insights on this.
Was this resolved? I have issues passing an SDK v1 data asset (mltable) to an SDK v2 training job where the input data is specified as follows:
```python
my_job_inputs = {
    "input_data": Input(
        type=AssetTypes.MLTABLE,
        path=filedataset_asset,
        mode=InputOutputModes.DIRECT
    )
}

job = command(
    inputs=my_job_inputs,
    ...
```
Calling `ml_client.create_or_update(job)` raises the error below:
```
File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/tracing/decorator.py:78, in distributed_trace.<locals>.decorator.<locals>.wrapper_use_tracer(*args, **kwargs)
     76 span_impl_type = settings.tracing_implementation()
     77 if span_impl_type is None:
---> 78     return func(*args, **kwargs)
     80 # Merge span is parameter is set, but only if no explicit parent are passed
     81 if merge_span and not passed_in_parent:

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py:333, in monitor_with_telemetry_mixin.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
    331 dimensions = {**parameter_dimensions, **(custom_dimensions or {})}
    332 with log_activity(logger, activity_name or f.__name__, activity_type, dimensions) as activityLogger:
--> 333     return_value = f(*args, **kwargs)
    334     if not parameter_dimensions:
    335         # collect from return if no dimensions from parameter
    336         activityLogger.activity_info.update(_collect_from_return_value(return_value))

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/operations/_job_operations.py:563, in JobOperations.create_or_update(self, job, description, compute, tags, experiment_name, skip_validation, **kwargs)
    561 except Exception as ex:  # pylint: disable=broad-except
    562     if isinstance(ex, (ValidationException, SchemaValidationError)):
--> 563         log_and_raise_error(ex)
    564     else:
    565         raise ex

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_exception_helper.py:277, in log_and_raise_error(error, debug, yaml_operation)
    274 else:
    275     raise error
--> 277 raise Exception(formatted_error)

Exception:
 Error:
 1) One or more fields are invalid
 Details:

 Could not parse Data({'skip_validation': False, 'mltable_schema_url': None, 'referenced_uris': None, 'type': 'mltable', 'is_anonymous': False, 'auto_increment_version': False, 'name': 'Diabetes data asset', 'description': 'Diabetes dataset as a data asset', 'tags': {'format': 'CSV'}, 'properties': {'v1_type': 'tabular'},
```
Mode should be `eval_mount`, not `direct`, for a V1 tabular dataset asset.
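As a sketch of the fix (the helper name below is hypothetical; per the original post, `type` and `mode` accept plain strings), the input keyword arguments would look like this:

```python
# Hypothetical helper that builds the keyword arguments one would pass to
# azure.ai.ml.Input for a V1 tabular dataset registered as an mltable asset.
# Per this thread, the mode must be "eval_mount" rather than "direct".
def v1_tabular_input_kwargs(path):
    return {
        "type": "mltable",     # AssetTypes.MLTABLE as a plain string
        "path": path,
        "mode": "eval_mount",  # InputOutputModes.EVAL_MOUNT as a plain string
    }
```

With the real SDK this would be used as `Input(**v1_tabular_input_kwargs(filedataset_asset))` inside the `my_job_inputs` dictionary shown above.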
#please-close