azure-docs Documentation doesn't work for consuming V1 tabular datasets on V2 jobs

The SDK v2 example for consuming V1 tabular datasets in SDK v2 jobs does not work. I think the parameters are no longer up to date. Mode and type can be passed as strings, but passing the dataset as path as shown returns an error.

It is also not clear how the .py script that will receive the data should ingest it.
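
One possible shape for the receiving script is sketched below. It is not taken from the documentation. It assumes the job hands the script a folder path (for example when the input is mounted, as with the eval_mount mode suggested later in this thread) and that the underlying files are CSV. The argument name `--input_data` and the helper names `read_mounted_csvs` and `parse_args` are hypothetical.

```python
import argparse
import csv
from pathlib import Path


def read_mounted_csvs(folder):
    """Read every CSV file found under `folder` into a list of row dicts."""
    rows = []
    # Sort the paths so the row order is deterministic across runs.
    for csv_path in sorted(Path(folder).rglob("*.csv")):
        with csv_path.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


def parse_args(argv=None):
    """Parse the input location that the job passes on the command line."""
    parser = argparse.ArgumentParser()
    # Assumed argument name; it must match the placeholder in the job command.
    parser.add_argument("--input_data", type=str, required=True)
    return parser.parse_args(argv)
```

In the script itself this would be used as `rows = read_mounted_csvs(parse_args().input_data)`, with the job command passing the input along the lines of `python train.py --input_data ${{inputs.input_data}}` (the script name `train.py` is assumed here).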

Document Details

⚠ Do not edit this section. It is required for learn.microsoft.com ➟ GitHub issue linking.

ID: c56bf00c-ec47-4a12-9a87-137fb29893e0
Version Independent ID: 86302e6a-7281-fb13-3702-63ea37622606
Content: Access data in a job - Azure Machine Learning
Content Source: articles/machine-learning/how-to-read-write-data-v2.md
Service: machine-learning
Sub-service: mldata
GitHub Login: @ynpandey
Microsoft Alias: yogipandey

Dec 14 '22 16:12 thegitofdaniel

@thegitofdaniel

Thanks for your feedback! We will investigate and update as appropriate.

Dec 14 '22 16:12 RamanathanChinnappan-MSFT

@thegitofdaniel I have assigned this to content author @ynpandey to check and share his valuable insights on this.

Dec 15 '22 11:12 Naveenommi-MSFT

Was this resolved? I have issues passing an SDK v1 data asset (mltable) to an SDK v2 training job where the input data is specified as follows:

my_job_inputs = {
    "input_data": Input(
        type=AssetTypes.MLTABLE,
        path=filedataset_asset,
        mode=InputOutputModes.DIRECT
    )
}

job = command(
    inputs=my_job_inputs,
...

I get the error:

ml_client.create_or_update(job) Exception: Error: 1) One or more fields are invalid Details: Could not parse Data({'skip_validation': False, 'mltable_schema_url': None, 'referenced_uris': ...

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/tracing/decorator.py:78, in distributed_trace.<locals>.decorator.<locals>.wrapper_use_tracer(*args, **kwargs)
     76 span_impl_type = settings.tracing_implementation()
     77 if span_impl_type is None:
---> 78     return func(*args, **kwargs)
     80 # Merge span is parameter is set, but only if no explicit parent are passed
     81 if merge_span and not passed_in_parent:

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py:333, in monitor_with_telemetry_mixin.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
    331 dimensions = {**parameter_dimensions, **(custom_dimensions or {})}
    332 with log_activity(logger, activity_name or f.__name__, activity_type, dimensions) as activityLogger:
--> 333     return_value = f(*args, **kwargs)
    334     if not parameter_dimensions:
    335         # collect from return if no dimensions from parameter
    336         activityLogger.activity_info.update(_collect_from_return_value(return_value))

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/operations/_job_operations.py:563, in JobOperations.create_or_update(self, job, description, compute, tags, experiment_name, skip_validation, **kwargs)
    561 except Exception as ex:  # pylint: disable=broad-except
    562     if isinstance(ex, (ValidationException, SchemaValidationError)):
--> 563         log_and_raise_error(ex)
    564     else:
    565         raise ex

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_exception_helper.py:277, in log_and_raise_error(error, debug, yaml_operation)
    274 else:
    275     raise error
--> 277 raise Exception(formatted_error)

Exception: 

Error: 

1) One or more fields are invalid

Details: 

Could not parse Data({'skip_validation': False, 'mltable_schema_url': None, 'referenced_uris': None, 'type': 'mltable', 'is_anonymous': False, 'auto_increment_version': False, 'name': 'Diabetes data asset', 'description': 'Diabetes dataset as a data asset', 'tags': {'format': 'CSV'}, 'properties': {'v1_type': 'tabular'},

Feb 17 '23 09:02 corticalstack

Mode should be eval_mount
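
A minimal sketch of what that change could look like follows. It is not a confirmed fix. It assumes that the "Could not parse Data(...)" error comes from passing a Data object as `path`, so a string reference is passed instead (for example `azureml:<name>:<version>` or the asset's `id`). Type and mode are given as the plain strings `"mltable"` and `"eval_mount"`, which match `AssetTypes.MLTABLE` and `InputOutputModes.EVAL_MOUNT`. The helper names `eval_mount_input_kwargs` and `build_job_inputs` are hypothetical.

```python
def eval_mount_input_kwargs(asset_path):
    """Keyword arguments for an MLTable job input consumed in eval_mount mode."""
    return {
        "type": "mltable",     # same value as AssetTypes.MLTABLE
        "path": asset_path,    # assumed: a string reference, not a Data object
        "mode": "eval_mount",  # same value as InputOutputModes.EVAL_MOUNT
    }


def build_job_inputs(asset_path):
    """Build the `inputs` mapping for `command(...)` from a string asset reference."""
    # Imported lazily so the helper above can be used without azure-ai-ml installed.
    from azure.ai.ml import Input

    return {"input_data": Input(**eval_mount_input_kwargs(asset_path))}
```

The result of `build_job_inputs("azureml:<name>:<version>")` would then take the place of `my_job_inputs` in the `command(...)` call quoted earlier in the thread.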

Jul 04 '23 16:07 samuel100

#please-close

Jul 04 '23 16:07 samuel100
