
Documentation doesn't work for consuming V1 tabular datasets on V2 jobs

Open thegitofdaniel opened this issue 2 years ago • 2 comments

The SDK v2 example for consuming V1 tabular datasets in SDK v2 jobs does not work. I think the parameters are no longer up to date. Mode and type can be passed as strings; passing the dataset as path as shown returns an error.

It is also not clear how the .py script that will receive the data should ingest it.
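For illustration only, here is a minimal sketch of the receiving side. It assumes the job command passes the input to the script as `--input_data ${{inputs.input_data}}`, so the script only sees a local path. The argument name and the helper `list_input` are hypothetical, and the sketch only inspects what arrives at that path; it does not claim how a V1 tabular dataset should be parsed.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    # "--input_data" is a hypothetical argument name; it must match the
    # name used in the job's command string.
    parser = argparse.ArgumentParser()
    parser.add_argument("--input_data", type=str, required=True)
    return parser.parse_args(argv)


def list_input(path):
    # Return the relative paths found under a directory, or the file name
    # if the path is not a directory (for example, a single file).
    root = Path(path)
    if root.is_dir():
        return sorted(str(p.relative_to(root)) for p in root.rglob("*"))
    return [root.name]


def main(argv=None):
    args = parse_args(argv)
    for entry in list_input(args.input_data):
        print(entry)

# In the real script, call main() at module level so the job can run it.
```

Printing the listing from inside a job is a cheap way to see what the chosen mode actually delivers before writing any parsing code.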



thegitofdaniel avatar Dec 14 '22 16:12 thegitofdaniel

@thegitofdaniel

Thanks for your feedback! We will investigate and update as appropriate.

@thegitofdaniel I have assigned this to content author @ynpandey to check and share his valuable insights on this.

Naveenommi-MSFT avatar Dec 15 '22 11:12 Naveenommi-MSFT

Was this resolved? I have issues passing an SDK v1 data asset (mltable) to an SDK v2 training job where the input data is specified as follows:

my_job_inputs = {
    "input_data": Input(
            type=AssetTypes.MLTABLE, 
            path=filedataset_asset,
            mode=InputOutputModes.DIRECT
    )
}

job = command(
    inputs=my_job_inputs,
...

I get the error:

ml_client.create_or_update(job) Exception: Error: 1) One or more fields are invalid Details: Could not parse Data({'skip_validation': False, 'mltable_schema_url': None, 'referenced_uris': ...

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/tracing/decorator.py:78, in distributed_trace.<locals>.decorator.<locals>.wrapper_use_tracer(*args, **kwargs)
     76 span_impl_type = settings.tracing_implementation()
     77 if span_impl_type is None:
---> 78     return func(*args, **kwargs)
     80 # Merge span is parameter is set, but only if no explicit parent are passed
     81 if merge_span and not passed_in_parent:

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py:333, in monitor_with_telemetry_mixin.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
    331 dimensions = {**parameter_dimensions, **(custom_dimensions or {})}
    332 with log_activity(logger, activity_name or f.__name__, activity_type, dimensions) as activityLogger:
--> 333     return_value = f(*args, **kwargs)
    334     if not parameter_dimensions:
    335         # collect from return if no dimensions from parameter
    336         activityLogger.activity_info.update(_collect_from_return_value(return_value))

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/operations/_job_operations.py:563, in JobOperations.create_or_update(self, job, description, compute, tags, experiment_name, skip_validation, **kwargs)
    561 except Exception as ex:  # pylint: disable=broad-except
    562     if isinstance(ex, (ValidationException, SchemaValidationError)):
--> 563         log_and_raise_error(ex)
    564     else:
    565         raise ex

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_exception_helper.py:277, in log_and_raise_error(error, debug, yaml_operation)
    274 else:
    275     raise error
--> 277 raise Exception(formatted_error)

Exception: 

Error: 

1) One or more fields are invalid

Details: 

Could not parse Data({'skip_validation': False, 'mltable_schema_url': None, 'referenced_uris': None, 'type': 'mltable', 'is_anonymous': False, 'auto_increment_version': False, 'name': 'Diabetes data asset', 'description': 'Diabetes dataset as a data asset', 'tags': {'format': 'CSV'}, 'properties': {'v1_type': 'tabular'}, 

corticalstack avatar Feb 17 '23 09:02 corticalstack

Mode should be eval_mount
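A minimal sketch of what that change might look like, using plain strings for type and mode as the original report says is allowed. The helper name, the asset name, and the version are hypothetical, and the `azureml:<name>:<version>` path form is an assumption about how the registered asset is referenced, not something confirmed in this thread.

```python
def v1_tabular_input_kwargs(asset_name, asset_version):
    # Keyword arguments for azure.ai.ml.Input, with mode set to
    # "eval_mount" instead of "direct".
    return {
        "type": "mltable",
        # Assumed asset reference format; adjust to how the asset is registered.
        "path": f"azureml:{asset_name}:{asset_version}",
        "mode": "eval_mount",
    }


# Hypothetical asset name and version, for illustration only.
kwargs = v1_tabular_input_kwargs("my_v1_tabular_dataset", "1")

# With azure-ai-ml installed, the dictionary can be unpacked into Input:
#   from azure.ai.ml import Input
#   my_job_inputs = {"input_data": Input(**kwargs)}
```

The dictionary is built separately so the values can be checked without a workspace connection.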

samuel100 avatar Jul 04 '23 16:07 samuel100

#please-close

samuel100 avatar Jul 04 '23 16:07 samuel100