gapic-generator-python icon indicating copy to clipboard operation
gapic-generator-python copied to clipboard

Resource path helper functions collide when resources in different packages share a name

Open parthea opened this issue 3 years ago • 7 comments

During review of https://github.com/googleapis/python-aiplatform/pull/777, @nicain found that function test_dataset_path is defined 3 times in file test_dataset_path.py here, here and here.

parthea avatar Nov 02 '21 10:11 parthea

I remember talking about this with @dizcology before - I believe the cause is multiple resources namedDataset referenced in that service.

  • 2 resources from other APIs (AutoML, Data Labeling) https://github.com/googleapis/googleapis/blob/4e836c7c257e3e20b1de14d470993a2b1f4736a8/google/cloud/aiplatform/v1beta1/migratable_resource.proto#L39-L46
option (google.api.resource_definition) = {
  type: "automl.googleapis.com/Dataset"
  pattern: "projects/{project}/locations/{location}/datasets/{dataset}"
};
option (google.api.resource_definition) = {
  type: "datalabeling.googleapis.com/Dataset"
  pattern: "projects/{project}/datasets/{dataset}"
};
  • 1 defined in AI Platform https://github.com/googleapis/googleapis/blob/8031cc26fe7f1163faf8778323c8762541c0aa6b/google/cloud/aiplatform/v1/dataset.proto#L36-L40
message Dataset {
  option (google.api.resource) = {
    type: "aiplatform.googleapis.com/Dataset"
    pattern: "projects/{project}/locations/{location}/datasets/{dataset}"
  };

I believe this is the only API where this occurs.

@software-dov Is there a good way to prevent these resources from clobbering each other?

busunkim96 avatar Nov 02 '21 20:11 busunkim96

Not only is the testcase triplicated, so are the dataset_path and parse_dataset_path methods (see https://github.com/googleapis/python-aiplatform/issues/897): the generated is capable of reordering them, which is a breaking change.

tseaver avatar Dec 08 '21 10:12 tseaver

@software-dov Is there a good way to prevent these resources from clobbering each other?

I don't know that there's a perfect-yet-backwards-compatible solution. We can definitely sort the resources by name to maintain stable order and easily detect collisions. The two main options I can think of are:

  1. Pick one resource if there's a name collision
  2. Disambiguate names if there's a collision

I'm guessing most people would prefer 2.

software-dov avatar Jan 04 '22 22:01 software-dov

@software-dov Yep I vote for 2.

Can the resources from other APIs be disambiguated with the service name?

  • dataset_path / parse_dataset_path for the aiplatform one
  • automl_dataset_path / parse_automl_dataset_path
  • datalabeling_dataset_path / parse_datalabeling_dataset_path.

Luckily it looks like the aiplatform version of dataset (projects/{project}/locations/{location}/datasets/{dataset}) is the last to be defined in the current published library, so the change would not be breaking.

busunkim96 avatar Jan 14 '22 21:01 busunkim96

@busunkim96 Seems like a good solution. More generally, we can make a map from resource name to a list of resources sorted on proximity (ours vs theirs) then by name.

software-dov avatar Jan 19 '22 20:01 software-dov

@parthea any updates

atulep avatar May 03 '22 21:05 atulep

This work is in progress and planned for the current iteration with a target date of May 13th

parthea avatar May 05 '22 11:05 parthea