gravitino icon indicating copy to clipboard operation
gravitino copied to clipboard

[Improvement] Inconsistencies Between Python and Java APIs

Open AndreVale69 opened this issue 10 months ago • 5 comments

What would you like to be improved?

While working on issue #6577, I noticed (thanks to @yuqi1129, PR #6581) that there are some inconsistencies between APIs written on Python and Java. I think that this inconsistency can lead to confusion and errors.

I would like to suggest to adjust the signature functions written on Python. Here is an approximate list of incosisntecy that I found:

  • On the gravitino client you can see the create_catalog method: https://github.com/apache/gravitino/blob/873c6afeb3b0a5ebdb6d64b1d279442ac5e3115c/clients/client-python/gravitino/client/gravitino_client.py#L77-L87 https://github.com/apache/gravitino/blob/873c6afeb3b0a5ebdb6d64b1d279442ac5e3115c/clients/client-java/src/main/java/org/apache/gravitino/client/GravitinoClient.java#L118-L127 Or the same with load_catalog: https://github.com/apache/gravitino/blob/873c6afeb3b0a5ebdb6d64b1d279442ac5e3115c/clients/client-python/gravitino/client/gravitino_client.py#L74-L75 https://github.com/apache/gravitino/blob/873c6afeb3b0a5ebdb6d64b1d279442ac5e3115c/clients/client-java/src/main/java/org/apache/gravitino/client/GravitinoClient.java#L113-L116
  • On metalake you can see the same problem: https://github.com/apache/gravitino/blob/873c6afeb3b0a5ebdb6d64b1d279442ac5e3115c/clients/client-python/gravitino/client/gravitino_metalake.py#L118-L125 https://github.com/apache/gravitino/blob/873c6afeb3b0a5ebdb6d64b1d279442ac5e3115c/clients/client-java/src/main/java/org/apache/gravitino/client/GravitinoMetalake.java#L215-L222

I'm a new contributor to this repository (I only made a one-line commit), so if you have any suggestions, I'm ready to hear them!

How should we improve?

I suggest adapting the function signatures in the Python client to match those in the Java client. I think this will improve consistency and reduce potential bugs (for example, issue #6577 is born out of this inconsistency).

I would like to take up this issue and work on resolving it.

AndreVale69 avatar Mar 04 '25 10:03 AndreVale69

If it is okay for you, I will work on this issue. Can you please assign me? Since it is not a short-term task.

Solving this issue will close #6577 (and this one), and close PR #6581 as unmerged.

I tagged you @justinmclean because I believe you're one of the moderators. I hope this isn't a problem, I'm a new contributor to this repo. Thanks for your help and time!

AndreVale69 avatar Mar 06 '25 18:03 AndreVale69

hi @AndreVale69 sure, I assgined this to you. Thank you for your interest Gravitino!

xunliu avatar Apr 09 '25 00:04 xunliu

hi @AndreVale69 sure, I assgined this to you. Thank you for your interest Gravitino!

Hi @xunliu, and thanks for your response!

While working on the issue #6581, @yuqi1129 proposed a refactoring. However, this work has not been started because @jerryshao mentioned:

https://github.com/apache/gravitino/pull/6581#issuecomment-2705436483 As for the name consistency between Java and Python, since we already use catalog_type as a parameter name in Python, the change will break the compatibility, it is not worth doing for now.

Starting work on this isn't a problem in itself, but it would introduce several breaking changes.

Finally, based on previous issues, it seems that a "divide et impera" approach works better (and makes reviews easier for you). So this issue will serve as the parent, and I will create child issues accordingly.

AndreVale69 avatar Apr 09 '25 04:04 AndreVale69

Please do not break the compatibility when you change the method signature, I think this is the most important thing to me.

jerryshao avatar Apr 09 '25 16:04 jerryshao

Please do not break the compatibility when you change the method signature, I think this is the most important thing to me.

Maybe this is a stupid question, but how can I change a function signature without breaking backward compatibility? I understand that changing the order of parameters would definitely break things, but even just renaming a parameter would affect keyword-style calls (e.g., foo(old_name=...)).


Edit: As a solution to maintain backward compatibility, I will introduce a compatibility wrapper that maps deprecated keyword arguments (e.g., name) to their updated names (e.g., catalogName) without breaking existing calls.

To encourage users to migrate to the new API, the wrapper will emit warnings when deprecated parameters are used.

This compatibility logic will be applied only to public API functions, and not to internal implementation details.

This approach is inspired by practices used in libraries like TensorFlow: deprecated_args method.

An example: map_fn.py - line 42

AndreVale69 avatar Apr 09 '25 17:04 AndreVale69