Versioning for Graphs and Functions in Python SDK
Introduce Versioning for Graphs and Functions
Issue Description
Currently, the Indexify Python SDK lacks a robust versioning system for graphs and functions. This makes it challenging to manage changes over time, track the evolution of workflows, and ensure reproducibility of results. Implementing a versioning system will significantly improve the maintainability and reliability of Indexify workflows.
Current Limitations
- In `indexify/functions_sdk/graph.py`, the `Graph` class doesn't have any version information:

  ```python
  class Graph:
      def __init__(
          self, name: str, start_node: IndexifyFunction, description: Optional[str] = None
      ):
          self.name = name
          self.description = description
          self.nodes: Dict[str, Union[IndexifyFunction, IndexifyRouter]] = {}
          # ...
  ```
- The `indexify_function` decorator in `indexify/functions_sdk/indexify_functions.py` doesn't include version information:

  ```python
  def indexify_function(
      name: Optional[str] = None,
      description: Optional[str] = "",
      image: Optional[Image] = DEFAULT_IMAGE,
      accumulate: Optional[Type[BaseModel]] = None,
      payload_encoder: Optional[str] = "cloudpickle",
      placement_constraints: List[PlacementConstraints] = [],
  ):
      # ...
  ```
- When registering a compute graph in `indexify/remote_client.py`, there's no version handling:

  ```python
  def register_compute_graph(self, graph: Graph):
      graph_metadata = graph.definition()
      serialized_code = graph.serialize()
      response = self._post(
          f"namespaces/{self.namespace}/compute_graphs",
          files={"code": serialized_code},
          data={"compute_graph": graph_metadata.model_dump_json(exclude_none=True)},
      )
      # ...
  ```
Benefits of Versioning
- Reproducibility: Ensure that workflows can be reproduced exactly, even as individual functions or the overall graph structure evolves.
- Change Tracking: Easily track changes to functions and graphs over time, facilitating debugging and auditing.
- Collaboration: Enable multiple team members to work on the same workflow without conflicts.
- Rollback Capability: Quickly revert to previous versions of functions or entire graphs if issues are discovered.
- A/B Testing: Compare different versions of workflows or functions to optimize performance.
Proposed Solution
- Add version information to the `Graph` class
- Modify the `indexify_function` decorator to include version information
- Update the `register_compute_graph` method to handle versioning
- Implement version comparison and management utilities
- Update the `LocalClient` and `RemoteClient` classes to support versioning operations
- Modify the `Task` class in `indexify/executor/api_objects.py` to include version information
- Update all relevant tests to include version checks
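To make the first few items concrete, here is a minimal sketch of what explicit version information could look like. It is deliberately independent of the SDK: `Version`, `versioned_function`, and `VersionedGraph` are hypothetical stand-ins for illustration, not existing Indexify classes, and the `MAJOR.MINOR.PATCH` format and the `version` parameter are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True, order=True)
class Version:
    """Hypothetical MAJOR.MINOR.PATCH version; not part of the Indexify SDK."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        parts = text.split(".")
        if len(parts) != 3 or not all(p.isdecimal() for p in parts):
            raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
        major, minor, patch = (int(p) for p in parts)
        return cls(major, minor, patch)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def versioned_function(name: Optional[str] = None, version: str = "0.1.0"):
    """Hypothetical stand-in for `indexify_function` that also records a version."""
    parsed = Version.parse(version)  # fail early on a malformed version string

    def decorator(fn: Callable) -> Callable:
        fn.function_name = name or fn.__name__
        fn.version = parsed
        return fn

    return decorator


class VersionedGraph:
    """Hypothetical stand-in for `Graph` that carries a version."""

    def __init__(self, name: str, start_node: Callable, version: str = "0.1.0"):
        self.name = name
        self.version = Version.parse(version)
        self.nodes: Dict[str, Callable] = {start_node.function_name: start_node}

    def definition(self) -> dict:
        # Metadata a client could send at registration time, versions included.
        return {
            "name": self.name,
            "version": str(self.version),
            "nodes": {n: str(f.version) for n, f in self.nodes.items()},
        }


@versioned_function(version="1.2.0")
def extract_text(doc: str) -> str:
    return doc.strip()


graph = VersionedGraph("ingest", extract_text, version="2.0.0")
print(graph.definition())
```

Because `Version` is an ordered dataclass, comparisons are numeric per field rather than lexicographic on the string, so `1.10.0` sorts after `1.9.3`.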
Versioning is done automatically by the server when the code changes. Take a look at https://github.com/tensorlakeai/indexify/blob/main/python-sdk/tests/test_graph_update.py
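As a rough illustration of what "automatic versioning on code change" can mean in general, the sketch below bumps an integer version whenever the uploaded code bytes differ from the last upload. This is an assumption for illustration only: `AutoVersioner` is a hypothetical class, and nothing here is confirmed to match how the Indexify server actually detects changes.

```python
import hashlib
from typing import Dict, Tuple


class AutoVersioner:
    """Hypothetical: bumps an integer version when the uploaded code changes."""

    def __init__(self) -> None:
        # graph name -> (current version, SHA-256 digest of the last upload)
        self._state: Dict[str, Tuple[int, str]] = {}

    def register(self, name: str, serialized_code: bytes) -> int:
        digest = hashlib.sha256(serialized_code).hexdigest()
        version, last_digest = self._state.get(name, (0, ""))
        if digest != last_digest:
            # New or changed code: assign the next version number.
            version += 1
            self._state[name] = (version, digest)
        return version


versioner = AutoVersioner()
first = versioner.register("ingest", b"def f(x): return x")
same = versioner.register("ingest", b"def f(x): return x")
changed = versioner.register("ingest", b"def f(x): return x + 1")
print(first, same, changed)
```

Re-registering identical code keeps the version, and any byte-level change produces a new one, which is why this style of versioning needs no input from the user.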
@diptanu I did see that, but while `test_graph_update.py` demonstrates a basic form of updating a graph, it doesn't provide the comprehensive versioning system described in the issue.
Users don't have direct control over versioning through the SDK. That's okay, and it makes sense why it works that way, but it's also a bit limiting since they can't specify version numbers. Without explicit versioning, users might find it challenging to manage complex workflows or collaborate effectively, especially in larger teams.
More importantly, there seems to be no way to roll back to previous versions or manage multiple versions simultaneously, as pointed out in the issue. The SDK doesn't provide methods for users to query or inspect different versions of a graph or function.
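For reference, this is roughly the kind of query and rollback surface I have in mind. It is only a sketch: `GraphVersionRegistry` and its methods are hypothetical names, it stores everything in memory, and it assumes versions are simple integers starting at 1.

```python
from typing import Dict, List, Optional


class GraphVersionRegistry:
    """Hypothetical in-memory registry; the SDK has no such class today."""

    def __init__(self) -> None:
        self._history: Dict[str, List[dict]] = {}  # name -> definitions, oldest first
        self._active: Dict[str, int] = {}  # name -> currently active version

    def register(self, name: str, definition: dict) -> int:
        """Store a new definition and make it the active version."""
        history = self._history.setdefault(name, [])
        history.append(definition)
        self._active[name] = len(history)
        return len(history)

    def list_versions(self, name: str) -> List[int]:
        return list(range(1, len(self._history.get(name, [])) + 1))

    def get(self, name: str, version: Optional[int] = None) -> dict:
        """Return a specific version, or the active one if none is given."""
        if name not in self._history:
            raise KeyError(f"unknown graph {name!r}")
        number = self._active[name] if version is None else version
        if number not in self.list_versions(name):
            raise KeyError(f"{name!r} has no version {number}")
        return self._history[name][number - 1]

    def rollback(self, name: str, version: int) -> None:
        """Make an earlier version active without deleting later ones."""
        if version not in self.list_versions(name):
            raise KeyError(f"{name!r} has no version {version}")
        self._active[name] = version


registry = GraphVersionRegistry()
registry.register("ingest", {"nodes": ["extract_text"]})
registry.register("ingest", {"nodes": ["extract_text", "embed"]})
registry.rollback("ingest", 1)
print(registry.list_versions("ingest"), registry.get("ingest"))
```

Rollback here only moves the active pointer, so later versions stay available for inspection or side-by-side comparison.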