Improve data type visualization support for the Ballerina module
Current Limitation
Currently, MI is unaware of Ballerina record data types, which blocks the use of Ballerina modules and the import of Ballerina connectors as MI connectors.
Suggested Improvement
Bring the Ballerina data type viewer and record support to the MI VSCode plugin.
We had an initial discussion with the Tooling team to explore the feasibility of reusing the data type visualizer from the BI VSCode extension in the MI extension and saving the data types as Ballerina records. During the meeting, it was agreed that saving data types as JSON Schemas is more suitable than using Ballerina records, due to the complexity of type conversions and the additional requirement of adding the Ballerina LS as a dependency of the MI extension.
It was also decided to prepare a design document and UI wireframes to facilitate further discussion on the feature. I am currently working on the design document.
The following is a summary of the design we plan to implement in order to add data type support to the MI VSCode extension.
Proposed solution
- Data schemas will be stored as JSON Schema files under the project directory: `resources/dataTypes/`.
- Each mediator will display both its input and output data schemas, which will be used to support suggestions in the expression editor.
- Users can leverage the Data Mapper to define or derive the expected input schema for endpoints or connectors, improving clarity and continuity in the integration flow.
- To support schema tracking, introduce two new attributes for mediators that modify the payload:
  - `input-payload-schema` – represents the data structure before the mediator.
  - `output-payload-schema` – represents the data structure after the mediator.

```xml
<payloadFactory output-payload-schema="sample-schema.json">
```
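For illustration, a schema file referenced as `sample-schema.json` could live at `resources/dataTypes/sample-schema.json`; the property names below are hypothetical, only the overall shape (a standard JSON Schema document) is what the design proposes:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "sample-schema",
  "type": "object",
  "properties": {
    "orderId": { "type": "string" },
    "quantity": { "type": "integer" }
  },
  "required": ["orderId"]
}
```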
Initial Data Type Definition
- Creating APIs from OpenAPI definitions/WSDL files
- Automatically extract request and response schemas from the linked Swagger or WSDL files.
- Use these schemas as the initial data type for the Tryout feature and for downstream mediators.
- At entry points (API resources, proxies, inbound endpoints, etc.)
- When creating an initial entry point, optionally allow the user to define a data type, which will serve as the output payload schema.
- This defined data type will appear in a new "Data Types" tab in the side panel at the start node. Within this tab, users should be able to edit, delete, or create a data type. Only one data type can exist at a time for a given start node.
- If a data type is already defined in the Data Types tab, the Tryout tab can automatically generate a default payload using placeholder values for each key. The user can then edit this payload and save it with a sample name.
- If the data type is not defined:
- When the user defines a payload manually in the Tryout tab, provide an option to save the derived schema as a new data type.
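To make the "save the derived schema" step above concrete, here is a minimal sketch (not the actual extension code) of inferring a JSON Schema from a sample Tryout payload; the sample field names are invented for illustration:

```python
# Sketch: derive a minimal JSON Schema from a sample payload entered in the
# Tryout tab, so it can be saved as a reusable data type under
# resources/dataTypes/. Not the actual MI implementation.
import json

def infer_schema(value):
    """Map a JSON value to a minimal JSON Schema fragment."""
    if isinstance(value, bool):          # check bool before int (bool is an int subclass)
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        items = infer_schema(value[0]) if value else {}
        return {"type": "array", "items": items}
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {k: infer_schema(v) for k, v in value.items()},
            "required": sorted(value.keys()),
        }
    return {}  # null or unsupported: leave unconstrained

# Hypothetical Tryout payload
sample = {"orderId": "A-100", "quantity": 2, "express": True}
print(json.dumps(infer_schema(sample), indent=2))
```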
Mediation Flow Data Types
In the Data Types tab of the side panel, users can either manually define the response schema or infer it from the Tryout feature. Depending on the mediator, the input and output data types may be either read-only or editable.
- For mediators like Variable or Payload, the input payload is inherited from the previous mediator and is therefore read-only.
- For mediators that do not modify the payload, such as Variable or Log, the output payload remains the same as the input and is also read-only.
- For mediators that modify the payload, such as Payload Factory, XSLT, etc., the output payload can either be:
- Manually defined by the user, or
- Inferred from the Tryout feature based on the response structure.
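The read-only propagation rules above can be sketched as follows; this is an illustrative model only (the `modifies_payload` flag and mediator names are assumptions, not the extension's actual data model):

```python
# Sketch of schema propagation through a mediation flow: mediators that do not
# modify the payload pass their input schema through unchanged, while
# payload-modifying mediators take a user-defined or Tryout-inferred schema.
def propagate_schemas(entry_schema, mediators):
    """Return (name, input_schema, output_schema) per mediator in flow order."""
    result = []
    current = entry_schema
    for m in mediators:
        input_schema = current                       # inherited, hence read-only
        if m.get("modifies_payload"):
            # e.g. Payload Factory, XSLT: must be defined manually or inferred
            output_schema = m.get("output_schema", {})
        else:
            # e.g. Variable, Log: output is identical to input, also read-only
            output_schema = input_schema
        result.append((m["name"], input_schema, output_schema))
        current = output_schema
    return result

flow = [
    {"name": "log", "modifies_payload": False},
    {"name": "payloadFactory", "modifies_payload": True,
     "output_schema": {"type": "object"}},
    {"name": "variable", "modifies_payload": False},
]
for name, inp, out in propagate_schemas({"type": "string"}, flow):
    print(name, inp, "->", out)
```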
Connector modules
MI Connectors and Ballerina connectors:
- For connector modules, the input and output schemas are predefined. These schemas should be linked to the mediation flow to enable suggestions in the expression editor.
HTTP requests
- If a Swagger definition is provided, extract input and output schemas directly and show them in the connection.
- If Swagger is not available:
- Allow the user to define an input schema manually.
- Use the response from a Tryout run to infer and populate the output schema.
In both scenarios above, the expected input schema for the connector and the payload from the previous mediator may not match.
To address this:
- When a payload mismatch is detected, provide an option to insert a Data Mapper between the mediators.
- This allows users to transform the payload to match the expected schema before invoking the connector or HTTP request.
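A mismatch check of the kind described above could look like the following sketch; the comparison logic (required properties present, matching `type` keywords) is an assumption about one reasonable heuristic, not the actual MI implementation:

```python
# Sketch: compare a connector's expected input schema with the previous
# mediator's output schema and report which required properties are missing
# or have a different type. Schemas are plain JSON Schema objects.
def find_mismatches(upstream_output, expected_input):
    problems = []
    out_props = upstream_output.get("properties", {})
    in_props = expected_input.get("properties", {})
    for name in expected_input.get("required", []):
        if name not in out_props:
            problems.append(f"missing required property: {name}")
        elif out_props[name].get("type") != in_props.get(name, {}).get("type"):
            problems.append(f"type mismatch for property: {name}")
    return problems

# Hypothetical schemas: the upstream mediator emits only "id", but the
# connector also requires a numeric "amount".
upstream = {"type": "object",
            "properties": {"id": {"type": "string"}},
            "required": ["id"]}
expected = {"type": "object",
            "properties": {"id": {"type": "string"},
                           "amount": {"type": "number"}},
            "required": ["id", "amount"]}

issues = find_mismatches(upstream, expected)
if issues:
    # In the proposed UI, this is where the "insert a Data Mapper" prompt fires.
    print("Payload mismatch:", "; ".join(issues))
```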
I have started the implementation. First, I will add support for defining a data type for API resources and then use it in the expression editor.
We had a review meeting with the MI and Tooling teams to further discuss the design and UI components. In the meeting, it was decided that we need to further analyse and identify use cases for schema representation in Synapse, and come up with a language-level design for how schemas are represented and used in integrations, before moving on to UI development.
Update: Based on the concerns raised during the last meeting, specifically the inconvenience of requiring users to name each schema manually, I explored an alternative approach where schema names are automatically generated. This approach would eliminate the need for user-defined names by generating unique identifiers internally and saving the schemas automatically.
To evaluate its feasibility, I completed a partial implementation and scheduled a discussion to review the potential drawbacks. The main issues identified with this approach were:
- Reference management: If the schema name is derived from the artifact name, renaming the artifact would require updating all references.
- Reusability: Automatically named schemas would not support reusability across artifacts.
- Readability: using random UUIDs as names negatively affects readability and manageability.
Following the discussion, we concluded that the original approach, allowing users to name schemas and reuse them across artifacts, is more suitable. I've already created initial wireframes to support this design and completed part of the implementation to enable schema saving during API creation. I will proceed with extending this functionality to other artifact types as well.
Progress update: I have completed the VSCode-side implementation for creating schemas, generating schemas from Swagger definitions, and updating the side panel to display the payloads of each mediator. I am now working on the related LS-side implementation.