[Docs] Improve documentation around building a new Ocean integration

Open lordsarcastic opened this issue 8 months ago • 3 comments

User description

Description

What - Revamp the documentation around building a new integration

Why - This gives users a much clearer explanation of, and guidance on, developing an Ocean integration. It walks them through a demo that creates an integration from scratch and helps them avoid common errors and issues when creating Ocean integrations.

How - In this documentation guide, we walk users through creating their own integration (using the Jira integration) from scratch, following the best practices and philosophies adopted in Ocean integrations.

Type of change

Please leave one option from the following and delete the rest:

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] New Integration (non-breaking change which adds a new integration)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] Non-breaking change (fix of existing functionality that will not change current behavior)
  • [ ] Documentation (added/updated documentation)

All tests should be run against the Port production environment (using a testing org).

Core testing checklist

  • [ ] Integration able to create all default resources from scratch
  • [ ] Resync finishes successfully
  • [ ] Resync able to create entities
  • [ ] Resync able to update entities
  • [ ] Resync able to detect and delete entities
  • [ ] Scheduled resync able to abort existing resync and start a new one
  • [ ] Tested with at least 2 integrations from scratch
  • [ ] Tested with Kafka and Polling event listeners
  • [ ] Tested deletion of entities that don't pass the selector

Integration testing checklist

  • [ ] Integration able to create all default resources from scratch
  • [ ] Resync able to create entities
  • [ ] Resync able to update entities
  • [ ] Resync able to detect and delete entities
  • [ ] Resync finishes successfully
  • [ ] If new resource kind is added or updated in the integration, add example raw data, mapping and expected result to the examples folder in the integration directory.
  • [ ] If resource kind is updated, run the integration with the example data and check if the expected result is achieved
  • [ ] If new resource kind is added or updated, validate that live-events for that resource are working as expected
  • [ ] Docs PR link here

Preflight checklist

  • [ ] Handled rate limiting
  • [ ] Handled pagination
  • [ ] Implemented the code in async
  • [ ] Support Multi account

Screenshots

Include screenshots from your environment showing how the resources of the integration will look.

API Documentation

Provide links to the API documentation used for this integration.


PR Type

Documentation


Description

  • Revamped documentation for building Ocean integrations.

  • Added detailed guides for creating a Jira integration.

  • Introduced step-by-step instructions for configuration, testing, and publishing.

  • Enhanced clarity on integration structure and webhook implementation.


Changes walkthrough 📝

Relevant files
Documentation
11 files
  • _category_.json (+1/-1): Updated category label for integration documentation.
  • develop-an-integration.md (+4/-4): Revised introduction and steps for integration development.
  • defining-configuration-files.md (+318/-0): Added guide for defining configuration files in integrations.
  • getting-started.md (+14/-174): Simplified and refocused getting started guide.
  • implementing-an-api-client.md (+705/-0): Added detailed guide for implementing an API client.
  • implementing-webhooks.md (+224/-0): Added guide for implementing webhooks in integrations.
  • installing-ocean-cli-and-scaffolding-an-integration.md (+186/-0): Added guide for installing Ocean CLI and scaffolding integrations.
  • integration-configuration-and-kinds-in-ocean.md (+246/-0): Added guide for configuring integration kinds in Ocean.
  • publishing-your-integration.md (+57/-0): Added guide for publishing integrations to the Ocean repository.
  • sending-data-to-port-using-resync-functions.md (+258/-0): Added guide for sending data to Port using resync functions.
  • testing-the-integration.md (+151/-0): Added guide for testing integrations locally.

  • lordsarcastic avatar Mar 26 '25 13:03 lordsarcastic

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Code Consistency

    The JiraClient implementation contains a method refresh_request_auth_creds that is defined but never used in the code examples. This could confuse readers trying to implement the integration.

    def refresh_request_auth_creds(self, request: httpx.Request) -> httpx.Request:
        return next(self._get_bearer().auth_flow(request))
    
    Missing Import

    The main.py file references a UserWebhookProcessor that is added as a webhook processor, but this class is not imported at the top of the file and was not defined in the previous sections.

    ocean.add_webhook_processor("/webhook", UserWebhookProcessor)
    

    qodo-code-review[bot] avatar Mar 26 '25 13:03 qodo-code-review[bot]

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category · Suggestion · Impact
    Possible issue
    Retry after rate limiting

    The rate limit handler is called after logging the error, but the function then
    raises the exception without retrying the request. This means rate-limited
    requests will always fail. Modify the code to retry the request after handling
    the rate limit.

    docs/framework-guides/docs/getting-started/implementing-an-api-client.md [187-215]

     async def _send_api_request(
         self,
         method: str,
         url: str,
         params: dict[str, Any] | None = None,
         json: dict[str, Any] | None = None,
         headers: dict[str, str] | None = None,
     ) -> Any:
         try:
             async with self._semaphore:
                 response = await self.client.request(
                     method=method,
                     url=url,
                     params=params,
                     json=json,
                     headers=headers
                 )
                 response.raise_for_status()
                 return response.json()
         except httpx.HTTPStatusError as e:
    -        # If we hit a 429, handle it
    -        await self._handle_rate_limit(e.response)
    +        if e.response.status_code == 429:
    +            # If we hit a 429, handle it and retry
    +            await self._handle_rate_limit(e.response)
    +            return await self._send_api_request(method, url, params, json, headers)
             logger.error(
                 f"Jira API request failed with status {e.response.status_code}: {method} {url}"
             )
             raise
         except httpx.RequestError as e:
             logger.error(f"Failed to connect to Jira API: {method} {url} - {str(e)}")
             raise
    
    Suggestion importance[1-10]: 9

    Why: The suggestion fixes a critical issue where rate-limited requests always fail because the code doesn't retry after handling the rate limit. This change significantly improves the reliability of the API client when dealing with rate limits.
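
The recursive retry in the suggestion above has no upper bound, so a server that keeps answering 429 would keep the call recursing. A minimal sketch of a bounded alternative follows. It is illustrative only and is not the code from the PR: `RateLimited`, `send_with_retry`, and the `max_retries` and `sleep` parameters are hypothetical names, and the HTTP call is abstracted behind a `send` callable so the example runs with the standard library alone.

```python
import asyncio
from typing import Any, Awaitable, Callable


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 error carrying the server's wait hint."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


async def send_with_retry(
    send: Callable[[], Awaitable[Any]],
    max_retries: int = 3,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Any:
    """Call `send`, waiting and retrying on RateLimited at most `max_retries` times."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    for attempt in range(max_retries + 1):
        try:
            return await send()
        except RateLimited as exc:
            if attempt == max_retries:
                raise  # retry budget exhausted: surface the error to the caller
            await sleep(exc.retry_after)
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Using a loop with an explicit budget keeps the call stack flat and makes the failure mode explicit: once the budget is spent, the original error propagates instead of being retried forever.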

    Impact: High
    Remove undefined webhook processor

    The code references a UserWebhookProcessor that hasn't been defined or imported
    in the provided code. This will cause a runtime error when the integration
    starts.

    docs/framework-guides/docs/getting-started/sending-data-to-port-using-resync-functions.md [223-225]

     ocean.add_webhook_processor("/webhook", IssueWebhookProcessor)
     ocean.add_webhook_processor("/webhook", ProjectWebhookProcessor)
    -ocean.add_webhook_processor("/webhook", UserWebhookProcessor)
    
    Suggestion importance[1-10]: 9

    Why: The code references a UserWebhookProcessor that hasn't been defined or imported anywhere in the PR. This would cause a runtime error when the integration starts, as the class doesn't exist.

    Impact: High
    Fix infinite loop risk

    The pagination logic doesn't handle empty responses correctly. If the API
    returns an empty list but has a 'total' value greater than zero, the loop will
    continue indefinitely. Add a check to break the loop if no items are returned,
    regardless of the 'total' value.

    docs/framework-guides/docs/getting-started/implementing-an-api-client.md [242-265]

     async def _get_paginated_data(
         self,
         url: str,
         extract_key: str | None = None,
         initial_params: dict[str, Any] | None = None,
     ) -> AsyncGenerator[list[dict[str, Any]], None]:
         params = initial_params or {}
         params |= self._generate_base_req_params()
     
         start_at = 0
         while True:
             params["startAt"] = start_at
             response_data = await self._send_api_request("GET", url, params=params)
             items = response_data.get(extract_key, []) if extract_key else response_data
     
             if not items:
                 break
     
             yield items
             start_at += len(items)
     
    -        # Stop if we've reached the total
    -        if "total" in response_data and start_at >= response_data["total"]:
    +        # Stop if we've reached the total or if no items were returned
    +        if "total" in response_data and (start_at >= response_data["total"] or len(items) == 0):
                 break
    
    Suggestion importance[1-10]: 8

    Why: The suggestion addresses a critical bug where the pagination logic could enter an infinite loop if the API returns empty items but still has a 'total' value greater than zero. This is a significant issue that could cause the integration to hang.
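
Note that the quoted code already contains an `if not items: break` check before the `total` comparison, so an empty page ends the loop in both the original and the suggested version. The sketch below isolates that termination logic so it can be checked without a Jira instance. It is an assumption-laden illustration, not the PR's client: `paginate`, `collect`, and `fetch_page` are hypothetical names, and `fetch_page` stands in for the real HTTP request.

```python
import asyncio
from typing import Any, AsyncGenerator, Awaitable, Callable

Page = dict[str, Any]


async def paginate(
    fetch_page: Callable[[int], Awaitable[Page]],
    extract_key: str,
) -> AsyncGenerator[list[Any], None]:
    """Yield batches of items until a page is empty or the reported total is reached."""
    start_at = 0
    while True:
        page = await fetch_page(start_at)
        items = page.get(extract_key, [])
        if not items:
            break  # empty page: stop even if "total" claims more items exist
        yield items
        start_at += len(items)
        if "total" in page and start_at >= page["total"]:
            break  # all reported items have been yielded


async def collect(
    fetch_page: Callable[[int], Awaitable[Page]],
    extract_key: str,
) -> list[list[Any]]:
    """Drain the generator into a list of batches (helper for the usage example)."""
    return [batch async for batch in paginate(fetch_page, extract_key)]
```

With a fake `fetch_page` that reports `total: 10` but only ever holds three items, the generator yields the available batches and then stops on the first empty page rather than looping.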

    Impact: Medium
    Handle missing header safely

    The rate limit handler doesn't check if the 'Retry-After' header exists before
    accessing it. If Jira returns a 429 status without this header, the code will
    raise a KeyError. Add a check to handle this case safely.

    docs/framework-guides/docs/getting-started/implementing-an-api-client.md [180-185]

     async def _handle_rate_limit(self, response: Response) -> None:
         if response.status_code == 429:
    +        retry_after = response.headers.get("Retry-After", "60")
             logger.warning(
    -            f"Jira API rate limit reached. Waiting for {response.headers['Retry-After']} seconds."
    +            f"Jira API rate limit reached. Waiting for {retry_after} seconds."
             )
    -        await asyncio.sleep(int(response.headers["Retry-After"]))
    +        await asyncio.sleep(int(retry_after))
    
    Suggestion importance[1-10]: 7

    Why: This suggestion fixes a potential KeyError exception when the 'Retry-After' header is missing from a 429 response. Adding a fallback value improves error handling and prevents crashes during rate limiting scenarios.
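
The fallback in the suggestion covers a missing header, but `int(retry_after)` would still raise if the header carried an HTTP date, which the HTTP specification also permits for `Retry-After`. A hedged sketch of a more tolerant parser is shown below. `retry_after_seconds` and `DEFAULT_WAIT_SECONDS` are hypothetical names, the 60-second default simply mirrors the value used in the suggestion, and a plain mapping is used in place of a real response object (so header lookup here is case-sensitive, unlike in an HTTP client library).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

DEFAULT_WAIT_SECONDS = 60.0  # assumed fallback, mirroring the "60" in the suggestion


def retry_after_seconds(
    headers: Mapping[str, str],
    now: Optional[datetime] = None,
    default: float = DEFAULT_WAIT_SECONDS,
) -> float:
    """Return how long to wait, accepting delta-seconds or an HTTP date."""
    value = headers.get("Retry-After")
    if value is None:
        return default  # header missing: fall back instead of raising KeyError
    try:
        return max(0.0, float(int(value)))  # delta-seconds form, e.g. "120"
    except ValueError:
        pass  # not an integer: try the HTTP-date form next
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable value: fall back to the default wait
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
    current = now or datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```

The returned value can be passed straight to `asyncio.sleep`; clamping at zero avoids a negative sleep when the date in the header is already in the past.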

    Impact: Medium

    qodo-code-review[bot] avatar Mar 26 '25 13:03 qodo-code-review[bot]

    This pull request is automatically being deployed by Amplify Hosting (learn more).

    Access this pull request here: https://pr-1518.d1ftd8v2gowp8w.amplifyapp.com