AutoGPT icon indicating copy to clipboard operation
AutoGPT copied to clipboard

feat(iffy): Add initial iffy moderation

Open Bentlybro opened this issue 8 months ago β€’ 20 comments

This is the initial code for the iffy moderation setup.

Changes πŸ—οΈ

In backend/executor/manager.py I have had to make it so add_execution is synchronous but calls an async implementation because we need to perform content moderation with send_to_iffy, which is an async function, if there is a better way to do this please let me know, Then further down we collect all the blocks and the info and send it all to iffy

In backend/util/iffy.py this is where we have the actual send_to_iffy function, this gets all of the data and formats it + gets the other data we need like username and userid + adds the metadata like graphExecutionId ect, then combines it all together into a payload to be sent to iffy

In backend/server/v2/iffy/routes.py i have added @iffy_router.post("/webhook") to receive the returned info from iffy's webhook, then we split it into two.

handle_record_event deals with flagged blocks, if any are flagged it gets the graph_exec_id from the metadata, then calls execution_manager.cancel_execution(graph_exec_id) to stop the graph from running.

and handle_user_event, currently we dont have any systems setup yet to deal with suspending/banning a user so for now i just log the data and return 200 to prevent iffy from sending the data again. The next TODO: would be to setup a system to actually use this info so we can suspend/ban users

I have added a fallback / backup incase iffy fails to reply/if iffy is down in backend/util/openrouter.py we are just using OpenRouter and making it pick any model and just asking it if it things the content is safe or not, if flagged it stops the graph from running. TODO: Improve this and the prompting down the line

In backend/server/rest_api.py i am just adding the router for /api/iffy

This current setup does work but there are some issues that i have been trying to fix and i feel some other eyes on this may help.

  • Sometimes the executor is faster than iffy, so this means the executor runs before iffy has had time to process and reply, I tried to make it check the blocks before we start the executor but i was having issues getting all the block data so if any one has any suggestion's for that please let me know,
  • Currently we are only checking the blocks at the start of the execution, we should probably check the blocks during execution too because i setup a graph that uses pastebin to get data that should be flagged, but its missed because we only check at the start, i tried to add this but we currently dont have a way to "pause"/"slow down" a execution so the execution would run and finish before iffy has checked it all.
  • One thing I noticed is if a graph has multiple blocks that are flagged, it will try to stop the execution multiple times based on how many flagged blocks there are, I need to fix this and make it only try to stop the execution once

List of TODO's down the line/in other Pr's:

  • Add a system to deal with suspending/banning users
  • improve the prompting for the OpenRouter fallback if needed.
  • Fix multiple flagged blocks trying to cancel the same running graph.

Once we are happy with this setup we will need to setup Iffy on GCP.

Link to Iffy's github and also big shout out to @s3ththompson and the Iffy team for several updates to iffy during the setup of this which has helped so much. Id also love your feedback on this if you have any thoughts or suggestions.

Checklist πŸ“‹

This may be difficult to test locally for other people just because of how i have iffy setup, i am hosting a version of it on my server, i can give the api keys/url's and so on just contact me, tho you may need to use nginx? to make it so iffy can call back to your local machine for its webhook replies, For now ill leave the test plan empty because of this

For code changes:

  • [x] I have clearly listed my changes in the PR description
  • [ ] I have made a test plan
  • [ ] I have tested my changes according to the test plan:
    • [ ]

Bentlybro avatar Mar 12 '25 11:03 Bentlybro

Deploy Preview for auto-gpt-docs-dev ready!

Name Link
Latest commit a7cf7e352de79dd68773edefbc9c5adf989679b9
Latest deploy log https://app.netlify.com/projects/auto-gpt-docs-dev/deploys/686cd976bba08300082bdfae
Deploy Preview https://deploy-preview-9621--auto-gpt-docs-dev.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

netlify[bot] avatar Mar 12 '25 11:03 netlify[bot]

Here's the code health analysis summary for commits 5ff6d2c..a7cf7e3. View details on DeepSourceΒ β†—.

Analysis Summary

AnalyzerStatusSummaryLink
DeepSource JavaScript LogoJavaScriptβœ…Β SuccessView CheckΒ β†—
DeepSource Python LogoPythonβœ…Β Success
❗ 41 occurences introduced
🎯 12 occurences resolved
View CheckΒ β†—

πŸ’‘ If you’re a repository administrator, you can configure the quality gates from the settings.

deepsource-io[bot] avatar Mar 12 '25 11:03 deepsource-io[bot]

Deploy Preview for auto-gpt-docs ready!

Name Link
Latest commit a7cf7e352de79dd68773edefbc9c5adf989679b9
Latest deploy log https://app.netlify.com/projects/auto-gpt-docs/deploys/686cd9768c23c00008f49927
Deploy Preview https://deploy-preview-9621--auto-gpt-docs.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

netlify[bot] avatar Mar 12 '25 11:03 netlify[bot]

The PR appears to have several issues that need addressing: 1) The test plan section is completely empty when it should document testing steps for such a critical security feature 2) The PR template is only partially filled out - missing configuration checklist items which may be relevant given new environment variables being added 3) The PR acknowledges known issues with race conditions and incomplete moderation coverage that could lead to unsafe content getting through. While the core functionality seems solid, these gaps should be addressed before merging.

AutoGPT-Agent avatar Mar 12 '25 11:03 AutoGPT-Agent

The PR follows most guidelines but has some issues that need addressing: 1) The test plan is completely empty and marked as skipped rather than explaining why testing may be difficult, 2) While they state configuration changes are needed for Iffy/GCP, the configuration checklist section is missing entirely rather than being filled out to indicate needed changes to .env.example etc.

AutoGPT-Agent avatar Mar 12 '25 11:03 AutoGPT-Agent

The PR is well documented with clear changes and explanations, but has some issues to address: 1) The test plan is incomplete and they acknowledge testing will be difficult for others, 2) There are known execution timing issues that need to be resolved, 3) Missing runtime block checking capability. However, the code changes themselves properly handle user_id validation and have good security considerations built in with the fallback to OpenRouter when Iffy fails. The scope of changes is focused on content moderation features.

AutoGPT-Agent avatar Mar 12 '25 11:03 AutoGPT-Agent

The PR appears to be well documented with clear changes and explanations. The author has acknowledged some current limitations and areas for improvement. However, there are a few issues that need to be addressed: 1) The test plan is empty and marked as difficult to test locally 2) The author notes some timing issues with iffy vs executor that need to be resolved 3) The PR includes significant new functionality but does not have proper test coverage plans. While these are concerns, they appear to be actively acknowledged and discussed rather than ignored.

AutoGPT-Agent avatar Mar 12 '25 11:03 AutoGPT-Agent

While this PR shows good progress on implementing content moderation, there are a few issues that need to be addressed:

  1. The PR template is only partially filled out - missing test plan
  2. The changes include significant new code but don't clearly document how user_id is protected/validated in data routes
  3. The code has some recognized issues detailed in the description that should be fixed before merging (execution timing, checking blocks during execution, multiple stop attempts)

AutoGPT-Agent avatar Mar 12 '25 11:03 AutoGPT-Agent

The PR appears to be well documented and follows most guidelines, but has a few issues that need addressing: 1) It's missing a proper test plan in the checklist which is required, even if there are deployment complexities 2) The issue of blocks potentially being executed before moderation is complete seems like a significant security risk that should be addressed before merging.

AutoGPT-Agent avatar Mar 14 '25 09:03 AutoGPT-Agent

While the PR is well documented and follows most rules, there are a few issues: 1) The test plan section is explicitly left empty which goes against the requirement of having a complete checklist. 2) The user acknowledges testing challenges but should still document how they tested it themselves. 3) The title follows conventional commit format correctly with feat(iffy). 4) All changes to data/*.py files are properly handling user_id and access controls. 5) The PR description thoroughly explains the changes and their rationale.

AutoGPT-Agent avatar Mar 18 '25 10:03 AutoGPT-Agent

The PR appears to be well-documented and follows most guidelines, but has a few notable issues that should be addressed: 1) The test plan section is empty with acknowledgment it's hard to test, which while honest isn't ideal 2) The PR template is included but not properly filled out with configuration changes which appear to be needed given the iffy integration 3) Configuration changes seem needed but .env.example is not updated according to checklist

AutoGPT-Agent avatar Mar 18 '25 17:03 AutoGPT-Agent

While the PR has made a strong effort to document changes and follows many best practices, there are some key concerns: 1) The test plan section is empty with acknowledgement it's difficult to test, which is problematic for a security feature 2) There are several self-identified issues in the implementation itself noted in the PR description including race conditions and incomplete handling of flagged content 3) The change impacts security and content moderation but doesn't have a complete implementation for handling detected issues 4) The changes use both user_id and graph_exec_id correctly where needed

AutoGPT-Agent avatar Mar 18 '25 20:03 AutoGPT-Agent

While the PR has good documentation and clear changes, there are a few issues that need to be addressed: 1) The PR template checklist is incomplete with test plan missing and not marked as checked. 2) The multiple execution cancellations for flagged blocks is a known bug that should be fixed before merging. 3) The issue with executor being faster than iffy responses needs to be resolved for reliable content moderation. However, the core functionality seems well implemented with proper user_id handling and good documentation of the changes.

AutoGPT-Agent avatar Mar 18 '25 20:03 AutoGPT-Agent

This PR appears to be well-documented and follows the core rules, but has some issues that should be addressed. The scope is correctly labeled in the title as feat(iffy). The changes are clearly documented. The PR template is mostly filled out appropriately though missing test plan, but this is explained due to the complex setup required. The user_id is being properly passed and checked in the new code. However, there are some concerning aspects around race conditions and timing issues that the author has identified need addressing.

AutoGPT-Agent avatar Mar 20 '25 13:03 AutoGPT-Agent

Qodo Merge was enabled for this repository. To continue using it, please link your Git account with your Qodo account here.

PR Reviewer Guide πŸ”

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 πŸ”΅πŸ”΅πŸ”΅πŸ”΅βšͺ
πŸ§ͺΒ No relevant tests
πŸ”’Β Security concerns

Sensitive information exposure:
The code logs user information in multiple places (e.g., line 123 in iffy/service.py) which could potentially expose PII in logs. Additionally, the webhook verification in routes.py (lines 16-24) is critical - if the HMAC verification fails or the webhook secret is compromised, malicious actors could potentially send fake moderation results to manipulate the system.

⚑ Recommended focus areas for review

Synchronous Blocking

The implementation makes add_execution synchronous by calling run_and_wait on an async function, which could block the main thread while waiting for content moderation. Consider using a fully asynchronous approach.

) -> GraphExecutionEntry:
    """Add a graph execution to the queue"""
    return self.run_and_wait(
        self._add_execution_async(graph_id, data, user_id, graph_version, preset_id)
    )
Error Handling

The function falls back to OpenRouter moderation on failure but doesn't properly handle the case where both services fail. The error message is returned but execution might continue.

except Exception as e:
    logger.error(f"Error in IffyService: {str(e)}", exc_info=True)
    # Fall back to OpenRouter as a last resort
    try:
        is_safe, reason = await open_router_moderate_content(json.dumps(block_content.get('input_data', {}), indent=2))
        return is_safe, f"IffyService error, OpenRouter fallback: {reason}"
    except Exception as e2:
        logger.error(f"Both moderation services failed: {str(e2)}", exc_info=True)
        return False, f"All moderation services failed: {str(e)}, {str(e2)}" 
Resource Management

The Prisma connection is opened and closed for each user data request, which could be inefficient for high-volume operations. Consider reusing connections or implementing connection pooling.

    if not prisma.is_connected():
        await prisma.connect()

    user = await get_user_by_id(user_id)
    if user:
        user_data.update({
            "email": user.email or user_data["email"],
            "name": user.name,
            "username": user.username if hasattr(user, 'username') else user.name,
        })
except Exception as e:
    logger.warning(f"Failed to get user details for {user_id}: {str(e)}")
finally:
    # Ensure we disconnect from Prisma if we connected
    try:
        if prisma.is_connected():
            await prisma.disconnect()
    except Exception as e:
        logger.warning(f"Error disconnecting from Prisma: {str(e)}")

qodo-code-review[bot] avatar Mar 21 '25 11:03 qodo-code-review[bot]

The PR appears to be well-documented with a thorough explanation of the changes and their purpose. The author acknowledges there are some timing issues to resolve but has documented them clearly. Required paths in data/*.py are properly checking user_id. The scope is properly defined in the title as feat(iffy). The only issue is that the test plan is not filled out, though the author explains why (setup complexity) and offers to help others test.

AutoGPT-Agent avatar Mar 24 '25 14:03 AutoGPT-Agent

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

github-actions[bot] avatar Mar 28 '25 12:03 github-actions[bot]

Conflicts have been resolved! πŸŽ‰ A maintainer will review the pull request shortly.

github-actions[bot] avatar Mar 28 '25 16:03 github-actions[bot]

Before we merge this we will want to get Iffy hosted on GCP and fully setup, then add the envs to the backend, then we can merge this

Bentlybro avatar Apr 02 '25 14:04 Bentlybro

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

github-actions[bot] avatar Apr 11 '25 14:04 github-actions[bot]

Conflicts have been resolved! πŸŽ‰ A maintainer will review the pull request shortly.

github-actions[bot] avatar Jun 03 '25 17:06 github-actions[bot]

Thank you for this detailed PR implementing the Iffy moderation system! The code looks well-structured and your explanation of the functionality is clear and thorough.

What looks good

  • The implementation covers both the Iffy integration and a fallback to OpenRouter when needed
  • The webhook handling is well-structured
  • Security considerations are properly addressed
  • The code is well-documented with clear explanations

Issues to address

Required for approval:

  • The checklist in the PR description isn't complete. While you've explained the testing challenges, you should still provide a basic test plan for what should be tested, even if reviewers need to contact you for credentials.

Suggestions for improvement:

  • You've identified several issues in your PR description that could be addressed:

    1. The race condition where the executor runs before Iffy completes
    2. Only checking blocks at the start of execution
    3. Multiple cancellation attempts for the same execution

    At minimum, the third issue seems straightforward to fix and would improve the implementation.

  • Consider adding some logging or metrics to track moderation performance and results

Next steps

  1. Complete the checklist with a test plan
  2. Consider addressing at least the multiple cancellation issue before merging

Overall, this is a solid implementation that will be valuable for content moderation in the platform!

AutoGPT-Agent avatar Jun 03 '25 17:06 AutoGPT-Agent

Thank you for implementing Iffy moderation for the backend! The code looks well-structured with a good approach of having both a primary moderation system and a fallback using OpenRouter.

Issues to Address

Incomplete Checklist:

  • The PR checklist is incomplete. Only one item is checked off.
  • Even though you've explained the testing challenges, you still need to complete the checklist, especially the test plan section.
  • Please document the specific tests you conducted to verify your implementation works correctly.

Test Plan Requirements:

  • Since this may be difficult for others to test locally due to your Iffy setup, please provide:
    • The specific steps you took to test this functionality
    • What scenarios were verified (e.g., content flagging, webhook processing)
    • Any specific observations or results from your testing

Positive Aspects

  • Good implementation of the HMAC validation for webhooks
  • Clear documentation of the future TODOs
  • Well-thought-out fallback system with OpenRouter
  • Comprehensive error handling throughout the code

Once you've completed the checklist with a proper test plan, this PR will be in better shape for approval. If you need assistance setting up a testable environment for reviewers, please let us know.

AutoGPT-Agent avatar Jun 04 '25 10:06 AutoGPT-Agent

Thank you for implementing this important content moderation feature! The code implementation looks solid with good error handling, logging, and a fallback mechanism to OpenRouter.

Before this can be merged, please address these items:

PR Format Issues

  1. Title Format: Your PR title uses feat(backend/iffy) but our conventional commit format requires using one of the approved scopes. Please update to feat(backend): Add initial Iffy moderation.

  2. Incomplete Checklist: The checklist in your PR description is incomplete. Either:

    • Complete the test plan and check all the boxes, or
    • Remove the test plan section entirely if it's not applicable for this PR

Technical Feedback

  • The implementation of the HMACValidator for webhook security is well-designed
  • Good job handling the fallback to OpenRouter when Iffy isn't available
  • Your clearly identified future improvements shows good forward thinking

Suggestions

  • Consider adding a more detailed guide for how other developers can test this functionality locally
  • For the issue where executor is faster than Iffy, perhaps implementing a configurable delay or queue mechanism might help
  • For multiple blocks being flagged causing multiple cancellation attempts, consider adding a cache or status check before attempting to cancel

Once you've addressed the PR format issues, this will be ready for another review. Great work on this important security feature!

AutoGPT-Agent avatar Jun 11 '25 13:06 AutoGPT-Agent

Thank you for this comprehensive PR adding Iffy moderation to the AutoGPT platform! The implementation looks thorough with both the main Iffy integration and an OpenRouter fallback.

What looks good

  • The PR description is detailed and explains each file change well
  • The implementation includes proper error handling and fallback mechanisms
  • The code is well-structured with appropriate separation of concerns
  • I like the approach of having a secondary moderation method through OpenRouter if Iffy fails

Changes needed before merging

Test Plan Required:

  • Your PR needs a complete test plan in the checklist section
  • Even if it's difficult for others to test locally, please provide a structured test plan of how you've verified the functionality works
  • This could include steps like:
    • Verifying content moderation with known-good inputs
    • Verifying content moderation with known-bad inputs
    • Testing the fallback to OpenRouter when Iffy is unavailable
    • Testing the webhook handler with different event types

Additional suggestions:

  • You mentioned in the PR description that there are issues with the executor sometimes being faster than Iffy. Consider documenting this more explicitly as a known limitation in code comments
  • The issue of multiple flagged blocks trying to cancel the same execution could be addressed with a simple check if the execution is already cancelled before attempting to cancel again

Once you add a proper test plan to your PR, this should be ready for review again. The implementation itself looks solid and well-thought-out.

AutoGPT-Agent avatar Jun 11 '25 13:06 AutoGPT-Agent

Thank you for this comprehensive PR implementing Iffy moderation! Your code looks well-structured with good error handling and fallback mechanisms to OpenRouter.

Your PR description does an excellent job explaining the changes and implementation challenges. I appreciate the detailed explanation of how the code works across multiple files and the clear TODO list for future work.

However, I cannot approve this PR yet because:

  1. The checklist in the PR description is incomplete. You've only marked that you've clearly listed your changes, but left the test plan empty. Even though you've explained that testing would be difficult for others due to your specific Iffy setup, our process requires a complete checklist. Please either:
    • Complete the test plan with steps you've used to verify your changes work
    • OR explicitly note that the test plan section is not applicable for this PR

Some suggestions to improve the implementation:

  1. The race condition you mentioned where the executor runs before Iffy has time to process content is concerning. Consider adding more detailed comments about this in the code and potential future solutions.

  2. For the issue with multiple flagged blocks trying to cancel the same execution multiple times, you could add a simple check to prevent duplicate cancellations.

  3. Consider adding more detailed logging throughout the moderation flow to help with debugging.

Please update the PR description with a complete checklist, and we can proceed with the review.

AutoGPT-Agent avatar Jun 16 '25 14:06 AutoGPT-Agent

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

github-actions[bot] avatar Jun 17 '25 09:06 github-actions[bot]

Conflicts have been resolved! πŸŽ‰ A maintainer will review the pull request shortly.

github-actions[bot] avatar Jul 08 '25 08:07 github-actions[bot]

Thank you for your detailed PR adding Iffy moderation capabilities to the backend. The implementation looks well thought out with proper fallback to OpenRouter when Iffy is unavailable.

Issues to address before merging:

  1. Incomplete checklist: While you've explained why local testing is difficult, our process requires a completed checklist. Even if others can't easily test locally, you should:

    • Provide a test plan that you used to verify your changes
    • Mark that you've tested according to that plan
    • If testing requires special setup, document those requirements clearly in the test plan
  2. Testing documentation: Since this is a critical security feature (content moderation), please provide more details on:

    • How you verified the moderation system works correctly
    • What types of content you tested with
    • How you confirmed the webhook handling works as expected
  3. Multiple flags issue: You mentioned that if multiple blocks are flagged, it will try to stop execution multiple times. This should be fixed before merging, as it could lead to unexpected behavior.

  4. Timing issue: You also mentioned that sometimes the executor is faster than Iffy, causing execution to run before Iffy has time to process. This seems like a critical issue that should be addressed.

The changes themselves look solid, but I recommend completing the checklist with a detailed test plan that you've followed to verify the functionality, even if others can't easily replicate it locally. This helps document how the feature was tested and provides a reference for future changes.

AutoGPT-Agent avatar Jul 08 '25 08:07 AutoGPT-Agent

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

github-actions[bot] avatar Jul 25 '25 13:07 github-actions[bot]