FastChat
Add CSAM and NSFW image moderation and fix Reka logging
Why are these changes needed?
- Add CSAM and NSFW image moderation filters. See the README for how to run them. Notably, the NSFW endpoint must now be the full endpoint URL, including the path, not just the domain.
- Change Reka API logging so that base64 images are no longer logged; only the text messages are kept.
Note:
- I think it could be a good idea to combine this filter with `moderation_filter`, which includes the text one. I wanted some hierarchy: if the text fails moderation, we short-circuit and skip the image moderation filters entirely. Similarly, if an image fails the NSFW check, we don't need to check CSAM. This saves API calls. Maybe we could also cache each image's results so we don't keep calling the endpoint in multi-turn conversations?
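
A rough sketch of the hierarchy and caching idea from the note, not what this PR implements. All names here (`CachedImageModerator`, `moderate`, and the checker callables) are hypothetical; each checker is assumed to return `True` when the content is flagged, and in practice would wrap the real text, NSFW, and CSAM endpoints.

```python
import hashlib
from typing import Callable, Dict, List

# Hypothetical checker types: each returns True when the content is flagged.
TextCheck = Callable[[str], bool]
ImageCheck = Callable[[bytes], bool]


class CachedImageModerator:
    """Runs NSFW then CSAM checks on an image and caches the verdict by content hash."""

    def __init__(self, nsfw_check: ImageCheck, csam_check: ImageCheck) -> None:
        self.nsfw_check = nsfw_check
        self.csam_check = csam_check
        self._cache: Dict[str, bool] = {}

    def is_flagged(self, image: bytes) -> bool:
        # Key on the image bytes so a repeated image in a later turn hits the cache.
        key = hashlib.sha256(image).hexdigest()
        if key not in self._cache:
            # `or` short-circuits: the CSAM check is skipped once NSFW flags the image.
            self._cache[key] = self.nsfw_check(image) or self.csam_check(image)
        return self._cache[key]


def moderate(
    text: str,
    images: List[bytes],
    text_check: TextCheck,
    image_moderator: CachedImageModerator,
) -> bool:
    """Return True if the turn is flagged; text is checked first to save image calls."""
    if text_check(text):
        return True
    # `any` stops at the first flagged image.
    return any(image_moderator.is_flagged(img) for img in images)
```

The cache here is an unbounded in-memory dict for simplicity; a real deployment would likely want a size limit or a per-conversation scope.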