llm_engineering fix(week6): Add comprehensive moderation filtering to resolve fine-tuning eval errors

Summary

Fixes a critical issue in which OpenAI fine-tuning jobs fail during the post-training safety evaluation (refusals_v3) because some product descriptions contain sensitive content.

Problem

When running the Week 6 Day 5 fine-tuning exercise, users encounter this error:

Error while running moderation eval refusals_v3 for snapshot ft:gpt-4o-mini-2024-07-18:personal:pricer:CQxxxx
Error while running eval for category hate/threatening

Root Cause: The Amazon product dataset contains items with sensitive keywords (weapon, knife, tactical, combat, etc.) that trigger OpenAI's post-training safety checks.

Solution

1. Updated Notebook (`day5.ipynb`)

Added a `check_moderation()` function that:

Implements two-stage filtering (keyword pre-filter + OpenAI Moderation API)
Provides detailed reporting of flagged items
Returns clean items ready for fine-tuning
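
For reviewers who want the shape of the two-stage filter without opening the notebook, here is a minimal sketch. It is not the code from `day5.ipynb`: the keyword list holds only the four terms named above, the signature of `check_moderation()` and the helper names `keyword_hit` and `openai_flagged` are assumptions made for illustration, and the moderation call is passed in as a function so the sketch can be exercised without an API key.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Assumption: only the four keywords named in this PR; the real script lists 25+.
SENSITIVE_KEYWORDS = ("weapon", "knife", "tactical", "combat")


def keyword_hit(text: str) -> Optional[str]:
    """Stage 1: return the first sensitive keyword found in the text, else None."""
    lowered = text.lower()
    for keyword in SENSITIVE_KEYWORDS:
        if keyword in lowered:  # substring match, so "weapons" also hits
            return keyword
    return None


def openai_flagged(text: str) -> bool:
    """Stage 2: ask the OpenAI Moderation API whether the text is flagged."""
    from openai import OpenAI  # imported lazily; needs OPENAI_API_KEY to be set

    client = OpenAI()
    response = client.moderations.create(model="omni-moderation-latest", input=text)
    return response.results[0].flagged


def check_moderation(
    texts: Iterable[str],
    is_flagged: Callable[[str], bool] = openai_flagged,
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split texts into clean items and (text, reason) pairs for flagged items."""
    clean: List[str] = []
    flagged: List[Tuple[str, str]] = []
    for text in texts:
        keyword = keyword_hit(text)
        if keyword is not None:
            # Caught by the free local pre-filter; no API call is made.
            flagged.append((text, f"keyword: {keyword}"))
        elif is_flagged(text):
            flagged.append((text, "moderation API"))
        else:
            clean.append(text)
    return clean, flagged
```

Running the keyword stage first means only items that pass it cost a Moderation API request. Substring matching is deliberately broad and will also drop harmless items (for example, a kitchen knife set), which seems an acceptable trade-off for a small training set.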

2. Standalone Scripts

fix_moderation.py: Batch filtering script with 25+ sensitive keywords
test_moderation.py: JSONL verification utility
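
As a rough illustration of what a JSONL verification pass can look like, the sketch below checks that every line parses as JSON and carries the `messages` list that the OpenAI chat fine-tuning format expects. The function name `verify_jsonl` and the specific checks are assumptions; `test_moderation.py` itself may verify different things.

```python
import json
from typing import List


def verify_jsonl(path: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the file looks OK."""
    problems: List[str] = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                problems.append(f"line {line_no}: blank line")
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {line_no}: invalid JSON ({exc.msg})")
                continue
            # Chat fine-tuning examples are objects with a non-empty "messages" list.
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list) or not messages:
                problems.append(f"line {line_no}: missing or empty 'messages' list")
                continue
            for i, msg in enumerate(messages):
                if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                    problems.append(f"line {line_no}: message {i} lacks 'role' or 'content'")
    return problems
```

Calling `verify_jsonl` on the training file before uploading it (the path is whatever the notebook wrote) and printing any returned problems gives a quick sanity check after filtering.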

3. Documentation

DAY5_MODERATION_FIX_README.md: PR-focused documentation
MODERATION_FIX_README.md: Technical deep-dive

Results

Before Fix

Training: 200 examples
Validation: 50 examples
Status: ❌ Failed during post-training moderation

After Fix

Training: 190 examples (10 filtered)
Validation: 48 examples (2 filtered)
Status: ✅ Successfully completed and deployed

Testing

Verified with successful fine-tuning job:

Job ID: ftjob-moQGns3ajsS5UWIxxxxx
Model: ft:gpt-4o-mini-2024-07-18:personal:pricer:CQUNxxxx
Confirmed via OpenAI completion email

Benefits

Prevents users from wasting time/money on failed fine-tuning jobs
Demonstrates best practices for OpenAI safety evaluations
Provides reusable tools for content filtering
No breaking changes to original notebook structure

Files Changed

✏️ week6/day5.ipynb - Added moderation function
✨ week6/fix_moderation.py - New filtering script
✨ week6/test_moderation.py - New verification utility
📚 week6/DAY5_MODERATION_FIX_README.md - New documentation
📚 week6/MODERATION_FIX_README.md - New technical docs

Compatibility

✅ Python 3.8+
✅ OpenAI Python SDK v1.0+
✅ No breaking changes
✅ Works with W&B integration

Community Contribution - Week 6 Day 5 Moderation Fix by @bilallamal07

Oct 14 '25 14:10 bilallamal07

Oh gosh - would you be OK to move this to the community-contributions folder? I'm grateful to have this change, and I will make this update to the main repo at some point, but in the meantime it's best not to affect the main repo where possible.

Oct 14 '25 15:10 ed-donner

Hi Ed, thanks for pointing this out. Submitting the PR outside the community contribution process was unintentional. I'll make sure all future updates follow the community contribution guidelines.

I appreciate your support and guidance! Best regards,

Oct 15 '25 08:10 bilallamal07
