🎬 ai-clips-maker

Created by Alperen Sümeroğlu — An AI-native video engine that turns long-form content into short, viral-ready clips with surgical precision.

ai-clips-maker is a smart, modular Python tool built for creators, educators, and developers. It transcribes speech, detects speakers, analyzes scenes, and crops around the key moments — creating ready-to-share vertical clips for TikTok, Reels, and Shorts with zero manual editing.

📚 Contents

📦 Features
🛠 Installation
🚀 Quickstart
🔍 How It Works
⚙️ Tech Stack
🎯 Use Cases
🧪 Tests
🗺 Roadmap
🤝 Contribute
👤 Author
🎧 Weekly Rewind Podcast
📄 License

📦 Features

🎞️ Auto-segment videos based on speech & scene shifts
🧠 Word-level transcription using WhisperX
🗣️ Speaker diarization (who spoke when) via Pyannote
🪄 Face/body-aware cropping focused on active speaker
📐 Output formats: 9:16 (vertical), 1:1 (square), 16:9 (wide)
🔌 Modular and easily extensible pipeline
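
To make the cropping and output-format features more concrete, here is a minimal sketch of how a crop window for a target aspect ratio could be computed around a speaker's horizontal position. It is an illustration only, not the package's actual implementation: the function name `crop_window` and its parameters are hypothetical.

```python
def crop_window(frame_w, frame_h, aspect, center_x):
    """Return (x, y, w, h) for the largest crop of the given aspect ratio.

    Hypothetical helper, not part of ai-clips-maker. The crop is centered
    horizontally on center_x (e.g. a detected face position), centered
    vertically in the frame, and clamped so it never leaves the frame.
    """
    ratio_w, ratio_h = aspect

    # Start from the full frame height and derive the matching width.
    crop_h = frame_h
    crop_w = crop_h * ratio_w // ratio_h

    # If that is wider than the frame, limit by width instead.
    if crop_w > frame_w:
        crop_w = frame_w
        crop_h = crop_w * ratio_h // ratio_w

    # Center on the speaker, then clamp to the frame borders.
    x = int(center_x - crop_w / 2)
    x = max(0, min(x, frame_w - crop_w))
    y = (frame_h - crop_h) // 2
    return x, y, crop_w, crop_h


# A 1920x1080 source cropped to vertical 9:16 around a speaker at x=960.
print(crop_window(1920, 1080, (9, 16), 960))
```

A speaker standing near the right edge simply pushes the window against the border instead of producing an out-of-frame crop.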

🛠 Installation

# Install main package
pip install ai-clips-maker

# Install WhisperX from source
pip install git+https://github.com/m-bain/whisperx.git

# Install dependencies
# macOS
brew install libmagic ffmpeg

# Ubuntu/Debian
sudo apt install libmagic1 ffmpeg

🚀 Quickstart

from ai_clips_maker import Transcriber, ClipFinder, resize

# Step 1: Transcription
transcriber = Transcriber()
transcription = transcriber.transcribe(audio_file_path="/path/to/video.mp4")

# Step 2: Clip detection
clip_finder = ClipFinder()
clips = clip_finder.find_clips(transcription=transcription)
print(clips[0].start_time, clips[0].end_time)

# Step 3: Cropping & resizing
crops = resize(
    video_file_path="/path/to/video.mp4",
    pyannote_auth_token="your_huggingface_token",
    aspect_ratio=(9, 16)
)
print(crops.segments)
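
The Quickstart stops at printing clip boundaries. As a rough sketch of one way to turn a start/end pair into an actual file, the helper below builds an ffmpeg command line. It assumes the times are in seconds, which the Quickstart does not state, and `ffmpeg_cut_command` is a hypothetical name, not a function from ai-clips-maker.

```python
def ffmpeg_cut_command(source, start_time, end_time, output):
    """Build an ffmpeg argument list that cuts [start_time, end_time].

    Hypothetical helper. Times are assumed to be seconds. Stream copy
    ("-c copy") avoids re-encoding, so cuts snap to nearby keyframes.
    """
    if end_time <= start_time:
        raise ValueError("end_time must be greater than start_time")
    duration = end_time - start_time
    return [
        "ffmpeg", "-y",               # overwrite the output if it exists
        "-ss", f"{start_time:.3f}",   # seek to the clip start (input option)
        "-i", source,
        "-t", f"{duration:.3f}",      # keep this many seconds
        "-c", "copy",                 # copy streams without re-encoding
        output,
    ]


cmd = ffmpeg_cut_command("/path/to/video.mp4", 12.5, 42.0, "clip_000.mp4")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)`. If frame-accurate cuts matter more than speed, dropping `-c copy` makes ffmpeg re-encode the segment.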

🔍 How It Works

🎧 Extracts audio from video
✍️ Transcribes speech using WhisperX
🧍 Identifies speakers with Pyannote
🎬 Detects scene changes & speaker shifts
🎯 Crops video around active speaker’s position
📤 Exports clips in desired format
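
The transcription, diarization, and speaker-shift steps above can be sketched as follows. This is one possible approach under stated assumptions, not necessarily what WhisperX, Pyannote, or ai-clips-maker do internally: words and speaker turns are represented as plain tuples, and the names `assign_speakers` and `speaker_shifts` are hypothetical.

```python
def assign_speakers(words, turns):
    """Label each word with the speaker whose turn overlaps it most.

    words: list of (text, start, end) tuples, times in seconds.
    turns: list of (speaker, start, end) tuples from diarization.
    Both formats are assumptions made for this sketch.
    """
    labeled = []
    for text, w_start, w_end in words:
        best_speaker, best_overlap = None, 0.0
        for speaker, t_start, t_end in turns:
            # Length of the intersection of the two intervals (<= 0 if disjoint).
            overlap = min(w_end, t_end) - max(w_start, t_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labeled.append((text, w_start, w_end, best_speaker))
    return labeled


def speaker_shifts(labeled):
    """Return the start times at which the active speaker changes."""
    shifts = []
    previous = None
    for _text, start, _end, speaker in labeled:
        if speaker is None:
            continue  # word fell outside every diarized turn
        if previous is not None and speaker != previous:
            shifts.append(start)
        previous = speaker
    return shifts


words = [("hi", 0.0, 0.4), ("there", 0.5, 0.9), ("yes", 1.1, 1.4)]
turns = [("A", 0.0, 1.0), ("B", 1.0, 2.0)]
labeled = assign_speakers(words, turns)
print(labeled)
print(speaker_shifts(labeled))
```

Shift timestamps like these, combined with scene boundaries, are natural candidates for clip cut points and for moving the crop window to a new speaker.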

⚙️ Tech Stack

🔧 Module	🧠 Technology	💡 Purpose
Transcription	WhisperX	Word-level speech-to-text with timestamps
Diarization	Pyannote.audio	Speaker segmentation (who spoke when)
Video Processing	OpenCV, PyAV	Frame-by-frame video control
Scene Detection	Scenedetect	Detects shot boundaries
ML Inference	PyTorch	Powering WhisperX & Pyannote models
Data Handling	NumPy, Pandas	Transcription & clip structuring
Media Utilities	ffmpeg, libmagic	Media decoding + type detection
Testing Framework	pytest	End-to-end and unit testing support

All tools were selected for speed, flexibility, and production-grade stability.

🎯 Use Cases

🎙 Podcasters clipping episodes into shareable highlights
📚 Teachers summarizing lecture content
📱 Social media teams repurposing YouTube for Reels
🧠 Developers automating video workflows
🚀 Startups building AI-based content tools

🧪 Tests

# Run test suite
pytest tests/

The suite covers every component: the transcriber, diarizer, clip detector, and resizer.

🗺 Roadmap

Status	Feature	Note
✅	Core pipeline: Transcribe → Diarize → Detect	Implemented in v1.0
✅	Speaker-aware video cropping	Production ready
🚧	Multi-language subtitle generation	Planned for Q2 2025
📌	Auto-caption overlay	In design phase
🧪	Web UI (upload + preview clips)	Prototype in progress
🧠	HuggingFace or Streamlit live demo	On backlog

🤝 Contribute

We welcome pull requests, ideas, and feedback.

# Clone the repo (fork it on GitHub first and use your fork's URL if you plan to open a pull request)
git clone https://github.com/alperensumeroglu/ai-clips-maker.git
cd ai-clips-maker

# Create feature branch
git checkout -b feat/your-feature

# Make changes, commit, and push
git commit -am "Add feature"
git push origin feat/your-feature

Before contributing, please review the open issues and the coding style guide.

👤 Author

Alperen Sümeroğlu
Computer Engineer • Entrepreneur • World Explorer 🌍
15+ European countries explored ✈️

🔗 LinkedIn
🧠 LeetCode
🚀 Daily.dev

“Let your code tell your story — clean, powerful, and useful.”

🎧 Weekly Rewind Podcast

🎤 Weekly insights on AI, tech, and building globally — by Alperen Sümeroğlu.

🚀 What does it take to grow as a Computer Engineering student, build projects, and explore global innovation?

This project is part of a bigger journey I share in Weekly Rewind, my real-time documentary podcast series, where I reflect weekly on coding breakthroughs, innovation insights, startup stories, and lessons from around the world.

💡 What is Weekly Rewind?

A behind-the-scenes look at real-world experiences, global insights, and hands-on learning. Each episode includes:

🔹 Inside My Coding & Engineering Projects
🔹 Startup Ideas & Entrepreneurial Lessons
🔹 Trends in Tech & AI
🔹 Innovation from 15+ Countries
🔹 Guest Conversations with Builders & Engineers
🔹 Productivity, Learning & Growth Strategies

🎧 Listen now:

“True learning isn’t in tutorials — it’s in building, exploring, and reflecting.”
