LLM Alchemy Chamber 🧙‍♂️✨
Welcome to a friendly neighborhood repository of experiments and adventures in the world of LLMs. This is no ordinary repository: it's an alchemical blend of scripts, notebooks, and experiments dedicated to the mystical realm of Large Language Models (LLMs).
Alchemical Scripts

| Projects | GitHub Link | Colab Link | Blog Link | Description |
| --- | --- | --- | --- | --- |
| YouTube Cloner | Folder | Fireship GPT | Blog coming soon | An attempt at cloning YouTubers by fine-tuning LLMs |

| Data Prep | GitHub Link | Colab Link | Description |
| --- | --- | --- | --- |
| Documents -> Dataset | GitHub | Colab | Given documents, generate an instruction/QA dataset for fine-tuning LLMs |
| Topic -> Dataset | GitHub | Colab | Given a topic, generate a dataset to fine-tune LLMs |
| Alpaca Dataset Generation | GitHub | Colab | The original instruction-dataset generation procedure followed in the Alpaca paper |
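
To give a feel for what the "Documents -> Dataset" row produces, here is a minimal sketch. It is not the notebook's actual code. It assumes documents are split into fixed-size word chunks and that each chunk is turned into one record with the `instruction` / `input` / `output` fields used by the Alpaca dataset. The names `chunk_text`, `build_records`, `to_jsonl`, and `fake_generate_qa` are hypothetical, and `generate_qa` is a placeholder for whatever LLM call you plug in.

```python
import json
from typing import Callable, Dict, Iterable, List


def chunk_text(text: str, max_words: int = 200) -> List[str]:
    """Split a document into chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_records(
    documents: Iterable[str],
    generate_qa: Callable[[str], Dict[str, str]],
    max_words: int = 200,
) -> List[Dict[str, str]]:
    """Turn documents into Alpaca-style instruction records.

    `generate_qa` is a hypothetical callable (normally backed by an LLM)
    that maps a text chunk to {"question": ..., "answer": ...}.
    """
    records = []
    for doc in documents:
        for chunk in chunk_text(doc, max_words):
            qa = generate_qa(chunk)
            records.append(
                {"instruction": qa["question"], "input": chunk, "output": qa["answer"]}
            )
    return records


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def fake_generate_qa(chunk: str) -> Dict[str, str]:
    """Stand-in for an LLM call so the sketch runs offline (assumption)."""
    return {
        "question": "What is the first word of the passage?",
        "answer": chunk.split()[0],
    }


if __name__ == "__main__":
    docs = ["alpha beta gamma delta epsilon"]
    print(to_jsonl(build_records(docs, fake_generate_qa, max_words=2)))
```

Swapping `fake_generate_qa` for a real model call is the only change needed to produce an actual fine-tuning file. The fixed-size chunking is just the simplest choice for illustration.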
Repo Structure
├── DataPrep (Notebook to generate synthetic data)
│   ├── dataset_prep.ipynb
│   └── ...
├── Deployment (TGI/VLLM scripts for testing)
│   └── ...
├── Finetuning (Finalized Finetuning Scripts)
│   ├── Gemma_finetuning_notebook.ipynb
│   ├── Llama2_finetuning_notebook.ipynb
│   ├── Mistral_finetuning_notebook.ipynb
│   ├── Mixtral_finetuning_notebook.ipynb
│   └── ...
├── LLMS (LLM experiments)
│   ├── ambari
│   │   └── ...
│   ├── CodeLLama
│   │   └── ...
│   ├── Gemma
│   │   ├── finetune-gemma.ipynb
│   │   └── gemma-sft.py
│   ├── Llama2
│   │   └── ...
│   ├── Mistral-7b
│   │   └── ...
│   └── Mixtral
│       └── ...
├── Projects (Upcoming ideas to explore)
│   └── YT_Clones
│       ├── Fireship_clone.ipynb
│       ├── youtube_channel_scraper.py
│       └── ...
├── Quantization
│   └── ...
├── utils
│   └── streaming_inference_hf.ipynb
└── RAG (Retrieval Augmented Generation)
    ├── 1_Naive_RAG.ipynb
    ├── 2_Semantic_Chunking_RAG.ipynb
    ├── 3_Sentence_Window_Retrieval_RAG.ipynb
    ├── 4_Auto_Merging_Retrieval_RAG.ipynb
    ├── 5_Agentic_RAG.ipynb
    └── 6_Visual_RAG.ipynb