SmolHub Website: Data-driven personal research portfolio
🤖 Yuvraj Singh’s AI Portfolio
🎓 Computer Science Engineering Student at IIIT Bhubaneswar (2023-2027)
🔬 Research Focus: NLP, Computer Vision, and Multimodal LLMs
🚀 Mission: Building AI from scratch and bridging research with practical implementation
🌟 What This Portfolio Showcases
This comprehensive portfolio website demonstrates my journey in AI/ML through hands-on implementations, research, and practical applications. Every model, dataset, and experiment represents hours of learning, coding, and understanding the fundamentals of artificial intelligence.
🧠 Core Philosophy
- From Scratch Implementation: Building AI models without relying on pre-built model implementations, to truly understand the mathematics and algorithms
- Paper-to-Code: Translating cutting-edge research papers into working implementations
- Educational Focus: Creating resources that help others learn AI/ML concepts
- Open Source: All implementations are freely available for the community
🎯 Key Features
🔬 From Scratch AI Models (38+ Implementations)
Location: /models/
Source: Paper-Replications Repository
A comprehensive collection of AI models implemented from scratch, covering:
🤖 Large Language Models & NLP
- Transformer Architecture - The foundation of modern NLP
- BERT - Bidirectional Encoder Representations from Transformers
- GPT - Generative Pre-trained Transformer
- Llama - Meta’s efficient language model architecture
- Llama4 - Advanced multi-expert architecture with MoE
- Kimi-K2 - DeepSeek V3 inspired model with advanced features
- Gemma & Gemma3 - Google’s efficient language models
- DeepSeekV3 - Mixture-of-Experts language model with Multi-head Latent Attention
- Differential Transformer - Novel attention mechanisms
- Attention Mechanisms - Core building blocks of modern AI
- Encoder-Decoder - Seq2seq architectures
- Seq2Seq - Sequence-to-sequence learning
- Fine Tuning using PEFT - Parameter-efficient fine-tuning
- LoRA - Low-Rank Adaptation techniques
- DPO & ORPO - Advanced alignment techniques
- SimplePO - Simplified preference optimization
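Most of the architectures in this list build on scaled dot-product attention, softmax(QKᵀ / √d_k) V. The block below is a minimal, dependency-free Python sketch of that formula for readers who want to see it in code. It is an illustration only, not the code from the Paper-Replications repository; the function names and the `causal` flag are chosen here for the example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values, causal=False):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    With causal=True, position i may only attend to positions j <= i,
    as in decoder-only (GPT-style) models.
    """
    d_k = len(keys[0])
    scale = math.sqrt(d_k)
    output = []
    for i, q in enumerate(queries):
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        if causal:
            # Future positions get -inf, so their softmax weight is 0.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output column at a time.
        output.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(len(values[0]))]
        )
    return output
```

For example, `scaled_dot_product_attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])` weights both values equally, because a zero query scores every key the same.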
🎨 Computer Vision & Generative Models
- Vision Transformer (ViT) - Transformers for image classification
- CLIP & CLiP - Vision-language understanding
- CLAP - Contrastive Language-Audio Pre-training
- SigLip - Sigmoid loss for language-image pre-training
- LLaVA - Large Language and Vision Assistant
- PaliGemma - Vision-language model
- Generative Adversarial Networks:
  - DCGANs - Deep Convolutional GANs
  - CGANs - Conditional GANs
  - CycleGANs - Unpaired image-to-image translation
  - WGANs - Wasserstein GANs
  - Pix2Pix - Paired image-to-image translation
- VAE - Variational Autoencoders
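To make the SigLip entry above more concrete, here is a small Python sketch of a pairwise sigmoid loss for image-text pairs: every (image, text) combination is treated as an independent binary classification, with label +1 for matching pairs and -1 otherwise. It is a hedged illustration, not the repository's implementation; the function names and the default `temperature` and `bias` values are assumptions made for this example, and the embeddings are assumed to be L2-normalized already.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def pairwise_sigmoid_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """Average pairwise sigmoid loss over a batch of paired embeddings.

    image_embs[i] and text_embs[i] are assumed to describe the same sample.
    """
    n = len(image_embs)
    total = 0.0
    for i, img in enumerate(image_embs):
        for j, txt in enumerate(text_embs):
            similarity = sum(a * b for a, b in zip(img, txt))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0  # +1 on the diagonal, -1 elsewhere
            total -= log_sigmoid(label * logit)
    return total / n
```

Unlike a softmax-based contrastive loss, each pair contributes on its own, so no normalization across the whole batch is required.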
🧮 Fundamental Architectures
- RNNs - Recurrent Neural Networks
- LSTM - Long Short-Term Memory networks
- GRU - Gated Recurrent Units
- Mixtral - Mixture of Experts architecture
🎵 Audio & Speech
- Whisper - Speech recognition and transcription
- TTS - Text-to-speech synthesis
- Moonshine - Lightweight speech recognition models
⚡ Optimization & Training
- DDP - Distributed Data Parallel training
🎮 SmolHub Playground
Location: /smolhub/
Purpose: Experimental AI playground
An interactive space for:
- Proof-of-Concept Models - Testing new ideas quickly
- Educational Experiments - Learning-focused implementations
- Rapid Prototyping - Fast iteration on AI concepts
- Community Contributions - Collaborative learning space
📊 Curated Datasets
Location: /datasets/
Count: 5+ High-quality datasets
Featured Datasets:
- ImageNet-Mini - 10K images, 100 classes for computer vision
- SentimentFlow - Advanced sentiment analysis dataset
- CodeSense - Programming language understanding
- Anomaly Hunter - Anomaly detection scenarios
- Multilingual QA - Cross-lingual question answering
Each dataset includes:
- ✅ Preprocessing Scripts - Ready-to-use data pipelines
- ✅ Documentation - Comprehensive usage guides
- ✅ Benchmarks - Performance baselines
- ✅ Visualization Tools - Data exploration utilities
🛠️ Technical Implementation
Backend Architecture
- Framework: Jekyll with GitHub Pages compatibility
- Dynamic Updates: Automated model/dataset synchronization
- Performance: Optimized for fast loading and mobile responsiveness
- SEO: Structured data and meta optimization
AI Model Integration
- Automatic Discovery: GitHub API integration for real-time updates
- Code Highlighting: Syntax highlighting for multiple programming languages
- Documentation: Auto-generated documentation from README files
- Version Control: Git-based versioning for all implementations
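As a rough illustration of the "Automatic Discovery" step, the Python sketch below lists the top-level directories of a repository through the GitHub REST API contents endpoint and turns them into model entries. This is not the site's actual sync code: the assumption that each model lives in its own top-level directory, the function names, and the `owner`/`repo` arguments are all hypothetical and used only for the example.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def parse_model_dirs(contents):
    """Keep visible top-level directories from a contents-API listing.

    `contents` is the decoded JSON list returned by
    GET /repos/{owner}/{repo}/contents, where each item carries
    "name", "type" ("file" or "dir") and "html_url" fields.
    """
    dirs = [
        {"name": item["name"], "url": item["html_url"]}
        for item in contents
        if item["type"] == "dir" and not item["name"].startswith(".")
    ]
    return sorted(dirs, key=lambda entry: entry["name"].lower())


def fetch_model_dirs(owner, repo):
    """Fetch the repository root listing and parse it (needs network access)."""
    request = urllib.request.Request(
        f"{API_ROOT}/repos/{owner}/{repo}/contents",
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_model_dirs(json.load(response))
```

Keeping the parsing separate from the HTTP call means the listing logic can be checked offline with a saved API response, and unauthenticated requests stay subject to GitHub's rate limits.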
Deployment
- Platform: Render.com with automatic deployments
- CI/CD: GitHub Actions for continuous integration
- ZeroGPU: Hugging Face ZeroGPU support for hosting my models as Spaces
📈 Impact & Learning Outcomes
Educational Value
- 📚 Deep Understanding: Every model built from mathematical foundations
- 🔍 Research Translation: Converting papers into working code
- 🎯 Practical Application: Real-world implementation experience
- 🤝 Community Learning: Open-source contributions for others
Technical Skills Demonstrated
- Programming: Python, PyTorch, TensorFlow, JavaScript, Ruby
- Mathematics: Linear Algebra, Calculus, Statistics, Probability
- Algorithms: Deep Learning, Machine Learning, Computer Vision, NLP
- Software Engineering: Version Control, Testing, Documentation, Deployment
Research Engagement
- 📑 Paper Implementation: 38+ research papers translated to code
- 🔬 Experimental Design: Systematic approach to model development
- 📊 Performance Analysis: Benchmarking and optimization
- 📝 Documentation: Comprehensive technical writing
🚀 Getting Started
Explore the Portfolio
- Visit the Website: Yuvraj’s Portfolio
- Browse Models: Check out /models/ for from-scratch implementations
- Try SmolHub: Experiment in /smolhub/ playground
- Download Datasets: Access curated data in /datasets/
For Developers
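# Clone the repository
git clone https://github.com/YuvrajSingh-mist/yuvraj-singh-portfolio.github.io.git
# Enter the project directory
cd yuvraj-singh-portfolio.github.io
# Install dependencies (requires Ruby and Bundler)
bundle install
# Run locally
bundle exec jekyll serve
# Then open http://localhost:4000 in a browser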
For AI Researchers
- 🔗 Source Code: Paper-Replications Repository
- 📖 Documentation: Each model includes comprehensive README
- 🤝 Collaboration: Issues and pull requests welcome
- 📧 Contact: Reach out for research discussions
🎯 Current Focus & Future Goals
Current Work
- 🔬 Fine-tuning LLMs - Advanced optimization techniques
- 📚 GAN Research - Exploring generative adversarial networks
- 🤖 Multimodal Models - Vision-language understanding
- 📊 Model Optimization - Efficiency and performance improvements
Seeking Opportunities
- 🎯 Research Internships - NLP and Computer Vision roles
- 💼 Full-time Positions - Research Engineer / Research Scientist (RE/RS) roles in AI/ML
- 🤝 Collaborations - Open to research partnerships
- 🎓 Mentoring - Helping others start their AI journey
🤝 Connect & Collaborate
Professional Links
- Email: yuvraj.mist@gmail.com
- X: https://x.com/YuvrajS9886
Contributing
This portfolio is open-source and welcomes contributions:
- 🐛 Bug Reports: Issues and suggestions
- 💡 Feature Requests: New ideas and improvements
- 🤝 Code Contributions: Pull requests welcome
- 📚 Documentation: Help improve explanations
🏆 Recognition & Stats
- 🤖 38+ AI Models implemented from scratch
- 📊 5+ Datasets curated and documented
- 🌟 Open Source - All code freely available
- 🎓 Educational Impact - Helping others learn AI
- 🚀 Active Development - Continuously updated
📜 License
This portfolio and associated code are released under the MIT License. Feel free to use, modify, and distribute with appropriate attribution.