Hi there, I am Yuvraj!

Everything transformers! I love re-implementing classic and seminal AI/ML papers from scratch in a clean, beginner-friendly manner.
Focus areas: Distributed systems for large-scale training/inference, along with RL for pre-/post-training paradigms.
Looking for: Research positions (academic or industrial) in my areas of interest.
Community: I mentor newcomers and collaborate on open-source projects, and have built educational libraries that demystify core AI/ML concepts for everyone.

Professional Experience

alphaXiv — Research Intern · Oct 2025 – April 2026
- Reimplemented and reproduced results from seminal and recent ML papers (e.g., Attention Is Not All You Need, TRM/HRM) from scratch in PyTorch in the alphaXiv paper-implementations repository.
- Co-authored an article on applying Evolution Strategies to fine-tune LLMs.
- Built evaluation pipelines and benchmarking infrastructure for LLM/VLM systems; evaluated models such as DeepSeek-OCR and OlmOCR2 on OmniDocBench and related OCR benchmarks.
- Deployed large models (e.g., DeepSeek-OCR, OCR2, LLMs like Ouro) using vLLM and Modal/Baseten, and generated a 100k-document OCR dataset from arXiv PDFs for large-scale document understanding research.
IISER, Kolkata — Summer Research Intern · May 2024 – May 2025
- Co-authored a paper (as first author) with Prof. Kripabandhu Ghosh presenting a human-verified dataset of 40k scraped YouTube comments on famous sports controversies (cricket and football), annotated for stance detection.
- Used LLMs such as Llama-3.1/3.2, Mistral-7b, and Qwen-2.5 with zero-/few-shot prompting to create the dataset. Benchmarked and fine-tuned existing open-source LLMs, both reasoning (distilled) and non-reasoning, on the human-verified dataset. Submitted to COLM 2025.
University of Maryland — Research Intern · Dec 2024 – Feb 2025
- Worked on UI/UX2Code generation, focusing primarily on building a robust dataset for fine-tuning VLMs.
- Scraped 100+ static websites and collected their commit histories and corresponding page layouts to create a dataset of 200+ records.

Projects

Paper Replications · Paper Code Repository · GitHub ★ 424
A comprehensive collection of code implementations replicating results from influential machine learning and deep learning research papers. Features 30+ models including Transformers,...

NeatRL · RL Library · GitHub ★ 225
Comprehensive implementations of deep RL algorithms including DQN, A2C, PPO, DDPG, TD3, and SAC. Features one-file implementations, experiment tracking with W&B, automatic...

SmolCluster · Distributed Computing Library · GitHub ★ 74
Educational library for training/inference of neural networks across heterogeneous compute such as Mac minis, Raspberry Pis, and GPUs, written using only the socket library...

SmolTorrent · Distributed Systems · GitHub ★ 5
Distributed storage system that shards .safetensors ML checkpoints across a Raspberry Pi cluster coordinated from a macOS master, over TCP with SHA-256...

PlogPayouts · Google's Solution Challenge '24 · GitHub ★ 1
Transform your daily jog into a mission for a cleaner world with PlogPayouts. Our innovative website + app rewards you for collecting...

FarmGenie (GeoHack 2024) · Hackathon · GitHub ★ 1
Our platform uses LLMs and a Mixture of Experts (MoE) approach to provide precise guidance on soil management, plant disease identification, and...

Movies Review System · POC · GitHub ★ 0
Introducing the Movie Review System, where AI meets movie magic to revolutionize how viewers experience films. The project's goal is to provide...

MoviesMania (Geek-o-thon) · Hackathon · GitHub ★ 0
Step into the future of entertainment discovery with MoviesMania. The product aims to simplify your search for the perfect movie or web...

Insight-Ed (HackNITR 5.0) · Hackathon · GitHub ★ 0
Imagine an online classroom where teachers instantly know when and why students lose focus. Our AI-powered solution bridges the knowledge gap by...

NeatRL Playground · Interactive Platform · GitHub ★ 0
Beautiful, interactive website showcasing AI-powered games with reinforcement learning agents. Features Pong AI with Deep Q-Learning, real-time WebSocket communication, and smooth animations....

Education

International Institute of Information Technology, Bhubaneswar · 2023–2027
- BTech, Computer Science Engineering

Achievements

GeoHack '24 Finale — 2nd place Project: FarmGenie · IEEE GRSS Kolkata and SAADRI · 2024
YESIST12 '24 (Special Track) — Finalist Project: PlogPayouts · Led the project · 2024
Geek‑o‑thon (D3 @ IIIT‑BH) — Winner
Project: MoviesMania · Inter‑college hackathon (AI/ML) · 2023

Yuvraj Singh
