AI Engineer · IIIT Nagpur 2027

Vishal Kumar

Building real AI systems, not just demos

I build production-oriented AI systems spanning multi-agent orchestration, RAG, LLM fine-tuning, production ML, and streaming inference. My strongest work combines research discipline with deployable software.

View selected work GitHub LinkedIn

#1 / 823

AMD Slingshot AI Promptathon

Top 1.2%

HackerRank Orchestrate, 12,885 participants

Top 2.4%

Wunder Fund LOB Predictorum

0.9987 QWK

Clinical AI validation metric

5 live

Deployed AI/product systems

Selected work

Systems with measurable behavior.

The strongest projects lead with the problem, the engineering mechanism, and the proof. This is the order recruiters, founders, and research teams need.

CrisisOps

Agents + RL

A procedural SRE training environment for cooperative LLM agents handling cascading microservice failures.

1000 rollouts, 95.1% parse rate, reward gain from 0.404 to 0.445.

System: Primary and Buddy agent pair with tool use, schemas, automated judging, and reward shaping.

Why: Shows multi-agent systems, environment design, RL-style feedback loops, and deployment depth.

LangGraph OpenEnv GRPO QLoRA FastAPI

Live demo GitHub

Triagegeist

Clinical ML

Emergency triage prediction system with hierarchical safety logic and clinical feature engineering.

OOF QWK 0.9987, ESI-1 recall 98.6%, 80K patient records.

System: Safety guardrail, 200+ engineered clinical features, and LightGBM/CatBoost/XGBoost ensemble.

Why: Signals research discipline: validation, bias awareness, high-stakes error handling, and metric literacy.

XGBoost LightGBM CatBoost Production ML

GitHub
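For readers unfamiliar with the headline metric, QWK (quadratic weighted kappa) measures agreement between predicted and true ordinal labels, penalizing each error by the squared distance between the two classes and correcting for chance agreement. The snippet below is a minimal sketch of the standard definition, not the project's evaluation code. The function name, the 0-based label encoding, and the toy labels are illustrative assumptions.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """QWK for integer labels in range(n_classes) (hypothetical helper)."""
    # Confusion matrix: observed[i][j] counts true class i predicted as j.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    n = len(y_true)
    row_totals = [sum(row) for row in observed]
    col_totals = [
        sum(observed[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Quadratic penalty: 0 on the diagonal, 1 for the farthest miss.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Count expected if true and predicted labels were independent.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    # Undefined (division by zero) if every label falls in a single class.
    return 1.0 - weighted_observed / weighted_expected


# Toy example with five ordinal levels encoded as 0..4 (assumed encoding).
perfect = quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4], 5)
chance = quadratic_weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], 2)
```

A score of 1.0 means perfect agreement and 0.0 means agreement no better than chance. Because the penalty grows with the square of the class distance, confusing two adjacent triage levels costs far less than confusing the most and least urgent ones.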

LOB Predictorum

Streaming ML

Limit order book prediction system rebuilt after diagnosing batch-to-live degradation.

Top 2.4% globally, 0.96ms inference, 73x model size reduction.

System: Streaming-native BiGRU + TimeMixer ensemble, InfoNCE SSL pretraining, and INT8 ONNX quantization.

Why: A strong founder signal. Found the real failure mode, pivoted the architecture, and optimized for deployment constraints.

PyTorch BiGRU TimeMixer ONNX

GitHub

WasteWatch AI

Civic AI

AI-native civic waste reporting platform with vision, fallback routing, voice input, and city-level workflows.

Live deployment, bilingual voice, 12 integrations, 100+ city routing.

System: Gemini vision pipeline, Groq fallback, FastAPI services, voice interface, and operational API integrations.

Why: Shows product instincts: users, deployment, external APIs, reliability concerns, and civic impact.

Gemini Groq FastAPI Voice AI

Live GitHub

CivicPath

RAG + Voice

Non-partisan election assistant for Indian voters with multilingual guidance and ECI-aligned fact checks.

Google Cloud Run deployment, 10 Google services, 12+ Indian languages.

System: Gemini, Maps, Speech, localized retrieval, voice accessibility, and civic journey generation.

Why: Signals grounded assistant design for high-trust public information workflows.

Gemini GCP Maps Speech

Live GitHub

DevBoost AI

Dev tools

Agentic code analysis tool with explain, test, and fix modes for developer workflows.

Streaming responses, 12+ languages, structured output contracts.

System: React/Vite interface, FastAPI backend, file upload, export, session history, and deterministic analysis modes.

Why: Shows practical AI UX, prompt contracts, and workflow-focused product execution.

React Vite FastAPI LangChain

GitHub

Proof timeline

Competition signals that compound.

The important story is not one lucky ranking. It is repeated performance across prompt engineering, agents, clinical ML, and quantitative streaming inference.

May 2026

HackerRank Orchestrate

Built a grounded support triage agent using only the provided corpus, with evidence retrieval and auditable outputs.

Top 1.2% (#159 / 12,885)

2026

AMD Slingshot AI Promptathon

National prompt engineering and AI system design competition at REVA University, Bengaluru.

#1 / 823

2026

Wunder Fund LOB Predictorum

Quantitative AI challenge focused on limit order book prediction under live streaming constraints.

Top 2.4% (#120 / 4,917)

2026

ArtPark CodeForge - IISc Bangalore

Built under hackathon constraints with a focus on useful AI product execution.

1st runner-up

2026

Triagegeist Clinical AI

Clinical ML challenge with hierarchical modeling and high-stakes triage metrics.

OOF QWK 0.9987

Capabilities

A stack around applied AI systems.

The stack is grouped by the problems it solves. This reads stronger than a flat wall of tools.

Agents and LLM systems

Agent orchestration, retrieval, tool use, prompt contracts, evaluation, grounding, and model fallback.

LangGraph LangChain RAG Gemini Claude Groq

Modeling and research

Clinical ML, time-series modeling, self-supervised learning, fine-tuning, ranking metrics, and ablations.

Python PyTorch XGBoost LightGBM QLoRA ONNX

Production engineering

APIs, frontends, deployment, integrations, streaming inference, and pragmatic reliability tradeoffs.

FastAPI React Next.js Docker GCP AWS

Additional builds

Range without diluting the core story.

These are useful secondary signals, but the primary narrative remains production AI systems.

SLM Signal Pod

Phi-2 QLoRA, NIFTY options data, deterministic orchestration, walk-forward evaluation, and RAG ablation.

View repo

CUDA DTW

CPU vs CUDA Dynamic Time Warping on Bitcoin time-series data, showing low-level parallelization depth.

View repo
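As background for the CUDA DTW entry, Dynamic Time Warping aligns two sequences by filling a cumulative-cost table in which each cell depends on its left, upper, and upper-left neighbors. The snippet below is a minimal CPU reference sketch of the textbook recurrence, not the code from the repository. The function name, the absolute-difference cost, and the sample series are illustrative assumptions.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute-difference cost
    (hypothetical reference helper)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Toy series (made-up values): the second is a time-stretched copy of the first.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Since every cell depends only on cells from the previous two anti-diagonals, all cells on one anti-diagonal are independent of each other. Processing the table one anti-diagonal at a time is one common way to expose parallelism on a GPU.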

AI Gatekeeper

Satirical AI evaluation product with structured scoring and sharp product writing. Useful personality signal.

View live

About

Research depth plus shipping discipline.

I am a fourth-year B.Tech CSE student at IIIT Nagpur. My work sits at the intersection of agentic AI, RAG, reinforcement learning, production ML, and production engineering.

The pattern across my projects is deliberate: build a system, expose the failure modes, measure behavior, and deploy a usable version. That is why my strongest work includes live systems, competition rankings, documented model choices, and clear evaluation metrics.

I am looking for AI engineering internships, applied research roles, and founder-led teams where the work demands both technical depth and product judgment.

Build serious AI systems.

Open to internships, applied research collaborations, and teams working on agents, RAG, LLM evaluation, clinical AI, or production ML infrastructure.

Email GitHub LinkedIn Kaggle